May 26, 2011 at 7:59 am
I have a table with an ID (identity), State, City, Address, GeoRef(Geography), and I want to compute the distance between any two points with the same State/City and different address. I can compute the distance between any two single points using STDistance, it's getting the matrix of all the distances between all the points in the same city that's making me crazy.
I know there's a way to do it with a CTE and probably OVER but I can't seem to wrap my mind around it.
Any help would be greatly appreciated.
May 26, 2011 at 8:15 am
The solution is simple, but you are going to generate a huge number of combinations. (With a small town of only 10,000 properties, you have almost 100 million combinations to calculate distance for.) All you need is CROSS JOIN, which is the most ancient and honorable technique for generating combinations. Without writing the query for you, here is how you go about it:
CROSS JOIN your property table to itself.
- Consider aliasing the primary and joined instances of the table as P1 and P2 to keep things straight.
- If you don't want to see combinations that produce a distance of zero (a property's distance from itself), handle it in your WHERE clause.
The cross join will produce the set of all combinations of property pairs. Use the STDistance() function to compute the distances between P1.geolocation and P2.geolocation.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
May 26, 2011 at 9:12 am
Simpler than I thought. I generally don't consider cross join because of the volume ramifications but the entire table is 36 rows and the matching entries by state/city are in groups of 2 or 3 so I should be able to limit the output.
Thanks. I think this will get me going in the right direction.
May 26, 2011 at 10:19 am
Worked perfectly. Thanks.
May 26, 2011 at 2:23 pm
Distance on a plane, or distance on a sphere? If locations are far-ish apart, it makes a difference.
May 27, 2011 at 5:16 am
Mr Celko,
I have great respect for your knowledge in relational theory, I even own two of your books. However, since database and query design are not my primary responsibility so I don't spend a lot of time following the ever-changing standards for posting, and various ISO standards for naming. Our Enterprise Data Management group has their own naming standards and we abide by those. They may not follow the ISO standards, but that's not my call. Having been a member here for a number of years I've seen a number of different suggested methods of posting. I would have posted the actual DDL of the table but I thought (and it seems rightly so) that the problem was generic enough that just posting the description of the data table (which I inherited from another developer to resolve only this problem) would be sufficient. "Dixie" was able to point me in the right direction fairly quickly so it seems that your snarky response:
Learn to use ISO-11179 rules for the data element names, avoid needless dialect and use ISO-8601 temporal formats, codes and so forth. People cannot read your mind, so post your code and clear specs if you really want help.
was a bit overboard.
Had I designed the table I would have made the primary key State, City, Street_Address. I didn't, so I was working with what I was given. I added one field (GeoRef GEOGRAPHY in MS SQL 2008 spatial data type terminology). I'm not in a position to modify the primary table schema, I was only tasked with developing the query to determine the distances. The solution proposed by Dixie was sufficient to get me pointed in the right direction and I was able to construct a query that produced the desired results. Since there was a small number of data points the "bad side" of CROSS JOIN -- the potential number of rows in the result set, was immaterial.
Here's what I ended up with:
select l1.St, l1.City, l1.[Address] as Address1, l2.[Address] as Address2,
l1.GeoRef.STDistance(l2.GeoRef) as Meters
from dbo.LocGeo l1 cross join
dbo.LocGeo l2
where l1.St=l2.St and l1.City=l2.City and l1.address<>l2.address
I'm sure that with your significant expertise you'll probably have some editorial comment on the above query, however it works and I got the results I needed.
Thanks for your insight, no thanks for your attitude. We're not all Joe Celko.
May 27, 2011 at 5:18 am
sherifffruitfly (5/26/2011)
Distance on a plane, or distance on a sphere? If locations are far-ish apart, it makes a difference.
I did consider this but given that the locations are all within the same State/City combination for the purpose of this query so I don't think plane or sphere is relevant. The largest distance between points was 35 miles so if it was off by a few feet one way or the other it wouldn't matter.
May 27, 2011 at 8:51 am
If using the STDistance() on a geography type, distance is understood to be calculated over a curved surface. Some SRIDs even take into account that the earth isn't really a perfect globe, but more of an oblate spheroid. What is most important is that longitude coordinates get closer together as you get further from the equator and closer to the poles. But I agree that on a small scale the difference may be insignificant.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply