Determining the Nearest Record to a Given Location

  • Good Morning all.

    My problem is that I have a table containing over a hundred thousand records. The table has the following columns: ID, X, Y, Latitude, Longitude.

    What I need to be able to do is to be able to locate the record which is closest to a given location.

    For example, if I have a location with a Latitude of 30.0045 and a Longitude of -93.8824, what would be the best way to query the table and return the record which is closest to the given location?

    Thanks in advance for any help.

    (I apologise if this thread is not in the right forum.)

  • drjdewhurst (11/27/2008)


    Good Morning all.

    My problem is that I have a table containing over a hundred thousand records. The table has the following columns: ID, X, Y, Latitude, Longitude.

    What I need to be able to do is to be able to locate the record which is closest to a given location.

    For example, if I have a location with a Latitude of 30.0045 and a Longitude of -93.8824, what would be the best way to query the table and return the record which is closest to the given location?

    Thanks in advance for any help.

    (I apologise if this thread is not in the right forum.)

    what value you will take to calculate the closest latitude or longitude. for ex: 30.0045 , the value near to it may be 30.0045 * 5, 30.0045 * 6 etc.

    Decide that value and based on it apply the filter inside the query.

    kshitij kumar
    kshitij@krayknot.com
    www.krayknot.com

  • Hello

    A quick Google confirms that this requirement is fairly common nowadays. The following thread describes a couple of sprocs which could be used to help, with a minimum of fuss.

    http://www.tek-tips.com/viewthread.cfm?qid=1499975&page=11

    This is assuming that you want to find the actual distance between your lat/long pairs.

    Cheers

    ChrisM

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Is upgrading to SQL 2008 an option? There are geospatial data types built in and solving your problem there would be trivial.

    pseudo-code:

    DECLARE @g geography;

    SET @g = geography::STPointFromText('POINT(30.0045, -93.8824)', 4326);

    SELECT MIN(GeographyColumn.STDistance (@g) FROM SomeTable

    Rolling your own geography functions is not a simple problem.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks for your ideas guys.

    In the end I simply ordered the table by the combined differences between the target latitude and longitude and the latitude and longitudes in the table, as follows:

    DECLARE @LATITUDE AS FLOAT

    DECLARE @LONGITUDE AS FLOAT

    SET @LATITUDE = 30.0045

    SET @LONGITUDE = -93.8824

    SELECT TOP 1 X, Y, Latitude, Longitude

    FROM Table1

    ORDER BY (ABS(ABS(LATITUDE)-ABS(@LATITUDE)))+ABS(ABS(LONGITUDE)-ABS(@LONGITUDE)))

  • I'm not sure that works properly.

    DECLARE @Table1 (X int, Y int, Latitude float, Longitude float)

    insert into @Table1 values (1,1, 90,0) -- the north pole

    insert into @Table1 values (2,2, 52,0) -- somewhere in the UK

    insert into @Table1 values (3,3, -30,25) -- somewhere in South Africa

    insert into @Table1 values (4,4, -35,-57) -- Near Buenos Aires

    DECLARE @LATITUDE AS FLOAT

    DECLARE @LONGITUDE AS FLOAT

    SET @LATITUDE = -90

    SET @LONGITUDE = 0

    -- what's the closest spot to the South Pole

    SELECT TOP 1 X, Y, Latitude, Longitude

    FROM @Table1

    ORDER BY (ABS(ABS(LATITUDE)-ABS(@LATITUDE)))+ABS(ABS(LONGITUDE)-ABS(@LONGITUDE)))

    -- Answer: Latitude 90, Longitude 0 (The north pole)

    SET @LATITUDE = -5

    SET @LONGITUDE = -35

    -- how about the east-most edge of South America?

    -- Answer: Middle of South Africa.

    This works 'better' but will give problems with longitude that's around +-180 and it's still not accurate. It returns that 25 East, 30 South is closer to the south pole than 35 west, 57 south, which is obviously not true.

    ORDER BY (ABS(LATITUDE-@LATITUDE)+ABS(LONGITUDE-@LONGITUDE))

    If you're just working with 'small' areas, like continental USA, then this will work. If you need answers accurate across the world, it's not going to be work well because you're using calculations that are semi-accurate on a flat plain to calculate distances over a globe.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I am only working with the Continental USA, and only a small portion of that.

    Thanks for your help.

  • drjdewhurst (11/27/2008)


    Thanks for your ideas guys.

    In the end I simply ordered the table by the combined differences between the target latitude and longitude and the latitude and longitudes in the table, as follows:

    DECLARE @LATITUDE AS FLOAT

    DECLARE @LONGITUDE AS FLOAT

    SET @LATITUDE = 30.0045

    SET @LONGITUDE = -93.8824

    SELECT TOP 1 X, Y, Latitude, Longitude

    FROM Table1

    ORDER BY (ABS(ABS(LATITUDE)-ABS(@LATITUDE)))+ABS(ABS(LONGITUDE)-ABS(@LONGITUDE)))

    I think that you will find that this is not only orders of magnitude faster, but gives the correct results also:

    CREATE INDEX dbo.Table1_LngLat on dbo.Table1 (Longitude,Latitude)

    go

    DECLARE @LATITUDE AS FLOAT

    DECLARE @LONGITUDE AS FLOAT

    SET @LATITUDE = 30.0045

    SET @LONGITUDE = -93.8824

    SELECT TOP 1 X, Y, Latitude, Longitude

    FROM (Select X,Y,Latitude,Longitude

    From Table1

    Where Longitude between @Longitude-.1 and @Longitude+.1

    And Latitude between @Latitude-.1 and @Latitude+.1) A

    ORDER BY acos(sin(@Latitude/57.295779513082323)

    * sin(Latitude/57.295779513082323)

    + cos(@Latitude/57.295779513082323)

    * cos(Latitude/57.295779513082323)

    * cos(Longitude/57.295779513082323 - @Longitude/57.295779513082323))

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Thanks rbarryyoung , that seems to work really well.

    For the sake of my own understanding, can I just ask what the SIN(@Latitude/57.295779513082323) operators are doing in the query, i.e. what is the theory behind them?

    Thanks

  • Well, I snatched them from a post at the site that Chris Morris linked too (had to fix a bug in it though), which had no real explanation of it, but I assume that the "/57.295..." stuff is applying factors that combines PI, the circumference of the earth, the ratio of degrees to radians and the conversion to miles.

    The rest is "just" spherical trigonometric reduction. Well, so I assume, I tested some values and they came out right, so I didn't bother to check it analytically as derivations like this can be quite difficult and it's been a long time since college...

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • By the way, for the most part, it isn't the calculation that gives that speedup, it just makes it accurate.

    What makes it faster is the inner subquery that implements a SARG-able "Box" algorithim. This box algorithim is the key technique for various "distance" searches (and there are many beyond physical distance) tractable. In this case the box is about +/-6 miles by +/-7 miles, so this will work so long as the nearest location is never greater than about 6 miles. If that is not always true, then there are tests and other tricks that can be added to deal with it.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Well, I snatched them from a post at the site that Chris Morris linked too (had to fix a bug in it though), which had no real explanation of it, but I assume that the "/57.295..." stuff is applying factors that combines PI, the circumference of the earth, the ratio of degrees to radians and the conversion to miles.

    The division by "/57.295..." stuff is simply the conversion from degrees to radians. The TSQL trigonometric functions use radians for angles, like all programming languages I know.

    360 degrees = 2 pi radians, so to convert to radians, divide by 180.0 and multiply by pi

    SELECT 180.0 / PI()

  • Good catch. But that leaves me wondering how it converts the results from Great Circle Radians to Miles?

    Anyone know of a Lat/Lng distance calculator that I can use to check this formula? (Google Maps is not so great for this).

    EDIT: (never mind, I just Googled a bunch)

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • OK, well I just tested it with a good Lat/Lng calculator and it is accurate in Miles for about 3.5 decimal places.

    I have no idea how it accomplishes this magic.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Correction: 4.5 decimal places accuracy.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

Viewing 15 posts - 1 through 15 (of 32 total)

You must be logged in to reply to this topic. Login to reply