Optimize query for finding nearest airport based on latitude and longitude using STDistance

  • Hello guys.

    I try to somehow copy what dwain.c did in his query. I was just wondering if there's a possibility that the orig lat and long would be the same as the destination lat and long. Like the airport they flew from would also be their destination.

    I have my code as this: I just wanted to ask if the code for getting the @StartLat, @StartLong, @EndLat, and @EndLong would work?

    DECLARE @StartLat DECIMAL(8, 5)

    , @EndLat DECIMAL(8, 5)

    , @StartLong DECIMAL(8, 5)

    , @EndLong DECIMAL(8, 5)

    , @LatDiff DECIMAL(8, 5)

    , @LongDiffDECIMAL(8, 5);

    SELECT TOP(1) @StartLat = Latitude, @StartLong = Longitude

    FROMdbo.Log_Details

    ORDER BY Latitude, Longitude;

    SELECT TOP(1) @EndLat = Latitude, @EndLong = Longitude

    FROMdbo.Log_Details;

    ORDER BY Latitude DESC, Longitude DESC;

    SELECT @LatDiff= ABS(@StartLat - @EndLat), @LongDiff= ABS(@StartLong - @EndLong);

    SELECT Log_Detail_ID, Latitude, Longitude, Airport_Latitude, Airport_Longitude, Distance

    FROMdbo.Log_Details

    CROSS APPLY(SELECT TOP(1) *, Distance = Airport_Geography.STDistance(GEOGRAPHY::Point(Latitude, Longitude, 4326))

    FROMdbo.Airports WHERE Airport_Latitude BETWEEN Latitude - @LatDiff AND Latitude + @LatDiff

    AND Airport_Longitude BETWEEN Longitude - @LongDiff AND Longitude + @LongDiff

    ORDER BY Airport_Geography.STDistance(GEOGRAPHY::Point(Latitude, Longitude, 4326))) AS a;

    Thank you.

  • facturan.antonio (4/30/2012)


    Hi guys.

    I'm really grateful for all the help you've shared. Analyzing the formulas takes a lot of time for me to somehow grasp what does it do. I'm not really good in mathematics 🙂 I'm still trying to understand what do these functions do. Guess have to read more about this. Anyway, earlier today, I've tried The Dixie Flatline's post but somehow it takes so long, maybe it's because I don't have an index in place in the Airports table for Lat and Long fields. I will try dwain.c's post and hopefully it work for me. As for Jeff's suggestion for trans-Pacific flights :-), I still have to read about this as I don't have any clue.

    Thanks you guys.

    Unless you have an index in place, and the query uses it, spatial queries can run insanely long.

    Also, you really cant ignore the great circle routes, when calculating distances between airports. You can only use "flat" distance calculations when the scale is small enough that any difference is insignificant.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Just another view using SQL 2008 R2. This is based on a piece that I did for a customer but scaled to airports and flights. The customer wanted to fill in the path (similar to your flights) and customer locations (similar to your airports).

    Below are three examples given:

    1) a list of way points (flight record points) and a requirement to return all @returnRows points/airports within @maxDistance from each point

    2) a linestring (flight path) and a requirement to return all @returnRows points/airports within @maxDistance from each point

    3) a startpoint and endpoint, a number of steps (calculated) for a geometric straight line between the points, to return the nearest point within @maxDistance

    Built on the same ideas as Jeff and Dwain, but changing the circles into a STBuffer around the path to restrict the airports to be examined.

    Notes:

    Examples 1 and 2 involve a flight from London Heathrow to Bordeaux bypassing the crowded airspace over northern France

    Example 3 uses the same Cleveland to St Louis as per one of the other examples (and results in Cleveland,Griffing Sandusky,Galion,Findlay,Allen County,Neil Armstrong (Wapakoneta), Decatur, Delaware County, Marion, Indianapolis International, Vermillion County, Coles County Memorial, Vandalia, Municipal, Alton and St Louis International)

    Performance-wise : may not be the best ever, but restricting the STBuffer(@maxDistance) helps.

    /********************************************************************

    Nearest Airports

    ********************************************************************/

    -- drop database Flights

    create database Flights;

    go

    use Flights;

    go

    -- Base tables

    -- Airports

    create table dbo.Airports(

    AirportCode char(3) primary key,

    AirportName varchar(255),

    CityName varchar(255),

    Country varchar(255),

    CountryCode char(3),

    Latitude decimal(10,7),

    Longitude decimal(10,7),

    Location Geography

    )

    CREATE SPATIAL INDEX Airport_Index

    ON dbo.Airports(Location)

    USING GEOGRAPHY_GRID

    WITH (

    GRIDS = (MEDIUM, HIGH, MEDIUM, LOW ),

    CELLS_PER_OBJECT = 64,

    PAD_INDEX = ON );

    -- insert into dbo.Airports select * from FlightGeo.dbo.Airports (internet sourced 9317 worldwide airport locations)

    -- Flight path (POINTS)

    create table dbo.FlightPath(

    PathID int identity(1,1) primary key,

    FlightPoint geography

    )

    CREATE SPATIAL INDEX FlightPath_Index

    ON dbo.FlightPath(FlightPoint)

    USING GEOGRAPHY_GRID

    WITH (

    GRIDS = (MEDIUM, HIGH, MEDIUM, LOW ),

    CELLS_PER_OBJECT = 64,

    PAD_INDEX = ON );

    -- Tally table

    create table dbo.Tally(

    T int primary key

    )

    go

    ;WITH Nbrs_3(n) AS (SELECT 1 UNION SELECT 0)

    ,Nbrs_2 (n) AS (SELECT 1 FROM Nbrs_3 n1 CROSS JOIN Nbrs_3 n2)

    ,Nbrs_1 (n) AS (SELECT 1 FROM Nbrs_2 n1 CROSS JOIN Nbrs_2 n2)

    ,Nbrs_0 (n) AS (SELECT 1 FROM Nbrs_1 n1 CROSS JOIN Nbrs_1 n2)

    ,Nbrs (n) AS (SELECT 1 FROM Nbrs_0 n1 CROSS JOIN Nbrs_0 n2)

    ,TallyNow (n) AS (SELECT ROW_NUMBER() OVER (ORDER BY n) As n FROM Nbrs)

    insert into dbo.Tally select * from TallyNow

    ---------------------------------------- sample flight London/Bordeaux (missing Northern France)

    -- Given flight path details (points)

    go

    truncate table dbo.FlightPath

    insert into dbo.FlightPath values (geography::STGeomFromText('POINT(-0.4513890 51.4697220)',4326))

    insert into dbo.FlightPath values (geography::STGeomFromText('POINT(-1.3595581 51.2653521)',4326))

    insert into dbo.FlightPath values (geography::STGeomFromText('POINT(-1.7413330 50.8631777)',4326))

    insert into dbo.FlightPath values (geography::STGeomFromText('POINT(-2.6531982 50.2998670)',4326))

    insert into dbo.FlightPath values (geography::STGeomFromText('POINT(-2.9196166 49.7422316)',4326))

    insert into dbo.FlightPath values (geography::STGeomFromText('POINT(-3.2244873 49.1619507)',4326))

    insert into dbo.FlightPath values (geography::STGeomFromText('POINT(-3.8369750 48.6710126)',4326))

    insert into dbo.FlightPath values (geography::STGeomFromText('POINT(-4.1638183 47.9329065)',4326))

    insert into dbo.FlightPath values (geography::STGeomFromText('POINT(-4.1033935 47.2363545)',4326))

    insert into dbo.FlightPath values (geography::STGeomFromText('POINT(-3.6199951 46.6682870)',4326))

    insert into dbo.FlightPath values (geography::STGeomFromText('POINT(-3.1228637 45.9568782)',4326))

    insert into dbo.FlightPath values (geography::STGeomFromText('POINT(-2.0462036 45.1781648)',4326))

    insert into dbo.FlightPath values (geography::STGeomFromText('POINT(-0.6756591 44.8305521)',4326))

    insert into dbo.FlightPath values (geography::STGeomFromText('POINT(-0.7152780 44.8286110)',4326))

    set statistics time on;

    declare @returnRows int = 1

    declare @maxDistance int = 100000;

    with Distances as (

    select *,

    AP.Location.STDistance(FP.FlightPoint) as Distance,

    ROW_NUMBER() over (partition by FP.PathID order by AP.Location.STDistance(FP.FlightPoint) asc) as Priority

    from dbo.FlightPath as FP

    left join (select dbo.Airports.* from dbo.Airports, dbo.FlightPath

    where Location.STIntersects(FlightPoint.STBuffer(@maxDistance))=1) as AP

    on AP.Location.STDistance(FP.FlightPoint)<@maxDistance

    )

    select PathID, AirportName, CityName, Country,Location, Distance

    from Distances

    where Priority = 1

    set statistics time off

    go

    -- Given flight path details (linestring)

    set statistics time on

    declare @returnRows int = 1

    declare @maxDistance int = 100000;

    declare @flightpath geography

    set @flightpath = geography::STGeomFromText(

    'LINESTRING(-0.4513890 51.4697220,

    -1.3595581 51.2653521, -1.7413330 50.8631777,

    -2.6531982 50.2998670, -2.9196166 49.7422316,

    -3.2244873 49.1619507, -3.8369750 48.6710126,

    -4.1638183 47.9329065, -4.1033935 47.2363545,

    -3.6199951 46.6682870, -3.1228637 45.9568782,

    -2.0462036 45.1781648, -0.6756591 44.8305521,

    -0.7152780 44.8286110)', 4326);

    with AirportsToBeConsidered as(

    select *

    from dbo.Airports

    where Location.STIntersects(@flightPath.STBuffer(@maxDistance))=1

    )

    select * from

    (select *,ROW_NUMBER() over

    (partition by T order by A.Location.STDistance(@flightPath.STPointN(T))) as Priority,

    A.Location.STDistance(@flightPath.STPointN(T)) as Distance

    from (select * from dbo.Tally where T <= @flightpath.STNumPoints()) as T

    left join AirportsToBeConsidered as A

    on A.Location.STDistance(@flightPath.STPointN(T))<@maxDistance

    ) as A

    where Priority<=@returnRows

    set statistics time off

    go

    ----Straight line between start and end points (Cleveland -> St Louis)

    declare @flightbit int = 0

    declare @flightbits int = 50

    declare @startpoint geography

    declare @endpoint geography

    select @startpoint = location

    from dbo.Airports where AirportCode='CLE'

    select @endpoint = location

    from dbo.Airports where AirportCode='STL'

    declare @distance decimal(10,2) = @StartPoint.STDistance(@EndPoint)

    declare @flightstart_LAT decimal(10,7)=@StartPoint.Lat

    declare @flightstart_LON decimal(10,7)=@StartPoint.Long

    declare @flightend_LAT decimal(10,7)=@EndPoint.Lat

    declare @flightend_LON decimal(10,7)=@EndPoint.Long

    truncate table dbo.FlightPath

    while @flightbit <= @flightbits

    begin

    insert into dbo.FlightPath(FlightPoint)

    select geography::STGeomFromText('POINT(' +

    convert(varchar(25),@Flightstart_LON-(

    (@Flightstart_LON-@flightend_LON)*(@flightbit*1.0/@flightbits)) )+' '+

    convert(varchar(25),@Flightstart_LAT-(

    (@Flightstart_LAT-@flightend_LAT)*(@flightbit*1.0/@flightbits)) )+')',4326)

    set @flightbit = @flightbit + 1

    end

    select * from dbo.FlightPath

    go

    set statistics time on;

    declare @maxDistance int = 350000;

    with Distances as (

    select *,

    AP.Location.STDistance(FP.FlightPoint) as Distance,

    ROW_NUMBER() over (partition by FP.PathID order by AP.Location.STDistance(FP.FlightPoint) asc) as Priority

    from dbo.FlightPath as FP

    left join (select dbo.Airports.* from dbo.Airports, dbo.FlightPath

    where Location.STIntersects(FlightPoint.STBuffer(@maxDistance))=1) as AP

    on AP.Location.STDistance(FP.FlightPoint)<@maxDistance

    )

    select PathID, AirportName, CityName, Country,Location, Distance

    from Distances

    where Priority = 1

    set statistics time off

    go

    Fitz

  • The Dixie Flatline (4/30/2012)


    facturan.antonio (4/30/2012)


    Hi guys.

    I'm really grateful for all the help you've shared. Analyzing the formulas takes a lot of time for me to somehow grasp what does it do. I'm not really good in mathematics 🙂 I'm still trying to understand what do these functions do. Guess have to read more about this. Anyway, earlier today, I've tried The Dixie Flatline's post but somehow it takes so long, maybe it's because I don't have an index in place in the Airports table for Lat and Long fields. I will try dwain.c's post and hopefully it work for me. As for Jeff's suggestion for trans-Pacific flights :-), I still have to read about this as I don't have any clue.

    Thanks you guys.

    Unless you have an index in place, and the query uses it, spatial queries can run insanely long.

    Also, you really cant ignore the great circle routes, when calculating distances between airports. You can only use "flat" distance calculations when the scale is small enough that any difference is insignificant.

    Thanks this got me thinking about the great circles and how we could split a line into even pieces given a geographic line from point to point without intermediate points.

    This code will split a great circle route into steps @waypointeveryXmiles apart. This can then be used in my last post to give the nearest airport to each of these points.

    Hope this is some use to somebody.

    declare @line geography

    declare @mtomiles int = 1609

    declare @waypointeveryXmiles int = 25

    -- London Heathrow to Seattle Tacoma

    select @line = Geography::STGeomFromText(

    'LINESTRING (-0.451389 51.469722,-122.309444 47.448889)',4326)

    select @line

    declare @waypoints table (

    ID Int identity(1,1) primary key,

    Point geography

    )

    -- add first point at start of line

    insert into @waypoints

    select @line.STStartPoint()

    -- add intermediate points along line

    insert into @waypoints

    select @line.STStartPoint().STBuffer(T * @waypointeveryXmiles * @mtomiles).STIntersection(@line).STEndPoint()

    from Tally

    where (@line.STLength()/@mtomiles) > (T * @waypointeveryXmiles)

    order by T

    -- add last point at end of line

    insert into @waypoints

    select @line.STEndPoint()

    select * from @waypoints

    Fitz

  • facturan.antonio (4/30/2012)


    Hello guys.

    I try to somehow copy what dwain.c did in his query. I was just wondering if there's a possibility that the orig lat and long would be the same as the destination lat and long. Like the airport they flew from would also be their destination.

    I have my code as this: I just wanted to ask if the code for getting the @StartLat, @StartLong, @EndLat, and @EndLong would work?

    DECLARE @StartLat DECIMAL(8, 5)

    , @EndLat DECIMAL(8, 5)

    , @StartLong DECIMAL(8, 5)

    , @EndLong DECIMAL(8, 5)

    , @LatDiff DECIMAL(8, 5)

    , @LongDiffDECIMAL(8, 5);

    SELECT TOP(1) @StartLat = Latitude, @StartLong = Longitude

    FROMdbo.Log_Details

    ORDER BY Latitude, Longitude;

    SELECT TOP(1) @EndLat = Latitude, @EndLong = Longitude

    FROMdbo.Log_Details;

    ORDER BY Latitude DESC, Longitude DESC;

    SELECT @LatDiff= ABS(@StartLat - @EndLat), @LongDiff= ABS(@StartLong - @EndLong);

    SELECT Log_Detail_ID, Latitude, Longitude, Airport_Latitude, Airport_Longitude, Distance

    FROMdbo.Log_Details

    CROSS APPLY(SELECT TOP(1) *, Distance = Airport_Geography.STDistance(GEOGRAPHY::Point(Latitude, Longitude, 4326))

    FROMdbo.Airports WHERE Airport_Latitude BETWEEN Latitude - @LatDiff AND Latitude + @LatDiff

    AND Airport_Longitude BETWEEN Longitude - @LongDiff AND Longitude + @LongDiff

    ORDER BY Airport_Geography.STDistance(GEOGRAPHY::Point(Latitude, Longitude, 4326))) AS a;

    Thank you.

    I'm thinking that it will not. You need a flight time in your log table and then you could use that.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Mark,

    It is always nice to see someone with obvious experience weigh in on a thread like this. Your initial algorithm looks pretty complex and I'll need to study it before I comment.

    The second one about identifying way points along the great circle routes... I thought about trying something like this. I did in fact after I posted my performance tome. Didn't have much success with it though.

    Thanks for your contribution to this very interesting thread!


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • 1) To the OP: if you aren't good with analysis and mathematics, why are you trying to do this VERY complicated stuff? And without regular computer access too?

    2) Isaac Kunen's powers-of-two stuff is likely the most efficient public method for finding nearest with spatial, but it doesn't compute nearest point to a line like it seems you need.

    3) Spatial stuff is VERY complicated, and indexing it properly is also complex. I did some neat work for a stealth startup company that was a quite a bit more efficient than Isaac's stuff but it doesn't do nearest to line either (assuming that is what you need). I believe it could be modified to suit, but this isn't forum-post stuff. If this is for a work product, drop me a line and we can talk consulting to get you what you need.

    4) As someone else mentioned, SQL Server 2012 will have a built-in nearest calculation (as well as some much-needed improvements to spatial indexing), but that nearest isn't as efficient as the stuff I came up with. It is a hell of a lot simpler however! :hehe:

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • This post intentionally left blank.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Hello guys.

    Sorry, it took this long to reply to your posts. I have to do other database stuff.

    I try some of the solutions posted but my query is still slow even though I've already put an indexes on the affected tables.

    I ask for some suggestion from our senior project manager and he gave his opinion about the situation. His suggestion is just simple. That I didn't have to calculate the nearest airport point by point in the log, I just have to determine whether the speed of the airplane is zero and get from there the nearest airport based on the lat/long. I should have posted the whole table definition of the log as what Evil Kraig F suggested. It's my fault, I should have posted it so not to waste your time on trying to figure out what the best solution to the problem. Sorry, I should have been more clear. Lesson learned. I've learned a lot from all of your suggestions. This is the updated code I made.

    DECLARE @LogID INT;

    SET @LogID = 45;

    DECLARE @NearestAirports TABLE

    (Airport_IDINT

    , Airport_IdentVARCHAR(20)

    , Airport_Name VARCHAR(255)

    , Airport_Latitude DECIMAL(8, 5)

    , Airport_LongitudeDECIMAL(8, 5)

    , Airport_ElevationINT

    , Airport_GeographyGEOGRAPHY);

    INSERT INTO @NearestAirports

    SELECT *

    FROM dbo.Airports

    WHERE Airport_ID IN (

    SELECT Airport_ID

    FROM (SELECT Latitude, Longitude FROM dbo.Log_Details WHERE Log_Detail_Log_ID = @LogID AND Ground_Speed = 0) AS l

    CROSS APPLY(

    SELECT TOP(1) *

    FROM dbo.Airports

    WHERE Airport_Geography.STDistance(GEOGRAPHY::Point(Latitude, Longitude, 4326)) <= 5000

    ORDER BY Airport_Geography.STDistance(GEOGRAPHY::Point(Latitude, Longitude, 4326))) a);

    -- Distance in Nautical Miles

    DECLARE @NMRange MONEY;

    SET @NMRange = (1852. * 5); -- Set it to 5NM

    SELECT Log_Details.*

    FROM Log_Details

    CROSS APPLY

    (

    SELECTTOP(1) *

    FROM@NearestAirports

    -- Only limit the log record to within 5NM from the airport and plane altitude is within 20 ft

    WHEREAirport_Geography.STDistance(GEOGRAPHY::Point(Latitude, Longitude, 4326)) <= @NMRange

    AND Baro_Alt - Airport_Elevation BETWEEN -20 AND 20

    ) a

    WHERE Log_Detail_Log_ID = @LogID;

    Thank you. I'm really grateful.

  • Given that, I hope your boss doesn't have the misconception that finding the nearest airport to any given airport has anything to with finding the shortest path between airports.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 10 posts - 16 through 24 (of 24 total)

You must be logged in to reply to this topic. Login to reply