April 24, 2015 at 10:11 am
Sorry to post it here, but have spent too many hours so far.
Using LEAD and LAG I was able to shrink my data set to the example below, but cannot progress from there, so any help will be much appreciated, please.
I have a list of flights - an itinerary with a request to find a final destination. The steps to find that final are as following:
1. Find the longest route first (done that using ROW_NUMBER() )
2. check if previous or next segments have longer waiting time AND the mileage to that destination from the origin divided by longest mileage is no less than 75%
if yes - select the one with the longest waiting time.
The example below had originally multiple number of segments, reduced now to three:
- flight from Mumbai to Deli has a distance of 708 miles and the waiting time for the next flight is 6 hours.
- flight from Deli to Shanghai Pudong - a distance of 3174 miles between Mumbai - original destination and Shanghai, and the waiting time for the next flight is 5 hours.
- flight from Shanghai Pudong to Zhoushan Putuoshan airport - a distance from Mumbai to Zhoushan Putuoshan is 3171 and the waiting time is 18 hours.
The expected result is Zhoushan Putuoshan. although it is closer to Mumbai by several miles, it is still the final destination of the flight.
Thank you all in advance!
WITH flight_details AS
(
SELECT 7 AS rownumByHighestMileage,
'BOM' AS from_airport,
'DEL' AS to_airport,
'DEL' AS next_from_airport,
'PVG' AS next_to_airport,
708 AS mileage_from_origin,
6 AS waiting_time_in_hours
UNION ALL
SELECT 1 AS rownumByHighestMileage,
'DEL' AS from_airport,
'PVG' AS to_airport,
'PVG' AS next_from_airport,
'HSN' AS next_to_airport,
3174 AS mileage_from_origin,
5 AS waiting_time
UNION ALL
SELECT 5 AS rownumByHighestMileage,
'PVG' AS from_airport,
'HSN' AS to_airport,
NULL AS next_from_airport,
NULL AS next_to_airport,
3171 AS mileage_from_origin,
18 AS waiting_time
)
SELECT * FROM flight_details
April 24, 2015 at 10:15 am
We can help but first you have to help us. We need some information here to offer any assistance. Please take a few minutes and read the first link in my signature for best practices when posting questions.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
April 24, 2015 at 10:19 am
Sean Lange (4/24/2015)
We can help but first you have to help us. We need some information here to offer any assistance. Please take a few minutes and read the first link in my signature for best practices when posting questions.
Thank you for your comment. I didn't throw the data, but placed it as valid SELECT script. What value there would be of creating a table, as per your link?
Also, to my mind at least, I have clearly described the requirement - to find a final destination from the line above and below the one with longest mileage.
Can you advise what is missing then?
Kind Regards
April 24, 2015 at 10:26 am
BOR15K (4/24/2015)
Sean Lange (4/24/2015)
We can help but first you have to help us. We need some information here to offer any assistance. Please take a few minutes and read the first link in my signature for best practices when posting questions.Thank you for your comment. I didn't throw the data, but placed it as valid SELECT script. What value there would be of creating a table, as per your link?
Also, to my mind at least, I have clearly described the requirement - to find a final destination from the line above and below the one with longest mileage.
Can you advise what is missing then?
Kind Regards
Here is your explanation of the desired output:
- flight from Mumbai to Deli has a distance of 708 miles and the waiting time for the next flight is 6 hours.
- flight from Deli to Shanghai Pudong - a distance of 3174 miles between Mumbai - original destination and Shanghai, and the waiting time for the next flight is 5 hours.
- flight from Shanghai Pudong to Zhoushan Putuoshan airport - a distance from Mumbai to Zhoushan Putuoshan is 3171 and the waiting time is 18 hours.
The expected result is Zhoushan Putuoshan. although it is closer to Mumbai by several miles, it is still the final destination of the flight.
How do come up with that result and what is "Zhoushan Putuoshan"? Is that an airport? How do we know where that value comes from? All we have is some airport codes and I am not familiar with them enough to know which one is which.
The biggest challenge here is that it is not clear to me at all what you are trying to do.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
April 24, 2015 at 3:24 pm
Sean Lange (4/24/2015)
BOR15K (4/24/2015)
Sean Lange (4/24/2015)
We can help but first you have to help us. We need some information here to offer any assistance. Please take a few minutes and read the first link in my signature for best practices when posting questions.Thank you for your comment. I didn't throw the data, but placed it as valid SELECT script. What value there would be of creating a table, as per your link?
Also, to my mind at least, I have clearly described the requirement - to find a final destination from the line above and below the one with longest mileage.
Can you advise what is missing then?
Kind Regards
Here is your explanation of the desired output:
- flight from Mumbai to Deli has a distance of 708 miles and the waiting time for the next flight is 6 hours.
- flight from Deli to Shanghai Pudong - a distance of 3174 miles between Mumbai - original destination and Shanghai, and the waiting time for the next flight is 5 hours.
- flight from Shanghai Pudong to Zhoushan Putuoshan airport - a distance from Mumbai to Zhoushan Putuoshan is 3171 and the waiting time is 18 hours.
The expected result is Zhoushan Putuoshan. although it is closer to Mumbai by several miles, it is still the final destination of the flight.
How do come up with that result and what is "Zhoushan Putuoshan"? Is that an airport? How do we know where that value comes from? All we have is some airport codes and I am not familiar with them enough to know which one is which.
The biggest challenge here is that it is not clear to me at all what you are trying to do.
I understand now. I do know how to calculate the values, so there is no need to focus on them.
With regards to the codes and names, I assumed they are self-explanatory and it will be easier to read the names, so apologies:
BOM = Mumbai
DEL = Deli
PVG = Shanghai Pudong
HSN = Zhoushan Putuoshan
I believe it will make sense now. Again, there is no need to calculate the distances - they have been already calculated by me.
April 24, 2015 at 3:33 pm
BOR15K (4/24/2015)
Sean Lange (4/24/2015)
BOR15K (4/24/2015)
Sean Lange (4/24/2015)
We can help but first you have to help us. We need some information here to offer any assistance. Please take a few minutes and read the first link in my signature for best practices when posting questions.Thank you for your comment. I didn't throw the data, but placed it as valid SELECT script. What value there would be of creating a table, as per your link?
Also, to my mind at least, I have clearly described the requirement - to find a final destination from the line above and below the one with longest mileage.
Can you advise what is missing then?
Kind Regards
Here is your explanation of the desired output:
- flight from Mumbai to Deli has a distance of 708 miles and the waiting time for the next flight is 6 hours.
- flight from Deli to Shanghai Pudong - a distance of 3174 miles between Mumbai - original destination and Shanghai, and the waiting time for the next flight is 5 hours.
- flight from Shanghai Pudong to Zhoushan Putuoshan airport - a distance from Mumbai to Zhoushan Putuoshan is 3171 and the waiting time is 18 hours.
The expected result is Zhoushan Putuoshan. although it is closer to Mumbai by several miles, it is still the final destination of the flight.
How do come up with that result and what is "Zhoushan Putuoshan"? Is that an airport? How do we know where that value comes from? All we have is some airport codes and I am not familiar with them enough to know which one is which.
The biggest challenge here is that it is not clear to me at all what you are trying to do.
I understand now. I do know how to calculate the values, so there is no need to focus on them.
With regards to the codes and names, I assumed they are self-explanatory and it will be easier to read the names, so apologies:
BOM = Mumbai
DEL = Deli
PVG = Shanghai Pudong
HSN = Zhoushan Putuoshan
I believe it will make sense now. Again, there is no need to calculate the distances - they have been already calculated by me.
Without the DDL for the table(s) and sample data, how do you expect us to help with the code you need? We can't see what you unless you give us the DDL, sample data, and expected results. Plus, for the effort you put in to post this information, you will get back tested code.
April 24, 2015 at 3:50 pm
Without the DDL for the table(s) and sample data, how do you expect us to help with the code you need? We can't see what you unless you give us the DDL, sample data, and expected results. Plus, for the effort you put in to post this information, you will get back tested code.
I see you haven't ran my script, otherwise would get a working result.
Since when people required to publish DDL scripts here? Should I go through and show multiple examples of WITH AS,
which returned the list of data and it sufficed? A simple SELECT * FROM flight_details of mine WITH flight_details AS () returns a valid result.
You don't want to help? That is OK, you don't have to, but please do not start throw things over one which you don't do over another.
Expected results? I am sorry, but I have clearly stated what is the result I want as well - the last line, with a mileage of 3171.
April 24, 2015 at 9:42 pm
Nevermind
__________________________________________________________________________________________________________
How to Post to get the most: http://www.sqlservercentral.com/articles/Best+Practices/61537/
April 25, 2015 at 8:06 am
BOR15K (4/24/2015)
Without the DDL for the table(s) and sample data, how do you expect us to help with the code you need? We can't see what you unless you give us the DDL, sample data, and expected results. Plus, for the effort you put in to post this information, you will get back tested code.
I see you haven't ran my script, otherwise would get a working result.
Since when people required to publish DDL scripts here? Should I go through and show multiple examples of WITH AS,
which returned the list of data and it sufficed? A simple SELECT * FROM flight_details of mine WITH flight_details AS () returns a valid result.
You don't want to help? That is OK, you don't have to, but please do not start throw things over one which you don't do over another.
Expected results? I am sorry, but I have clearly stated what is the result I want as well - the last line, with a mileage of 3171.
Sean and Lynn, along with almost everyone else, requests DDL, sample data and expected results. This is nothing new. We need to know what you have, what you want and the rules for getting to what you want. We can then figure out the best route for getting from what you have to what you want. Without knowing these three basic pieces of information, there's really no way to provide concrete help. Without this information, anything we do manage to give you will be completely untested, as we have nothing to test against. I've seen people take stabs in the dark before with very limited information. Sometimes, they come up with the answer the OP was looking for, but they're guesses. To provide anything more than a guess, we need the basics.
For example, you state above that you want the last row with the mileage of 3171. Okay, so select your data using whatever order by clause you feel is appropriate and then select the appropriate row from the set. But that's not the answer you're after. In order to give you the answer I THINK you want, I would need to create a table, populate it with some data and then write queries against it to see if I can figure out how to produce your desired results. Without the basics, there's no foundation built on which to work. There's nowhere to start.
And pulling an attitude with good people like Sean and Lynn is certainly no way to solicit help.
Instead, you'll probably want to start here: https://msdn.microsoft.com/en-us/library/hh231256%28v=sql.110%29.aspx
Also, if you don't have it, I would recommend Itzik's book on window functions: http://www.amazon.com/Microsoft-High-Performance-Functions-Developer-Reference/dp/0735658366
April 26, 2015 at 6:27 am
I really am not sure if I understand the business rules....we have only been presented with three rows.
am also concerned about the comment re "previous or next segments".....how are these determined?
in an effort to help the OP move forward.....I will throw the following into the mix and ask OP to determine if the result set is correct, and if not then why not.....maybe then better solutions will come forward.
WITH flight_details AS
(
SELECT
'scenario1' as example,
7 AS rownumByHighestMileage,
'BOM' AS from_airport,
'DEL' AS to_airport,
'DEL' AS next_from_airport,
'PVG' AS next_to_airport,
708 AS mileage_from_origin,
6 AS waiting_time_in_hours
UNION ALL
SELECT
'scenario1' as example,
1 AS rownumByHighestMileage,
'DEL' AS from_airport,
'PVG' AS to_airport,
'PVG' AS next_from_airport,
'HSN' AS next_to_airport,
3174 AS mileage_from_origin,
5 AS waiting_time
UNION ALL
SELECT
'scenario1' as example,
5 AS rownumByHighestMileage,
'PVG' AS from_airport,
'HSN' AS to_airport,
NULL AS next_from_airport,
NULL AS next_to_airport,
3171 AS mileage_from_origin,
18 AS waiting_time
UNION ALL
SELECT
'scenario2' as example,
7 AS rownumByHighestMileage,
'BOM' AS from_airport,
'DEL' AS to_airport,
'DEL' AS next_from_airport,
'PVG' AS next_to_airport,
2411 AS mileage_from_origin,
20 AS waiting_time_in_hours
UNION ALL
SELECT
'scenario2' as example,
1 AS rownumByHighestMileage,
'DEL' AS from_airport,
'PVG' AS to_airport,
'PVG' AS next_from_airport,
'HSN' AS next_to_airport,
3200 AS mileage_from_origin,
5 AS waiting_time
UNION ALL
SELECT
'scenario2' as example,
5 AS rownumByHighestMileage,
'PVG' AS from_airport,
'HSN' AS to_airport,
NULL AS next_from_airport,
NULL AS next_to_airport,
3171 AS mileage_from_origin,
12 AS waiting_time
UNION ALL
SELECT
'scenario3' as example,
7 AS rownumByHighestMileage,
'BOM' AS from_airport,
'DEL' AS to_airport,
'DEL' AS next_from_airport,
'PVG' AS next_to_airport,
1700 AS mileage_from_origin,
5 AS waiting_time_in_hours
UNION ALL
SELECT
'scenario3' as example,
1 AS rownumByHighestMileage,
'DEL' AS from_airport,
'PVG' AS to_airport,
'PVG' AS next_from_airport,
'HSN' AS next_to_airport,
2000 AS mileage_from_origin,
15 AS waiting_time
UNION ALL
SELECT
'scenario3' as example,
5 AS rownumByHighestMileage,
'PVG' AS from_airport,
'HSN' AS to_airport,
NULL AS next_from_airport,
NULL AS next_to_airport,
1900 AS mileage_from_origin,
8 AS waiting_time
)
--SELECT * FROM flight_details
, cte2 as (
SELECT
example
, rownumByHighestMileage
, from_airport
, to_airport
, next_from_airport
, next_to_airport
, mileage_from_origin
, waiting_time_in_hours
, mileage_from_origin * 1.00 / MAX(mileage_from_origin) OVER (PARTITION BY example) AS mm
FROM flight_details AS fd
)
, cte3 as (
SELECT
example
, rownumByHighestMileage
, from_airport
, to_airport
, next_from_airport
, next_to_airport
, mileage_from_origin
, waiting_time_in_hours
, mm
, ROW_NUMBER() OVER (PARTITION BY example ORDER BY waiting_time_in_hours DESC) AS rn
FROM cte2
WHERE mm >= 0.75
)
SELECT
example
, rownumByHighestMileage
, from_airport
, to_airport
, next_from_airport
, next_to_airport
, mileage_from_origin
, waiting_time_in_hours
FROM cte3
WHERE (rn = 1);
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
April 27, 2015 at 3:53 am
Thank you for the reply.
I have only posted 3 lines, as for each itinerary I am able to eliminate the rest of the segments and to remain with 3 only:
the furthermost destination and two other segments: one before and one after. Additional example to clarify, if there is an itinerary of four
segments:
Vladivostok to Moscow
Moscow to Paris
Paris to London
London to Amsterdam
From here I know to extract three relevant lines (I omit SQL script on purpose, as I know how to do it):
1. First I get London as furthest from Vladivostok, based on coordinates calculation (latest SQL all come with nice GEOGRAPHY:: feature)
2. I know to look into the previous (Paris) and next (Amsterdam) lines, using LEAD and LAG
What I don't know to implement in one SELECT, but implemented into two different SELECT's is what will be the major destination in this case .
Thank you again for the answer.
April 27, 2015 at 7:01 am
Trying to restate the problem
with LongestSegments (itinerary_id, segment_id, mileage_from_origin, waiting_time) as (
-- Get the 'longest' flight (AKA segment) for every itinerary.
-- No problems here.
-- Sample data
SELECT
100 as itinerary_id,
10001 as segment_id,
3174. AS mileage_from_origin,
5. AS waiting_time
), LongestSegmentsNeighbors (itinerary_id, segment_id, mileage_from_origin, waiting_time) as (
-- Get up to 2 segments, the previous one if any and the next one if any for the longest segment of the itinerary.
-- No problems here.
-- Sample data
SELECT
100 as itinerary_id,
10020 as segment_id,
708. AS mileage_from_origin,
6. AS waiting_time_in_hours
UNION ALL
SELECT
100 as itinerary_id,
10030 as segment_id,
3171. AS mileage_from_origin,
18. AS waiting_time
)
-- how to choose one of the neighbors if any exists base on criteria ?
SELECT ls.itinerary_id
, ls.segment_id AS longest_segment_id
, ls.mileage_from_origin AS longest_mileage_from_origin
, ls.waiting_time AS longest_waiting_time
, ln.segment_id AS ln_id
, ln.mileage_from_origin AS ln_mileage_from_origin
, ln.waiting_time AS ln_waiting_time
FROM LongestSegments ls
OUTER APPLY (
SELECT TOP(1) segment_id, mileage_from_origin, waiting_time
FROM LongestSegmentsNeighbors ln
WHERE ln.itinerary_id = ls.itinerary_id
AND ln.mileage_from_origin/ls.mileage_from_origin > 0.75
AND ln.waiting_time > ls.waiting_time
ORDER BY waiting_time DESC) ln
April 27, 2015 at 7:48 am
BOR15K (4/24/2015)
...I have a list of flights - an itinerary with a request to find a final destination. The steps to find that final are as following:1. Find the longest route first (done that using ROW_NUMBER() )
2. check if previous or next segments have longer waiting time AND the mileage to that destination from the origin divided by longest mileage is no less than 75%
if yes - select the one with the longest waiting time.
...
I'm sorry but I don't see any relationship between the requirement - identifying the final destination - and the suggested steps. It's nonsense. You should be using a standardised datetime and the origin and destination of each leg compared to the others in the itinerary set. You will have to account for stopovers, legs travelled by means other than air, and itineraries which have repeats of the same leg, e.g.
BOM -> DEL
DEL -> PVG
PVG -> DEL
DEL-> PVG
PVG -> HSN
You cannot determine the final destination using waiting time and distance from origin alone. It won't work.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
April 27, 2015 at 8:00 am
Thank you Chris,
I have implemented that already, based on user's requirements to apply only the described approach.
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply