July 18, 2017 at 1:49 pm
Hello,
I have a table of travel booking details (over several millions of records). Some of the records may have passenger's name, but some - may not, e.g. if there is an extra charge for a greeting upon arrival. I need to summarise the costs of the booking as well as to provide a first passenger's name, travelling under this booking. In the example below I need to show VERITY/SMITH MRS. When I use FIRST_VALUE it obviously shows THOMAS/SMITH MR. How can I achieve customer's request, please? I have seen some solutions, where I shall JOIN the table to itself, but I wonder if this is a right way to go with millions of records, even if there is a correct index in place.
Much appreciated!
IF OBJECT_ID('tempdb..##invoice') IS NOT NULL
DROP TABLE ##invoice;
WITH invoice AS(
SELECT 1 AS invoice_number,
'' AS passenger_name,
10 AS fare,
1 AS commission,
2 AS gst
UNION ALL
SELECT 1,
'VERITY/SMITH MRS',
100,
8,
20
UNION ALL
SELECT 1,
'THOMAS/SMITH MR',
98,
7,
10
UNION ALL
SELECT 1,
'',
4,
0,
0
)
SELECT *
INTO ##invoice
FROM invoice;
SELECT MIN(first_passenger) AS first_passenger,
SUM (fare) AS total_fare,
SUM(commission) AS total_comission,
SUM(gst) AS total_gst
FROM (
SELECT FIRST_VALUE(passenger_name) OVER (PARTITION BY invoice_number ORDER BY CASE passenger_name WHEN '' THEN 'Z' ELSE passenger_name END) AS first_passenger,
fare,
commission,
gst,
invoice_number
FROM ##invoice) i
GROUP BY i.invoice_number
July 18, 2017 at 2:05 pm
SQL Server does not guarantee an order unless you specify an order. Which field(s) are you using to specify your order?
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
July 18, 2017 at 2:06 pm
How do you know that 'VERITY/SMITH MRS' should be the correct value returned? There is nothing in the data that will tell you that. Just because 'VERITY/SMITH MRS' is entered first really doesn't mean anything as tables are unordered sets of data. You apply order by using the ORDER BY, as you did in your query which is why 'THOMAS/SMITH MR' is the value returned.
July 18, 2017 at 2:14 pm
I can also add a line to it, but then it will always return the first, empty record as from the example below, no?
IF OBJECT_ID('tempdb..##invoice') IS NOT NULL
DROP TABLE ##invoice;
WITH invoice AS(
SELECT 1 AS invoice_number,
1 AS line,
'' AS passenger_name,
10 AS fare,
1 AS commission,
2 AS gst
UNION ALL
SELECT 1,
2,
'VERITY/SMITH MRS',
100,
8,
20
UNION ALL
SELECT 1,3,
'THOMAS/SMITH MR',
98,
7,
10
UNION ALL
SELECT 1,4,
'',
4,
0,
0
)
SELECT *
INTO ##invoice
FROM invoice;
SELECT MIN(first_passenger) AS first_passenger,
SUM (fare) AS total_fare,
SUM(commission) AS total_comission,
SUM(gst) AS total_gst
FROM (
SELECT FIRST_VALUE(passenger_name) OVER (PARTITION BY invoice_number ORDER BY line, CASE passenger_name WHEN '' THEN 'Z' ELSE passenger_name END) AS first_passenger,
fare,
commission,
gst,
invoice_number
FROM ##invoice) i
GROUP BY i.invoice_number
July 18, 2017 at 2:35 pm
It might just be me, but I think you are making this more complicated than it needs to be.
The way I would do it would be to have your select something like:SELECT TOP 1 passenger_name, fare, commission, gst
FROM ##invoice
WHERE passenger_name not like ''
ORDER BY line
At least, by using your latest example bit of code. Without that "line" value, I don't see how to determine the order.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
July 18, 2017 at 2:44 pm
bmg002 - Tuesday, July 18, 2017 2:35 PMIt might just be me, but I think you are making this more complicated than it needs to be.The way I would do it would be to have your select something like:
SELECT TOP 1 passenger_name, fare, commission, gst
FROM ##invoice
WHERE passenger_name not like ''
ORDER BY line
At least, by using your latest example bit of code. Without that "line" value, I don't see how to determine the order.
I need to sum up fare, commission and gst as well, using TOP 1 will not help, I think.
July 18, 2017 at 2:55 pm
Ok, if you need to sum those up, you would want something more like:SELECT TOP 1 passenger_name, fare + commission + gst AS total_cost
FROM ##invoice
WHERE passenger_name not like ''
ORDER BY line
no? Or do you mean you need to get the sum of all of the fares, commissions and gst's for a particular invoice number? Something like:SELECT TOP 1 passenger_name,
SUM(fare) OVER (PARTITION BY invoice_number),
SUM(commission) OVER (PARTITION BY invoice_number),
SUM(gst) OVER (PARTITION BY invoice_number)
FROM ##invoice
WHERE passenger_name not like ''
ORDER BY line
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
July 18, 2017 at 2:59 pm
You can fix this by changing the ORDER BY in your FIRST_VALUE calculation.
Your first order field should be whether there is a name or not, and then by the line number.
SELECT MIN(first_passenger) AS first_passenger,
SUM (fare) AS total_fare,
SUM(commission) AS total_comission,
SUM(gst) AS total_gst
FROM (
SELECT FIRST_VALUE(passenger_name) OVER (PARTITION BY invoice_number ORDER BY CASE passenger_name WHEN '' THEN 1 ELSE 0 END, line) AS first_passenger,
fare,
commission,
gst,
invoice_number
FROM ##invoice) i
GROUP BY i.invoice_number
The following gives the expected results.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
July 18, 2017 at 2:59 pm
how about something like this?WITH invoice_sums AS
(SELECT invoice_number, passenger_name,
SUM(fare) OVER (PARTITION BY invoice_number) AS total_fare,
SUM(commission) OVER (PARTITION BY invoice_number) AS total_commission,
SUM(gst) OVER (PARTITION BY invoice_number) AS total_gst,
ROW_NUMBER() OVER (PARTITION BY invoice_number ORDER BY CASE WHEN passenger_name = '' THEN 2 ELSE 1 END, line) AS passenger
FROM ##invoice)
SELECT invoice_number, passenger_name, total_fare, total_commission, total_gst
FROM invoice_sums
WHERE passenger = 1;
July 18, 2017 at 3:00 pm
I'm guessing what is desired is for each invoice number, return the sum of each of the fare, commission, and gst columns, along with the "first" non-blank passenger_name associated with that invoice.
That's not so tricky, but we do need a way to determine what passenger_name should be "first"; if the arbitrary line column introduced earlier is an acceptable way to measure who is first, then one of the solutions above should be fine.
Cheers!
July 18, 2017 at 3:15 pm
Drew / Chris
Thank you both. I see now the solution was more than obvious. Massive thanks!
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply