March 20, 2015 at 1:41 pm
Hi all,
I am trying to write a query that gives me the personal records from speed skaters on e.g. the 500 mtrs.
I do this with the query:
SELECT cdsDistance AS Distance
, prsFirstName
, prsLastName
, min(crtFinalTime) AS MinTime
FROM tb....... INNER JOIN etc..
GROUP BY cdsDistance, prsFirstName, prsLastName
ORDER BY min(crtFinalTime)
In itself this works fine.
However, there are complicating factors. Sometimes a speed skater has multiple PRs, meaning the he/she has the same fastest time more than once.
If these times are achieved on multple days, the 1st date is the official PR. (meaning: "Min of racedate")
If they are raced on the same day the 1st race is the PR (meaning: "Min of distancenumber")
Changing the code to:
SELECT cdsDistance AS Distance
, prsFirstName
, prsLastName
, MIN(crtFinalTime) AS MinTime
, MIN(cdsStartDate) AS RaceDate
, MIN(cdsDistanceNumber) AS DistanceNumber
FROM tb.......
GROUP BY cdsDistance, prsFirstName, prsLastName
ORDER BY min(crtFinalTime)
This gives me the wrong outcome because it gives me the "MIN" of every field, and they are not necessarily on the same row.
An option would be to calculate min(crtFinalTime), if for a person there is more than 1 result, calculate min of date, and then (if there is still more than 1 row) min of distancenumber.
Seems complicated, and I have the feeling there must be a better way (apart from: how to get this code)
Stacking subqueries in the FROM statement seems like a option be costly (time wise). There are more than 10 million rows (and growing) to run through.
As an example a few times:
DistanceFirst nameLast name Time Date Distance nr.
500 Yuya Oikawa 34.49 201311155
500 Yuya Oikawa 34.49 201311153
500 Yuya Oikawa 34.49 201311172
Yuya has 3 best times (34.49), 15-11-2013 is the 1st date, then distance nr 3 is the 1st distance raced. Therefore the 2nd row is the only row I would like to get in my endresult.
Can this be done?
Thanks in advance
Hein
March 22, 2015 at 5:56 am
Quick suggestion, use either rank or row_number, here is a sample with the latter
😎
USE tempdb;
GO
SET NOCOUNT ON;
IF OBJECT_ID(N'dbo.TBL_SAMPLE_SKATER') IS NULL
BEGIN
CREATE TABLE dbo.TBL_SAMPLE_SKATER
(
Distance INT NOT NULL
,FirstName NVARCHAR(50) NOT NULL
,LastName NVARCHAR(50) NOT NULL
,FinishTimeSec NUMERIC(12,3) NOT NULL
,RaceDate DATE NOT NULL
,DistanceNumber INT NOT NULL
);
INSERT INTO dbo.TBL_SAMPLE_SKATER
(
Distance
,FirstName
,LastName
,FinishTimeSec
,RaceDate
,DistanceNumber
)
VALUES
(500,N'Yuya',N'Oikawa',34.49,'2013-11-15',5)
,(500,N'Yuya',N'Oikawa',34.49,'2013-11-15',3)
,(500,N'Yuya',N'Oikawa',34.49,'2013-11-17',2)
;
END
;WITH BASE_DATA AS
(
SELECT
ROW_NUMBER() OVER
(
PARTITION BY SSK.FirstName
,SSK.LastName
,SSK.Distance
ORDER BY SSK.FinishTimeSec ASC
,SSK.RaceDate ASC
,SSK.DistanceNumber ASC
) AS PersonalBest
,SSK.Distance
,SSK.FirstName
,SSK.LastName
,SSK.FinishTimeSec
,SSK.RaceDate
,SSK.DistanceNumber
FROM dbo.TBL_SAMPLE_SKATER SSK
)
SELECT
BD.PersonalBest
,BD.Distance
,BD.FirstName
,BD.LastName
,BD.FinishTimeSec
,BD.RaceDate
,BD.DistanceNumber
FROM BASE_DATA BD
WHERE BD.PersonalBest = 1
;
Results
PersonalBest Distance FirstName LastName FinishTimeSec RaceDate DistanceNumber
------------- ----------- ----------- --------- -------------- ---------- --------------
1 500 Yuya Oikawa 34.490 2013-11-15 3
March 22, 2015 at 1:11 pm
Hallo Eirikur,
Thanks for your reply! I was afraid it would go by unnoticed.
I have used your code and it works flawlessy!
I am fairly new to SQL Server but I also think I understand what you are doing
There is 1 thing I can't figure out, maybe you can help me with that
I have run your code with the desired outcome, after that I have added some record to go into the temp table but the old data (from the 1st run) keeps returning, nothing is added and/or deleted.
How can I delete this temp table, and create a new 1, with new data in it?
Over the weekend I have thought (but have not yet worked it out) about maybe another solution.
I thought that putting the first results in a temp table, and, in 2 steps, delete the rows for those skaters that have mulitple rows in the temp table based on date and distance number, might be a slution.
However: I like your solution better because it run in 1 go, mine solution would involve 3 runs
Considering the amount of data, your option seems fastest
Hein
PS
I don't know if you are a follower of speedskating but Norge seems to be one the right track!
PPS
I figured it out!
I made a mistake with the "Begin" part
I have changed the code to
IF OBJECT_ID(N'dbo.TBL_SAMPLE_SKATER') IS NOT NULL
BEGIN
DROP TABLE TBL_SAMPLE_SKATER
CREATE TABLE dbo.TBL_SAMPLE_SKATER
My option that I gave many tries was:
IF OBJECT_ID(N'dbo.TBL_SAMPLE_SKATER') IS NOT NULL
BEGIN
DROP TABLE TBL_SAMPLE_SKATER
END
Begin
CREATE TABLE ...
I have ended and restarted SQL SERVER, but not the computer (the temp table was still there when I looked after restarting)
I hope my solution will still work if I restart the computer
PPPS
Done, the extra end / begin that i tried many times suddenly seems to work 🙂
It also works if I add skaters and/or change data that's already present
Again: many thanks for putting me on the right track.
If you have any additional suggestions, I am all ears.
March 22, 2015 at 2:52 pm
There are two solutions, either add an explicit drop table statement before the IF or change the logic to drop the table every time if it exists, here is the code with the latter
😎
USE tempdb;
GO
SET NOCOUNT ON;
IF OBJECT_ID(N'dbo.TBL_SAMPLE_SKATER') IS NOT NULL DROP TABLE dbo.TBL_SAMPLE_SKATER;
CREATE TABLE dbo.TBL_SAMPLE_SKATER
(
Distance INT NOT NULL
,FirstName NVARCHAR(50) NOT NULL
,LastName NVARCHAR(50) NOT NULL
,FinishTimeSec NUMERIC(12,3) NOT NULL
,RaceDate DATE NOT NULL
,DistanceNumber INT NOT NULL
);
INSERT INTO dbo.TBL_SAMPLE_SKATER
(
Distance
,FirstName
,LastName
,FinishTimeSec
,RaceDate
,DistanceNumber
)
VALUES
(500,N'Yuya',N'Oikawa',34.49,'2013-11-15',5)
,(500,N'Yuya',N'Oikawa',34.49,'2013-11-15',3)
,(500,N'Yuya',N'Oikawa',34.49,'2013-11-17',2)
;
;WITH BASE_DATA AS
(
SELECT
ROW_NUMBER() OVER
(
PARTITION BY SSK.FirstName
,SSK.LastName
,SSK.Distance
ORDER BY SSK.FinishTimeSec ASC
,SSK.RaceDate ASC
,SSK.DistanceNumber ASC
) AS PersonalBest
,SSK.Distance
,SSK.FirstName
,SSK.LastName
,SSK.FinishTimeSec
,SSK.RaceDate
,SSK.DistanceNumber
FROM dbo.TBL_SAMPLE_SKATER SSK
)
SELECT
BD.PersonalBest
,BD.Distance
,BD.FirstName
,BD.LastName
,BD.FinishTimeSec
,BD.RaceDate
,BD.DistanceNumber
FROM BASE_DATA BD
WHERE BD.PersonalBest = 1
;
March 22, 2015 at 4:31 pm
Once again thank you,
I will change the code the way you suggested.
Thanks for putting me on the right track.
I will set this thread to solved
Hein
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply