Limiting query results with 3 criteria for each record

  • Hi all,

    I am trying to write a query that gives me the personal records from speed skaters on e.g. the 500 mtrs.

    I do this with the query:

    SELECT cdsDistance AS Distance

    , prsFirstName

    , prsLastName

    , min(crtFinalTime) AS MinTime

    FROM tb....... INNER JOIN etc..

    GROUP BY cdsDistance, prsFirstName, prsLastName

    ORDER BY min(crtFinalTime)

    In itself this works fine.

    However, there are complicating factors. Sometimes a speed skater has multiple PRs, meaning the he/she has the same fastest time more than once.

    If these times are achieved on multple days, the 1st date is the official PR. (meaning: "Min of racedate")

    If they are raced on the same day the 1st race is the PR (meaning: "Min of distancenumber")

    Changing the code to:

    SELECT cdsDistance AS Distance

    , prsFirstName

    , prsLastName

    , MIN(crtFinalTime) AS MinTime

    , MIN(cdsStartDate) AS RaceDate

    , MIN(cdsDistanceNumber) AS DistanceNumber

    FROM tb.......

    GROUP BY cdsDistance, prsFirstName, prsLastName

    ORDER BY min(crtFinalTime)

    This gives me the wrong outcome because it gives me the "MIN" of every field, and they are not necessarily on the same row.

    An option would be to calculate min(crtFinalTime), if for a person there is more than 1 result, calculate min of date, and then (if there is still more than 1 row) min of distancenumber.

    Seems complicated, and I have the feeling there must be a better way (apart from: how to get this code)

    Stacking subqueries in the FROM statement seems like a option be costly (time wise). There are more than 10 million rows (and growing) to run through.

    As an example a few times:

    DistanceFirst nameLast name Time Date Distance nr.

    500 Yuya Oikawa 34.49 201311155

    500 Yuya Oikawa 34.49 201311153

    500 Yuya Oikawa 34.49 201311172

    Yuya has 3 best times (34.49), 15-11-2013 is the 1st date, then distance nr 3 is the 1st distance raced. Therefore the 2nd row is the only row I would like to get in my endresult.

    Can this be done?

    Thanks in advance

    Hein

  • Quick suggestion, use either rank or row_number, here is a sample with the latter

    😎

    USE tempdb;

    GO

    SET NOCOUNT ON;

    IF OBJECT_ID(N'dbo.TBL_SAMPLE_SKATER') IS NULL

    BEGIN

    CREATE TABLE dbo.TBL_SAMPLE_SKATER

    (

    Distance INT NOT NULL

    ,FirstName NVARCHAR(50) NOT NULL

    ,LastName NVARCHAR(50) NOT NULL

    ,FinishTimeSec NUMERIC(12,3) NOT NULL

    ,RaceDate DATE NOT NULL

    ,DistanceNumber INT NOT NULL

    );

    INSERT INTO dbo.TBL_SAMPLE_SKATER

    (

    Distance

    ,FirstName

    ,LastName

    ,FinishTimeSec

    ,RaceDate

    ,DistanceNumber

    )

    VALUES

    (500,N'Yuya',N'Oikawa',34.49,'2013-11-15',5)

    ,(500,N'Yuya',N'Oikawa',34.49,'2013-11-15',3)

    ,(500,N'Yuya',N'Oikawa',34.49,'2013-11-17',2)

    ;

    END

    ;WITH BASE_DATA AS

    (

    SELECT

    ROW_NUMBER() OVER

    (

    PARTITION BY SSK.FirstName

    ,SSK.LastName

    ,SSK.Distance

    ORDER BY SSK.FinishTimeSec ASC

    ,SSK.RaceDate ASC

    ,SSK.DistanceNumber ASC

    ) AS PersonalBest

    ,SSK.Distance

    ,SSK.FirstName

    ,SSK.LastName

    ,SSK.FinishTimeSec

    ,SSK.RaceDate

    ,SSK.DistanceNumber

    FROM dbo.TBL_SAMPLE_SKATER SSK

    )

    SELECT

    BD.PersonalBest

    ,BD.Distance

    ,BD.FirstName

    ,BD.LastName

    ,BD.FinishTimeSec

    ,BD.RaceDate

    ,BD.DistanceNumber

    FROM BASE_DATA BD

    WHERE BD.PersonalBest = 1

    ;

    Results

    PersonalBest Distance FirstName LastName FinishTimeSec RaceDate DistanceNumber

    ------------- ----------- ----------- --------- -------------- ---------- --------------

    1 500 Yuya Oikawa 34.490 2013-11-15 3

  • Hallo Eirikur,

    Thanks for your reply! I was afraid it would go by unnoticed.

    I have used your code and it works flawlessy!

    I am fairly new to SQL Server but I also think I understand what you are doing

    There is 1 thing I can't figure out, maybe you can help me with that

    I have run your code with the desired outcome, after that I have added some record to go into the temp table but the old data (from the 1st run) keeps returning, nothing is added and/or deleted.

    How can I delete this temp table, and create a new 1, with new data in it?

    Over the weekend I have thought (but have not yet worked it out) about maybe another solution.

    I thought that putting the first results in a temp table, and, in 2 steps, delete the rows for those skaters that have mulitple rows in the temp table based on date and distance number, might be a slution.

    However: I like your solution better because it run in 1 go, mine solution would involve 3 runs

    Considering the amount of data, your option seems fastest

    Hein

    PS

    I don't know if you are a follower of speedskating but Norge seems to be one the right track!

    PPS

    I figured it out!

    I made a mistake with the "Begin" part

    I have changed the code to

    IF OBJECT_ID(N'dbo.TBL_SAMPLE_SKATER') IS NOT NULL

    BEGIN

    DROP TABLE TBL_SAMPLE_SKATER

    CREATE TABLE dbo.TBL_SAMPLE_SKATER

    My option that I gave many tries was:

    IF OBJECT_ID(N'dbo.TBL_SAMPLE_SKATER') IS NOT NULL

    BEGIN

    DROP TABLE TBL_SAMPLE_SKATER

    END

    Begin

    CREATE TABLE ...

    I have ended and restarted SQL SERVER, but not the computer (the temp table was still there when I looked after restarting)

    I hope my solution will still work if I restart the computer

    PPPS

    Done, the extra end / begin that i tried many times suddenly seems to work 🙂

    It also works if I add skaters and/or change data that's already present

    Again: many thanks for putting me on the right track.

    If you have any additional suggestions, I am all ears.

  • There are two solutions, either add an explicit drop table statement before the IF or change the logic to drop the table every time if it exists, here is the code with the latter

    😎

    USE tempdb;

    GO

    SET NOCOUNT ON;

    IF OBJECT_ID(N'dbo.TBL_SAMPLE_SKATER') IS NOT NULL DROP TABLE dbo.TBL_SAMPLE_SKATER;

    CREATE TABLE dbo.TBL_SAMPLE_SKATER

    (

    Distance INT NOT NULL

    ,FirstName NVARCHAR(50) NOT NULL

    ,LastName NVARCHAR(50) NOT NULL

    ,FinishTimeSec NUMERIC(12,3) NOT NULL

    ,RaceDate DATE NOT NULL

    ,DistanceNumber INT NOT NULL

    );

    INSERT INTO dbo.TBL_SAMPLE_SKATER

    (

    Distance

    ,FirstName

    ,LastName

    ,FinishTimeSec

    ,RaceDate

    ,DistanceNumber

    )

    VALUES

    (500,N'Yuya',N'Oikawa',34.49,'2013-11-15',5)

    ,(500,N'Yuya',N'Oikawa',34.49,'2013-11-15',3)

    ,(500,N'Yuya',N'Oikawa',34.49,'2013-11-17',2)

    ;

    ;WITH BASE_DATA AS

    (

    SELECT

    ROW_NUMBER() OVER

    (

    PARTITION BY SSK.FirstName

    ,SSK.LastName

    ,SSK.Distance

    ORDER BY SSK.FinishTimeSec ASC

    ,SSK.RaceDate ASC

    ,SSK.DistanceNumber ASC

    ) AS PersonalBest

    ,SSK.Distance

    ,SSK.FirstName

    ,SSK.LastName

    ,SSK.FinishTimeSec

    ,SSK.RaceDate

    ,SSK.DistanceNumber

    FROM dbo.TBL_SAMPLE_SKATER SSK

    )

    SELECT

    BD.PersonalBest

    ,BD.Distance

    ,BD.FirstName

    ,BD.LastName

    ,BD.FinishTimeSec

    ,BD.RaceDate

    ,BD.DistanceNumber

    FROM BASE_DATA BD

    WHERE BD.PersonalBest = 1

    ;

  • Once again thank you,

    I will change the code the way you suggested.

    Thanks for putting me on the right track.

    I will set this thread to solved

    Hein

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply