Find Oldest Row Per Group?

  • This is my current script...

    ----------------------------------------

    SELECT Distinct

    "user_view"."fullName" as 'DRNAME',

    "FOLL_UP"."folder" as 'ACCT',

    "FOLL_UP"."pname" as 'PTNAME',

    case

    when def_type = 1 then 'Example1'

    when def_type = 2 then 'Example2'

    when def_type = 3 then 'Example3'

    end as "DTYPE",

    "DOCUMENT_NAMES"."TAG" as "DOCUMET NAME",

    e.pt_type as "PatType",

    DATEDIFF (day,"FOLL_UP"."AGEDATE", getdate()) as 'DAYS_SINCE',

    "FOLL_UP"."AGEDATE" as 'DATE'

    FROM "cab"."dbo"."user_view" "user_view" INNER JOIN "cab"."dbo"."FOLL_UP" "FOLL_UP" ON "user_view"."name"="FOLL_UP"."QUEUE"

    INNER JOIN "cab"."dbo"."DOCUMENT_NAMES" "DOCUMENT_NAMES" ON "FOLL_UP"."DOCTYPE_ID" = "DOCUMENT_NAMES"."DOCTYPE_ID" and "DOCUMENT_NAMES"."SET_ID" = 3

    INNER JOIN DB1..enc e (nolock) on e.enc = "FOLL_UP".folder and e.facility="FOLL_UP".facility

    WHERE "FOLL_UP"."FACILITY"='001' AND "FOLL_UP"."AGEDATE"<{ts '2012-03-20 00:00:00'} AND "FOLL_UP"."STATUS_ID"=2

    ORDER BY 'DRNAME' asc

    -----------------------------------------

    Results in..

    DRNAME, ACCT, PTNAME, DTYPE, DOCNAME, PATTYPE, Days Since, DATE

    Dr1, 1, PTNAME1, DTYPE1, DOCNAME1, PATTYPE1, Days Since = 20, SomeDate

    Dr1, 2, PTNAME2, DTYPE2, DOCNAME2, PATTYPE2, Days Since = 25, SomeDate

    Dr1, 3, PTNAME3, DTYPE3, DOCNAME3, PATTYPE3, Days Since = 10, SomeDate

    Dr2, 1, PTNAME1, DTYPE1, DOCNAME1, PATTYPE1, Days Since = 30, SomeDate

    Dr2, 2, PTNAME2, DTYPE2, DOCNAME2, PATTYPE2, Days Since = 19, SomeDate

    Dr2, 3, PTNAME3, DTYPE3, DOCNAME3, PATTYPE3, Days Since = 11, SomeDate

    Dr3, 1, PTNAME1, DTYPE1, DOCNAME1, PATTYPE1, Days Since = 2, SomeDate

    Dr3, 2, PTNAME2, DTYPE2, DOCNAME2, PATTYPE2, Days Since = 4, SomeDate

    Dr3, 3, PTNAME3, DTYPE3, DOCNAME3, PATTYPE3, Days Since = 5, SomeDate

    ------------------------------------------

    What I need to be able to get is just the following (Oldest Days Since per DRNAME Group)

    Dr1...., Days Since = 25

    Dr2...., Days Since = 30

    Dr3...., Days Since = 5

    I have been trying to figure this out using TOP and group by with no luck... I am new to this. Sorry if this is obvious.

    Thank you!

  • When you say you've had no luck, do you mean you got an error message, or did you get the wrong results? What did you try? Here's a hint: you need to use the whole expression in your MAX function and GROUP BY clause, not just the alias you've given them in your query.

    John

  • Can you provide an example. All my attempts were last week and I did not save them. I just kept getting such and such field is not included in aggregate group by and must be included.etc,etc. Something to that nature.

  • How about some ddl and sample data? Then post what you have tried. The error message is pretty self explanatory. When using aggregates all non-aggregated columns must be included in the group by.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Can't mess with your original query without lots of DDL, but I can show you how to take it from where you are now to where you want to be (with this):

    DECLARE @t TABLE

    (DRNAME VARCHAR(25), ACCT VARCHAR(25), PTNAME VARCHAR(25), DTYPE VARCHAR(25), DOCNAME VARCHAR(25)

    , PATTYPE VARCHAR(25), [Days Since] VARCHAR(25), DATE VARCHAR(25))

    INSERT INTO @t

    SELECT 'Dr1', 1, 'PTNAME1','DTYPE1','DOCNAME1','PATTYPE1','Days Since = 20','SomeDate'

    UNION ALL SELECT 'Dr1', 2, 'PTNAME2','DTYPE2','DOCNAME2','PATTYPE2','Days Since = 25','SomeDate'

    UNION ALL SELECT 'Dr1', 3, 'PTNAME3','DTYPE3','DOCNAME3','PATTYPE3','Days Since = 10','SomeDate'

    UNION ALL SELECT 'Dr2', 1, 'PTNAME1','DTYPE1','DOCNAME1','PATTYPE1','Days Since = 30','SomeDate'

    UNION ALL SELECT 'Dr2', 2, 'PTNAME2','DTYPE2','DOCNAME2','PATTYPE2','Days Since = 19','SomeDate'

    UNION ALL SELECT 'Dr2', 3, 'PTNAME3','DTYPE3','DOCNAME3','PATTYPE3','Days Since = 11','SomeDate'

    UNION ALL SELECT 'Dr3', 1, 'PTNAME1','DTYPE1','DOCNAME1','PATTYPE1','Days Since = 2','SomeDate'

    UNION ALL SELECT 'Dr3', 2, 'PTNAME2','DTYPE2','DOCNAME2','PATTYPE2','Days Since = 4','SomeDate'

    UNION ALL SELECT 'Dr3', 3, 'PTNAME3','DTYPE3','DOCNAME3','PATTYPE3','Days Since = 5','SomeDate'

    ;WITH cte AS (

    SELECT *, ROW_NUMBER() OVER (PARTITION BY DRNAME

    ORDER BY DRNAME, CAST(SUBSTRING([Days Since], 14, LEN([Days Since])) AS INT) DESC) As rk

    FROM @t)

    SELECT * FROM cte WHERE rk=1

    Edit: Be kind to your DB server and figure out a way to do without DISTINCT in your original query to improve performance. I see this over and over and it is usually there because someone didn't take the time to analyze the underlying data to determine if duplicates are even possible. If they are not, DISTINCT causes a performance hit.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • That looks great! Thanks Dwain I will give it a shot. I am really new to TSQL but should pick this up quick.

  • As another thought you should read this article about using the nolock hint.

    http://sqlblog.com/blogs/andrew_kelly/archive/2009/04/10/how-dirty-are-your-reads.aspx

    Basically, you should not use it, ever. Dirty reads can be acceptable in some situations but isolation levels are far better than using the nolock hint. It can lead to missing and/or duplicate data.

    Nice example Dwain. I think the OP saw how it works and can apply to their situation. 🙂

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (4/24/2012)


    As another thought you should read this article about using the nolock hint.

    http://sqlblog.com/blogs/andrew_kelly/archive/2009/04/10/how-dirty-are-your-reads.aspx

    Basically, you should not use it, ever. Dirty reads can be acceptable in some situations but isolation levels are far better than using the nolock hint. It can lead to missing and/or duplicate data.

    I will play devils advocate here and say NOLOCK definitely has a place. Often when it is used there is NO chance of missing/duplicate committed data. And using isolation level of READ UNCOMMITTED gets you the same problems. Others that allow concurrent reads open up other issues that come from a) potentially beating the crap out of temp db and b) optimistic locking.

    Use the right tool for the job, and NOLOCK is a valid tool in our toolbox. "should not use it, ever" only applies to one thing that I know of in SQL Server.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (4/25/2012)


    Sean Lange (4/24/2012)


    As another thought you should read this article about using the nolock hint.

    http://sqlblog.com/blogs/andrew_kelly/archive/2009/04/10/how-dirty-are-your-reads.aspx

    Basically, you should not use it, ever. Dirty reads can be acceptable in some situations but isolation levels are far better than using the nolock hint. It can lead to missing and/or duplicate data.

    I will play devils advocate here and say NOLOCK definitely has a place. Often when it is used there is NO chance of missing/duplicate committed data. And using isolation level of READ UNCOMMITTED gets you the same problems. Others that allow concurrent reads open up other issues that come from a) potentially beating the crap out of temp db and b) optimistic locking.

    Use the right tool for the job, and NOLOCK is a valid tool in our toolbox. "should not use it, ever" only applies to one thing that I know of in SQL Server.

    True that everything in SQL server has valid use based on the ever elusive "it depends". This one in particular is so over used/abused by people who simply think it is a magic "go fast" feature that I suggest they don't use it. It is often by used by people who don't understand what it does and some of the really interesting things it can produce.

    What in your opinion is the "one" thing to never use?

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • auto shrink 🙂

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • LOL that would certainly be a good candidate as a "feature" to remove in some version.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • dwain.c (4/23/2012)


    Can't mess with your original query without lots of DDL, but I can show you how to take it from where you are now to where you want to be (with this):

    DECLARE @t TABLE

    (DRNAME VARCHAR(25), ACCT VARCHAR(25), PTNAME VARCHAR(25), DTYPE VARCHAR(25), DOCNAME VARCHAR(25)

    , PATTYPE VARCHAR(25), [Days Since] VARCHAR(25), DATE VARCHAR(25))

    INSERT INTO @t

    SELECT 'Dr1', 1, 'PTNAME1','DTYPE1','DOCNAME1','PATTYPE1','Days Since = 20','SomeDate'

    UNION ALL SELECT 'Dr1', 2, 'PTNAME2','DTYPE2','DOCNAME2','PATTYPE2','Days Since = 25','SomeDate'

    UNION ALL SELECT 'Dr1', 3, 'PTNAME3','DTYPE3','DOCNAME3','PATTYPE3','Days Since = 10','SomeDate'

    UNION ALL SELECT 'Dr2', 1, 'PTNAME1','DTYPE1','DOCNAME1','PATTYPE1','Days Since = 30','SomeDate'

    UNION ALL SELECT 'Dr2', 2, 'PTNAME2','DTYPE2','DOCNAME2','PATTYPE2','Days Since = 19','SomeDate'

    UNION ALL SELECT 'Dr2', 3, 'PTNAME3','DTYPE3','DOCNAME3','PATTYPE3','Days Since = 11','SomeDate'

    UNION ALL SELECT 'Dr3', 1, 'PTNAME1','DTYPE1','DOCNAME1','PATTYPE1','Days Since = 2','SomeDate'

    UNION ALL SELECT 'Dr3', 2, 'PTNAME2','DTYPE2','DOCNAME2','PATTYPE2','Days Since = 4','SomeDate'

    UNION ALL SELECT 'Dr3', 3, 'PTNAME3','DTYPE3','DOCNAME3','PATTYPE3','Days Since = 5','SomeDate'

    ;WITH cte AS (

    SELECT *, ROW_NUMBER() OVER (PARTITION BY DRNAME

    ORDER BY DRNAME, CAST(SUBSTRING([Days Since], 14, LEN([Days Since])) AS INT) DESC) As rk

    FROM @t)

    SELECT * FROM cte WHERE rk=1

    Edit: Be kind to your DB server and figure out a way to do without DISTINCT in your original query to improve performance. I see this over and over and it is usually there because someone didn't take the time to analyze the underlying data to determine if duplicates are even possible. If they are not, DISTINCT causes a performance hit.

    In my table [Days Since] starts out as an INT not a VarChar. As a result I get the following error message "Argument data type int is invalid for argument 1 of substring function."

  • I FINALLY GOT IT! It turns out that the original query you gave me was ordering by MD Name and not Days Since.

    So here is the final query! Thank you Thank you for your help!

    ;WITH cte AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY [MD NAME] ORDER BY [DAYS_SINCE_ASSIGNED] desc,

    CAST (SUBSTRING(CAST([DAYS_SINCE_ASSIGNED] as varchar(38)), 14, LEN ([DAYS_SINCE_ASSIGNED])) AS INT) DESC) AS rk

    FROM [cabinet].[dbo].[vw_ToBeCompleted])

    SELECT * FROM cte where rk=1

  • Glad you got it so quickly.

    The issue about datatype of Days Since is a good demonstration of the need to provide DDL and sample data in a readily consumable format. Since you didn't, I took liberties and assumed that Days Since included that string in your sample.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply