UNION a series of TOP statements

  • Greetings all,

    I need to select the top 5 rows for particular codes and then union and sort the results. I am using a derived table because each top statement needs an ORDER BY statement which is not allowed in UNION statements. See my sample data and my select statement below.

    With my production data the results take about 2 minutes to return BUT each of the individual TOP statements takes 3 seconds. I guess I could insert each individual TOP statement into a temp table but I'd prefer not to go that route.

    Can anyone suggest any alternatives?

    CREATE TABLE [dbo].[Test](

    [Code] [char](1) NULL,

    [AsofDate] [datetime] NULL,

    [MyValue] [int] NULL

    ) ON [PRIMARY]

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('A','2/8/2010',1)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('A','2/9/2010',2)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('A','2/10/2010',3)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('A','2/11/2010',4)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('A','2/12/2010',5)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('A','2/13/2010',6)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('A','2/14/2010',7)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('A','2/15/2010',8)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('A','2/16/2010',9)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('A','2/17/2010',10)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('A','2/18/2010',11)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('A','2/19/2010',12)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('B','2/8/2010',1)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('B','2/9/2010',2)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('B','2/10/2010',3)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('B','2/11/2010',4)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('B','2/12/2010',5)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('B','2/13/2010',6)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('B','2/14/2010',7)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('B','2/15/2010',8)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('B','2/16/2010',9)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('B','2/17/2010',10)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('B','2/18/2010',11)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('B','2/19/2010',12)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('C','2/8/2010',1)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('C','2/9/2010',2)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('C','2/10/2010',3)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('C','2/11/2010',4)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('C','2/12/2010',5)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('C','2/13/2010',6)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('C','2/14/2010',7)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('C','2/15/2010',8)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('C','2/16/2010',9)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('C','2/17/2010',10)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('C','2/18/2010',11)

    INSERT INTO TEST ([Code], [AsofDate], [MyValue]) VALUES('C','2/19/2010',12)

    SELECT

    [Code],

    [AsofDate],

    [MyValue]

    FROM

    (

    SELECT TOP 5

    [Code],

    [AsofDate],

    [MyValue]

    FROM Test

    WHERE Code = 'A'

    AND AsofDate < dateadd(dd,0, datediff(dd,0,getdate()))--Don't select current or future days

    AND DATEPART(weekday,AsofDate) NOT IN (7,1) --Don't select weekends

    ORDER BY AsofDate DESC

    ) AS A

    UNION ALL

    SELECT

    [Code],

    [AsofDate],

    [MyValue]

    FROM

    (

    SELECT TOP 5

    [Code],

    [AsofDate],

    [MyValue]

    FROM Test

    WHERE Code = 'B'

    AND AsofDate < dateadd(dd,0, datediff(dd,0,getdate()))--Don't select current or future days

    AND DATEPART(weekday,AsofDate) NOT IN (7,1) --Don't select weekends

    ORDER BY AsofDate DESC

    ) AS B

    UNION ALL

    SELECT

    [Code],

    [AsofDate],

    [MyValue]

    FROM

    (

    SELECT TOP 5

    [Code],

    [AsofDate],

    [MyValue]

    FROM Test

    WHERE Code = 'C'

    AND AsofDate < dateadd(dd,0, datediff(dd,0,getdate()))--Don't select current or future days

    AND DATEPART(weekday,AsofDate) NOT IN (7,1) --Don't select weekends

    ORDER BY AsofDate DESC

    ) AS C

    ORDER BY AsofDate, Code

    DROP TABLE Test

  • Try using RowNumber instead

    SELECT [Code],

    [AsofDate],

    [MyValue]

    FROM (

    SELECT [Code],

    [AsofDate],

    [MyValue],

    ROW_NUMBER() OVER (PARTITION BY Code ORDER BY AsofDate DESC) AS RowNo

    FROM Test

    AND AsofDate < dateadd(dd,0, datediff(dd,0,getdate()))--Don't select current or future days

    AND DATEPART(weekday,AsofDate) NOT IN (7,1) --Don't select weekends

    ) sub

    WHERE RowNo <= 5

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Wow, subsecond! thanks.

  • My pleasure.

    I can probably get it even faster if you want, with some appropriate indexes.

    If you're interested, what are the current indexes on the table that you're running this against?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I'm actually querying a view in a vendor provided database. The underlying table seems properly indexed from a cursory review. I won't be able to change any of the indexing on the underlying table so I don't need to proceed but thank you for the generous offer of your time.

  • The ROW_NUMBER() method will often be the fastest available, and a very consistent performer, but there are cases where a query based on an APPLY might be faster still. This would generally be where a list of unique 'code' values is available, perhaps from a reference table, and where the TOP (n) rows for each value of 'code' can be retrieved using an index seek - but only if the required number of rows is much less than the number of rows in the table for each code.

    For completeness then, here is sample code for the APPLY solution:

    -- Week starts on Monday

    SET DATEFIRST 1;

    -- The instant 'today' started

    DECLARE @TodayMidnight DATETIME;

    SET @TodayMidnight = DATEADD(DAY, 0, DATEDIFF(DAY, 0, CURRENT_TIMESTAMP));

    SELECT Codes.code,

    Top5.AsofDate,

    Top5.MyValue

    FROM (

    -- Unique values for 'code

    SELECT DISTINCT Code

    FROM Test

    )

    AS Codes (code)

    CROSS

    APPLY (

    SELECT TOP (5)

    AsofDate, MyValue

    FROM Test T

    WHERE T.Code = Codes.code

    AND AsofDate < @TodayMidnight -- Only past days

    AND DATEPART(WEEKDAY, AsofDate) BETWEEN 1 AND 5 -- Not weekends

    ORDER BY AsofDate DESC

    ) Top5;

    Paul

  • When using DATEPART(WEEKDAY, AsofDate) you need to make sure to have DATEFIRST set to 7. Otherwise you'll get incorrect results.

    (see one of the recent articles: "Finding the Correct Weekday Regardless of DateFirst")



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • lmu92 (2/21/2010)


    When using DATEPART(WEEKDAY, AsofDate) you need to make sure to have DATEFIRST set to 7. Otherwise you'll get incorrect results.

    Lutz,

    You sure about that?

    Monday is the first day of the week for me, hence the SET DATEFIRST 1 statement. When excluding the weekend, I look for DATEPART(WEEKDAY, AsofDate) BETWEEN 1 AND 5.

    I look forward to seeing your code to prove that this is incorrect and the DATEPART only works for a magic DATEFIRST value of 7. 😛

    Paul

  • Sorry Gail, but I confuse easily. In the code you posted:

    SELECT [Code],

    [AsofDate],

    [MyValue]

    FROM (

    SELECT [Code],

    [AsofDate],

    [MyValue],

    ROW_NUMBER() OVER (PARTITION BY Code ORDER BY AsofDate DESC) AS RowNo

    FROM Test

    AND AsofDate < dateadd(dd,0, datediff(dd,0,getdate()))--Don't select current or future days

    AND DATEPART(weekday,AsofDate) NOT IN (7,1) --Don't select weekends

    ) sub

    WHERE RowNo <= 5

    is there a WHERE keyword missing after "FROM Test"?

  • Jim Russell-390299 (2/22/2010)


    is there a WHERE keyword missing after "FROM Test"?

    The first AND is a typo...is should be a WHERE.

  • Paul White (2/21/2010)


    lmu92 (2/21/2010)


    When using DATEPART(WEEKDAY, AsofDate) you need to make sure to have DATEFIRST set to 7. Otherwise you'll get incorrect results.

    Lutz,

    You sure about that?

    Monday is the first day of the week for me, hence the SET DATEFIRST 1 statement. When excluding the weekend, I look for DATEPART(WEEKDAY, AsofDate) BETWEEN 1 AND 5.

    I look forward to seeing your code to prove that this is incorrect and the DATEPART only works for a magic DATEFIRST value of 7. 😛

    Paul

    :blush: I missed the SET DATEFIRST part in your code and I also screwed up regarding DATEFIRST =7 and DATEPART(WEEKDAY, DateVal)=1 to be Monday... :blush:

    So, let me rephrase my totally wrong statement to:

    When using DATEPART(WEEKDAY, DateVal) function it is essential to control the value of both: DATEFIRST and LANGUAGE within the same batch.

    Example:

    SET DATEFIRST 1

    SET LANGUAGE us_english

    SELECT datepart(dw,'20091231'),datename(dw,'20091231')

    --Returns 4 , Thursday,

    --whereas

    SET DATEFIRST 1

    go

    SET LANGUAGE us_english

    SELECT datepart(dw,'20091231'),datename(dw,'20091231')

    --Returns 5 , Thursday.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • lmu92 (2/22/2010)


    :blush: I missed the SET DATEFIRST part in your code and I also screwed up regarding DATEFIRST =7 and DATEPART(WEEKDAY, DateVal)=1 to be Monday... :blush:

    So, let me rephrase my totally wrong statement to:

    When using DATEPART(WEEKDAY, DateVal) function it is essential to control the value of both: DATEFIRST and LANGUAGE within the same batch.

    Lutz,

    No worries. That example code would make a good Question of the Day on this site! The interaction of DATEFIRST and LANGUAGE is not at all intuitive, I would say.

    I'm not sure it is completely correct to say "it is essential to control the value of both: DATEFIRST and LANGUAGE within the same batch". SET LANGUAGE does set DATEFIRST implicitly, but it does not override it (regardless of the written order of the statements), as your code shows.

    If I were being picky (and it has been known to happen), I would say that it is essential to use SET DATEFIRST when using the WEEKDAY option of DATEPART. You could use LANGUAGE instead, but that is rather indirect and non-obvious, and could be overridden by a SET DATEFIRST statement anywhere in the same batch. Have I got that right?

    Paul

  • It's interesting to see that SET DATEFIRST actually takes precedence over SET LANGUAGE. I couldn't find any hint about it in BOL so far...

    Here's an example:

    If you change the language setting without prior setting of datefirst, the datefirst value will be changed, too. But as soon as you have a SET DATEFIRST statement in your code, any further change of the LANGUAGE value won't influence the DATEFIRST value anymore.

    So, you were right in terms of my statement being not correct as well as in terms of DATEFIRST overrides LANGUAGE. I could argue about ...overridden by a SET DATEFIRST statement anywhere in the same batch... but I think I know what you mean. 😉

    SET LANGUAGE us_english

    SELECT 'english',datepart(dw,'20091231'),datename(dw,'20091231'), @@datefirst

    SET LANGUAGE german

    SELECT 'german',datepart(dw,'20091231'),datename(dw,'20091231') , @@datefirst

    SET DATEFIRST 7

    SELECT 'german, but DATEFIRST = 7',datepart(dw,'20091231'),datename(dw,'20091231') , @@datefirst

    SET LANGUAGE us_english

    SELECT 'english',datepart(dw,'20091231'),datename(dw,'20091231') , @@datefirst

    SET LANGUAGE german

    SELECT 'german',datepart(dw,'20091231'),datename(dw,'20091231') , @@datefirst

    SET LANGUAGE us_english



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • lmu92 (2/23/2010)


    It's interesting to see that SET DATEFIRST actually takes precedence over SET LANGUAGE. I couldn't find any hint about it in BOL so far...

    Me either. It isn't explicitly documented as far as I can see.

    lmu92 (2/23/2010)So, you were right in terms of my statement being not correct as well as in terms of DATEFIRST overrides LANGUAGE. I could argue about ...overridden by a SET DATEFIRST statement anywhere in the same batch... but I think I know what you mean. 😉

    I do, so I'd like to change my previous statement, to see if we can agree on a reasonably full and accurate description of the behaviour:


    SET LANGUAGE implicitly sets the value of DATEFIRST, unless an explicit setting for DATEFIRST appears before it in the same batch.

    SET DATEFIRST overrides the default or implicit DATEFIRST value from the point SET DATEFIRST appears in the batch.

    All settings persist for the life of the connection (unless modified again!)


    Does that seem right to you? I'd like to leave the thread with as good a description of this as we can muster 🙂

    Paul

    edit: based on feedback from Lutz

  • I have written a post on DATEFIRT which can help you understand the usage of it..

    http://www.sqlservercentral.com/articles/DateFirst/69203/

    --Divya

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply