grouping without aggregates

  • Hello,

    I have a table of dates, accounts and values.  I am trying to get the earliest occurance of each account and its value.

    data:

    12/31/2002,PARENT,'RECORDCODE1',ACCT1,10000

    01/31/2003,PARENT,'RECORDCODE1',ACCT1,12000

    02/28/2003,PARENT,'RECORDCODE1',ACCT1,13000

    03/30/2003,PARENT,'RECORDCODE1',ACCT1,14000

    04/30/2003,PARENT,'RECORDCODE1',ACCT1,15000

    05/30/2003,PARENT,'RECORDCODE1',ACCT1,10000

    07/30/2005,PARENT,'RECORDCODE2',ACCT2,10000

    08/31/2005,PARENT,'RECORDCODE2',ACCT2,12000

    09/28/2005,PARENT,'RECORDCODE2',ACCT2,13000

    10/30/2005,PARENT,'RECORDCODE2',ACCT2,14000

    10/30/2005,PARENT,'RECORDCODE2',ACCT2,15000

    11/30/2005,PARENT,'RECORDCODE2',ACCT2,10000

    etc,etc...

    So I just cant seem to figure out how to get the 12/31/2002, ACCT1 record and the 07/30/2005, ACCT2 record.

    So there are 10517 total records in the system.  If I use this query:

    SELECT MIN(DATE)

     ,PARENT

     ,RECORDCODE

     ,MIN(QUANTITY)

    FROM TABLE

    GROUP BY RECORDCODE,PARENT          --,QUANTITY

     ORDER BY DATE,RECORDCODE

    I am able to get the 406 unique occurances of the EARLIEST dates.  The problem is that I have to select the aggregate QUANTITY which skews the numbers.  If I want to get the real number I have to remove the MIN() but then I get all 10k records.

    What am I missing on this?!?

    thanks,

    Chris

     

  • Hi Chris

    This should do the trick.

    CREATE TABLE #MSG369550 (AccountDate DATETIME, parent CHAR(6), RecordCode CHAR(11), AccountName CHAR(5), Quantity INT)

    INSERT INTO #MSG369550 VALUES ('12/31/2002','PARENT','RECORDCODE1','ACCT1',10000)

    INSERT INTO #MSG369550 VALUES ('01/31/2003','PARENT','RECORDCODE1','ACCT1',12000)

    INSERT INTO #MSG369550 VALUES ('02/28/2003','PARENT','RECORDCODE1','ACCT1',13000)

    INSERT INTO #MSG369550 VALUES ('03/30/2003','PARENT','RECORDCODE1','ACCT1',14000)

    INSERT INTO #MSG369550 VALUES ('04/30/2003','PARENT','RECORDCODE1','ACCT1',15000)

    INSERT INTO #MSG369550 VALUES ('05/30/2003','PARENT','RECORDCODE1','ACCT1',10000)

    INSERT INTO #MSG369550 VALUES ('07/30/2005','PARENT','RECORDCODE2','ACCT2',10000)

    INSERT INTO #MSG369550 VALUES ('08/31/2005','PARENT','RECORDCODE2','ACCT2',12000)

    INSERT INTO #MSG369550 VALUES ('09/28/2005','PARENT','RECORDCODE2','ACCT2',13000)

    INSERT INTO #MSG369550 VALUES ('10/30/2005','PARENT','RECORDCODE2','ACCT2',14000)

    INSERT INTO #MSG369550 VALUES ('10/30/2005','PARENT','RECORDCODE2','ACCT2',15000)

    INSERT INTO #MSG369550 VALUES ('11/30/2005','PARENT','RECORDCODE2','ACCT2',10000)

    SELECT a.*

    FROM #MSG369550 a

    INNER JOIN (

     SELECT MIN(AccountDate) AS MinAccountDate, AccountName

     FROM #MSG369550

     GROUP BY AccountName) d

    ON d.AccountName = a.AccountName and d.MinAccountDate = a.AccountDate

    DROP TABLE #MSG369550

     

    Cheers

    ChrisM

     

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Chris,

    Well, your solution was certainly more efficient than mine using

    SELECT Row_NUMBER() OVER (Partition BY CUSIP ORDER BY EOM_DATE) AS OrderRank

    stuffing the results into a @temp and then selecting the items where OrderRank=1

    I defintally like your solution.  What bugs me is that I knew that I could probably do it with a derived table/join and just couldnt figure the darn thing.

    thanks,

    Chris

     

     

  • Just watch out for performance on this one.

    To process the join to itself by date, SQL is going to probably use a cursor internally and this may cause you some performance issues if your table gets large enough.

    There are lots of ways to do this - and the one posted is a pretty good way - just check your execution plan and performance to make sure this will work for your data set.

  • here's another way using a correlated subquery in the where clause

    (see Chris's post above for the table structure and data)

    select * from #MSG369550 a

    where accountdate = (select min(accountdate) from #MSG369550

       where a.accountname  = accountname)

  • Chris,

    Turn that SELECT into a CTE instead of using a table variable...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff

    We're on SQL2k here. Can't wait to play with CTE on the next contract though.

    Regarding matching on dates - good point. If an identity column is available, the same query would require very little amendment to join on id column instead of date.

     

    Cheers

     

    ChrisM

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Ok, Chris... but the OP did post some SQL Server 2k5 code.

    {Edit}... sorry misread that... YOU have SQL Server 2k... I get it...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Not to worry... no internal cursor...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply