Query help for distinct

  • I have a list of dates and account strings over a 3 year period. Here's a brief example:

    select effective_begin_date, account

    from tmp_backoffice

    order by effective_begin_date;

    effective_begin_dateaccount

    2007-02-07 10:52:00.0001-086-50-30114-2563

    2007-02-07 10:53:00.0001-086-50-50460-1394

    2007-02-07 10:53:00.0001-086-50-50460-1305

    2007-02-07 10:54:00.0001-086-50-50460-1394

    2007-02-08 11:18:00.0001-086-50-50460-1394

    2007-02-13 10:31:00.0001-086-10-95110-0130

    I want a query that extracts only the dates and accounts when the account has changed from the most previous date.

    So from the example above, the query would return:

    effective_begin_dateaccount

    2007-02-07 10:52:00.0001-086-50-30114-2563

    2007-02-07 10:53:00.0001-086-50-50460-1394

    2007-02-07 10:53:00.0001-086-50-50460-1305

    2007-02-07 10:54:00.0001-086-50-50460-1394

    2007-02-13 10:31:00.0001-086-10-95110-0130

    I've tried several self joins and subqueries, but right now I am so confused, I think if I show what I've tried I would just embarass myself....

    Thanks for your help!

    cpoluru

  • Please, please the next time you post include the table definition, sample input data in a format easy to use. To learn how to do this quickly and easily click on the first link in my signature block

    That said Is this the results that you require?

    CREATE TABLE #T(effective_begin_date DATETIME,Account VARCHAR(20))

    INSERT INTO #T

    SELECT '2007-02-07 10:52:00.000', '1-086-50-30114-2563' UNION ALL

    SELECT '2007-02-07 10:53:00.000', '1-086-50-50460-1394' UNION ALL

    SELECT '2007-02-07 10:53:00.000', '1-086-50-50460-1305' UNION ALL

    SELECT '2007-02-07 10:54:00.000', '1-086-50-50460-1394' UNION ALL

    SELECT '2007-02-08 11:18:00.000', '1-086-50-50460-1394' UNION ALL

    SELECT '2007-02-13 10:31:00.000', '1-086-10-95110-0130'

    ;with numbered as(SELECT rowno=row_number() over

    (partition by Account order by effective_begin_date ),Account,effective_begin_date from #T)

    select * from numbered WHERE rowno =1

    Results:

    rowno Account effective_begin_date

    11-086-10-95110-01302007-02-13 10:31:00.000

    11-086-50-30114-25632007-02-07 10:52:00.000

    11-086-50-50460-13052007-02-07 10:53:00.000

    11-086-50-50460-13942007-02-07 10:53:00.000

    To give you a better grasp of what is happening, execute the SELECT * FROM numbered (notice without the WHERE Clause and you have:

    rownoAccount effective_begin_date

    11-086-10-95110-01302007-02-13 10:31:00.000

    11-086-50-30114-25632007-02-07 10:52:00.000

    11-086-50-50460-13052007-02-07 10:53:00.000

    11-086-50-50460-13942007-02-07 10:53:00.000

    21-086-50-50460-13942007-02-07 10:54:00.000

    31-086-50-50460-13942007-02-08 11:18:00.000

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • Thanks so much for your reply. I'm sorry the initial data was not well delivered, I will read your link and improve my post for next time.... That said, can you please send me the link you want me to read and let me know what you mean by the signature block (patience, please, I'm trying :-))

    Unfortunately the results are not exactly what I need. I need:

    2007-02-07 10:52:00.000 1-086-50-30114-2563

    2007-02-07 10:53:00.000 1-086-50-50460-1394

    2007-02-07 10:53:00.000 1-086-50-50460-1305

    2007-02-07 10:54:00.000 1-086-50-50460-1394

    2007-02-13 10:31:00.000 1-086-10-95110-0130

    Note that the account 1-086-50-50460-1394 is listed twice because it changed to 1-086-50-50460-1305 in between 2007-02-07 10:53:00 and 2007-02-07 10:54:00. That is the requirement -- any time an account changes, I need the date and account, regardless of whether that account has already occurred sometime in the past.

    cpoluru

  • A signature is a setting on someone's forum membership that inserts text at the bottom of all their posts.

    He was linking you to this article: http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]

  • Here is a solution with a self join. It could probably be tewaked for better performance and I wouldnt be surprised if there's a better way, but it gets the job done:

    CREATE TABLE #T(effective_begin_date DATETIME,Account VARCHAR(20))

    INSERT INTO #T

    SELECT '2007-02-07 10:52:00.000', '1-086-50-30114-2563' UNION ALL

    SELECT '2007-02-07 10:53:00.000', '1-086-50-50460-1394' UNION ALL

    SELECT '2007-02-07 10:53:00.000', '1-086-50-50460-1305' UNION ALL

    SELECT '2007-02-07 10:54:00.000', '1-086-50-50460-1394' UNION ALL

    SELECT '2007-02-08 11:18:00.000', '1-086-50-50460-1394' UNION ALL

    SELECT '2007-02-13 10:31:00.000', '1-086-10-95110-0130'

    ;

    WITH numbered (row_num,effective_begin_date,Account)

    AS (

    SELECT row_num=ROW_NUMBER() over (ORDER BY effective_begin_date),

    effective_begin_date,

    Account

    FROM #T

    )

    SELECT t1.effective_begin_date, t1.account

    FROM numbered t1

    LEFT OUTER JOIN numbered t2 ON t1.row_num = t2.row_num+1 AND t1.Account = t2.Account

    WHERE t2.Account IS NULL

    DROP TABLE #T

    So you get a numbered version of the table to make the join less mess, self outer join each column to its previous column by account number, and all the rows where the joined table is NULL are either the first record, or a record with the same account as the preceding.

  • That is great. Thank you so much. It gets the job done marvelously.

    Incidently, I'm not worried about performance. The main issue is the correct data as we get through a conversion.

    Thanks again to both you and Ron for the help!

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply