Get Unique Rows

  • Hi

    I have a table that has a lot of data in it, the data is updated very frequently. In short the table looks something like below,

    ID Identity(1,1)

    GroupID int

    Alias varchar(50)

    DateStamp DateTime

    what would be the best way to get only the latest row for each GroupID, I thought of putting in a IsActive column and when a new row is added it switches the currently active row to IsActive false. but at the rate that the data comes in this seems to be to much overhead.

    Any suggestions would be great 😀


    The Fastest Methods aren't always the Quickest Methods

  • Can you explain on what you mean by getting the lastest row? Are records updated in this table too or are they strictly insert only records? If/When records are updated, is the DateStamp field updated as well? When records are inserted, is the DateStamp field populated? What indexes are on the table?

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • by latest row I mean that it will look at the DateStamp and get only the row with the newest DateStamp for each group


    The Fastest Methods aren't always the Quickest Methods

  • Then you can

    select GroupID, max(DateStamp)

    from table

    group by GroupID

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • will that give me the exact row or just the latest DateStamp? sorry I'm not the best at explaining things. Would the below work well?

    SELECT GroupID,Name, Position

    FROM MyTable

    WHERE DateStamp = MAX(DateStamp)

    GROUP BY GroupID,Name, Position

    there is going to be a very large amount of data in this table always being updated, so will this be slow? not sure if the above will even work but my idea is there I think


    The Fastest Methods aren't always the Quickest Methods

  • For better answers on performance questions, click on the following...

    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • reading through that link now 🙂


    The Fastest Methods aren't always the Quickest Methods

  • If you want to see all of the columns, I think a ranking function would be better.

    Something like this:

    with ranked

    as

    (

    select *, RANK () over (partition by groupid order by datestamp desc) as rank

    from @t

    )

    select *

    from ranked

    where RANK = 1

    Note that this SQL will return more than 1 record for a group if the records have exactly the same value in the datestamp column.

  • thank you 🙂


    The Fastest Methods aren't always the Quickest Methods

  • This would get you the whole row.

    select MyTable.*

    from MyTable

    join

    (

    select GroupID, max(DateStamp) as DateStamp

    from MyTable

    group by GroupID

    ) groupList on groupList.GroupID = MyTable.GroupID and groupList.DateStamp = MyTable.DateStamp

    order by MyTable.GroupID

    Depending on the size of this table it could get slow. That of course is an animal all by itself.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • BinaryDigit (5/19/2011)


    thank you 🙂

    I still don't see anything that won't run a scan of some sort... still waiting on table script and sample data.

  • Great point about not seeing anything yet that won't produce a scan. I started hammering away and tried some indexes when it hit me that there isn't any filter criteria (without which I get a scan!)

  • heb1014 (5/19/2011)


    Great point about not seeing anything yet that won't produce a scan. I started hammering away and tried some indexes when it hit me that there isn't any filter criteria (without which I get a scan!)

    The best I could do at this point is put a covering index on group + date desc. Still would do a scan + lookup but if the table is large and tones of rows that could help.

  • Yeah, that's exactly what I tried too!

    create index idx_t1 on t (groupid, datestamp desc) include (alias)

    Maybe if there was some criteria that said beyond a certain time frame we wouldn't consider a group. Then maybe a filtered index or partitioned table could help.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply