Retrieve list of IDs for records meeting criteria

  • Here's the problem.

    I have a table that stores history for contacts. The important parts of the structure look like this:

    ID (int)

    Canned (varchar)

    RealTime (datetime)

    I need to get a list of IDs from this table where the last 5 records (as determined by RealTime) for each ID are set to a specific canned value ('fOff'). I can't figure out how to do this without using cursors (which I'd rather not do). The DB is SQL Server 2000 if that matters... Any help would be greatly appreciated.



    -Brandon

  • Not quite sure what you need, does this get you started?

     

    use Northwind

    select CustomerID

    from

     (select CustomerID, Count(OrderDate) AS Instances

     from Orders

     where OrderDate > '8/15/96'

     group by CustomerID)

     AS CustomerOccurrences

    where Instances >= 5



    Everett Wilson
    ewilson10@yahoo.com

  • Not quite. Assume the following for sample data: (RealTime has greater precision than this data. It's a datetime field. I'm just lazy.)

    ID | Canned | RealTime

    ----------------------

    1 | fOff | 9/14/04

    1 | fOff | 9/13/04

    1 | fOff | 9/12/04

    1 | fOff | 9/11/04

    1 | fOff | 9/10/04

    2 | fOff | 9/14/04

    2 | fOff | 9/13/04

    2 | bob | 9/12/04

    2 | fOff | 9/11/04

    2 | fOff | 9/10/04

    2 | fOff | 9/09/04

    The query I'm looking for would return ID 1 (because the last 5 records are all Canned = 'fOff') , but not ID 2 (since there's that pesky 'bob' record in the middle).

    The pseudo-code I have in mind looks something like this (this is geared for programming not SQL, sorry):

    ForEach ID In History

    Set = Select Top 5 Canned from History Where ID = ID

    ForEach Record In Set

    If Canned 'fOff' Then

    Process Next ID -- Don't add to list

    End If

    Next

    Add ID to List

    Next

    It would be straightforward to do with a cursor, but I'm trying to stay away from that solution.

    Thanks.



    -Brandon

  • how about identify 'bob' and then remove him from the final list, sorry in advance for any syntax errors.

    select *

    from xtable

      left join

     (select ID

     from xtable

     where Canned = 'bob')

     as xbob

        on xtable.id = xbob.id

    where xbob.id is null



    Everett Wilson
    ewilson10@yahoo.com

  • That still doesn't help with determining what the last 5 records for an ID are. The history table may have numerous records per ID (it's around a million rows total size) and there may be many different Canned values (~60 distinct values) in the records for each ID. I only want the IDs where the last 5 records (as determined by Realtime) in the history table where Canned = 'fOff'. If there are any other values in the last 5 records for an ID, I don't want that ID in my list.



    -Brandon

  • If each ID has a consecutive dates in the realtime column, you could do something like:

    select id                                                                                                                                                 from (select id, realtime from history where realtime>dateadd(day,-5,getdate()) and canned='fOff') a                    group by id having count(*)=5

    Otherwise, you could do something like the following:

    select h.id, canned, h.realtime

    into #a

    from history h, (select id, max(realtime) realtime from history group by id) a

    where h.id=a.id

    and h.realtime=a.realtime

    select h.id, canned, h.realtime

    into #b

    from history h, (select b.id, max(b.realtime) realtime from history b, #a a where a.id=b.id and a.realtime>b.realtime group by b.id) a

    where h.id=a.id

    and h.realtime=a.realtime

    select h.id, canned, h.realtime

    into #c

    from history h, (select b.id, max(b.realtime) realtime from history b, #b a where a.id=b.id and a.realtime>b.realtime group by b.id) a

    where h.id=a.id

    and h.realtime=a.realtime

    select h.id, canned, h.realtime

    into #d

    from history h, (select b.id, max(b.realtime) realtime from history b, #c a where a.id=b.id and a.realtime>b.realtime group by b.id) a

    where h.id=a.id

    and h.realtime=a.realtime

    select h.id, canned, h.realtime

    into #e

    from history h, (select b.id, max(b.realtime) realtime from history b, #d a where a.id=b.id and a.realtime>b.realtime group by b.id) a

    where h.id=a.id

    and h.realtime=a.realtime

    select id

    from (

    select id, canned from #a

    union

    select id, canned from #b

    union

    select id, canned from #c

    union

    select id, canned from #d

    union

    select id, canned from #e

    ) a

    where canned='fOff'

    group by id having count(*)=5

    Hope this helps.

    Brian

  • Or, you could add a trigger to the history table that will maintain a table for the last five inserted records for each id.

    create table test (ID int, canned varchar(10), realtime datetime)

    create table test5 (ID int, canned varchar(10), realtime datetime)

    ALTER trigger test_last5 on test

    for insert

    as

    INSERT INTO test5 (id, canned, realtime)

    select id, canned, realtime from inserted

    IF (select count(*) from inserted i, test5 a where i.id=a.id)>5

    begin

    delete a

    from test5 a, (select b.id, min(b.realtime) realtime from inserted i, test5 b where i.id=b.id group by b.id) v

    where a.id=v.id

    and a.realtime=v.realtime

    end

    However, this approach won't work if you delete from the history table.

    Brian

  • Brian,

    Thanks for the help. I ended up using your first solution. It's a little slow (little over a minute to run), but since it only needs to run once a week or so, not too big of a problem. I'm not sure if anything is ever deleted from the history table, I don't think so, but I wouldn't swear to it.



    -Brandon

  • I would be interested to see how the performance of this solution compares; it also does not rely on there existing consecutive dates for each ID.

    I am going to recast the requirements to explain my query. The IDs we want are those IDs with Canned=fOff for the records with the five greatest dates. Another way of saying this is that we want the IDs where the highest non-fOff Canned date is not one of the highest five dates for that ID. So in T-SQL:

    select top 5 RealTime 
    from History 
    where ID = @something 
    order by RealTime desc

    is the expression that gives us the highest five dates for an ID;

    select max(RealTime) 
    from History 
    where ID = @something and Canned <> 'fOff'

    gives the highest non-fOff date (or NULL if there are no non-fOff dates).

    Putting it all together we get

    select ID from History H
    where 
    (select max(RealTime) from History H2 where H2.ID=H.ID and Canned<>'fOff')
    not in
    (select top 5 RealTime from History H3 where H3.ID=H.ID order by RealTime desc)

    Note that as well as not depending on any pattern to the RealTimes, this solution also allows the '5' criterion to ary easily, should business requirements change.

    You will have to check how the performance is

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply