Checking if a value has been set/changed in week/month from audit table

  • Greetings,

    I'm struggling to find a solution to find when a value is set or changed to specific value (12 in the supplied example) within a time period (month for instance).

    Unfortunately the audit table, holds many different status values, with a new record written any time one the status column values change. It could also start as NULL (not set) then have a value set.

    My code example only shows one status column in the table and it having one value, then changing and the value repeating because another status column would have had it's value changed.

    I tried to do it in a single lag statement but couldn't, due to checking within the time period where value is 12 as it would never cater for value changes outside a month.

    Hopefully my code example explains the situation better than my words. The problem with the code is if the value has only been set within the month and set with a value immediately and not a null the code doesn't work as should return the earliest date if all values within the same month.

    The outer apply find the previous record based on date, then a comparison is made to check where the value is different.

    Can anyone please help?

    Create table #Demo
    (
    UserIdINT
    ,StatusValueINT
    ,UpdateDateDATETIME
    )

    Insert into #Demo
    Values (1, 12, '2019-09-01T07:00:00'),
    (1, 12, '2019-09-11T07:00:00'),
    (1, 13, '2019-08-05T07:00:00')

    Declare @startdatetime datetime = '2019-09-01T07:00:00'
    Declare @enddatetime datetime = '2019-10-01T07:00:00'


    Select d.UserId,
    d.StatusValue,
    d.UpdateDate,
    d2.StatusValue [PrevValue],
    d2.UpdateDate [PrevUpdateDate]
    from #Demo as d
    outer apply
    (
    select d2.statusvalue, d2.UpdateDate
    from #Demo as d2
    where d2.userid = d.userid
    and d2.UpdateDate < d.UpdateDate
    ) as d2
    where d.UpdateDate between '2019-09-01T00:00:00' and '2019-10-01T00:00:00'
    and d.StatusValue = 12
    and d2.StatusValue <> isnull(d.StatusValue, -1)

    Thanks

    • This topic was modified 5 years, 2 months ago by  robtyketto.
    • This topic was modified 5 years, 2 months ago by  robtyketto.
  • If I'm understanding correctly there's a table with many status columns per userid.  When any of the statuses change a new record is inserted which duplicates the unchanged statuses.  Fyi, BETWEEN is inclusive of the range endpoints which I don't think is what you're intending to do here.  If updatedate='2019-10-01T00:00:00' it would be counted as September.  It does seem like using a CROSS APPLY would be more compact, I messed around with it a little, but anyway here's some code that maybe is helpful:

    drop table if exists #demo;
    go
    create table #demo(
    useridint
    ,statusvalueint
    ,updatedatedatetime);
    go

    insert #demo values
    (1, 12, '2018-07-01T07:00:00'),
    (1, 16, '2018-08-01T07:00:00'),
    (1, 10, '2019-08-01T07:00:00'),
    (1, 11, '2019-09-01T07:00:00'),
    (1, 12, '2019-09-04T07:00:00'),
    (1, 12, '2019-09-08T07:00:00'),
    (1, 12, '2019-09-12T07:00:00'),
    (1, 12, '2019-09-11T07:00:00'),
    (1, 13, '2019-08-05T07:00:00'),
    (2, 13, '2019-08-05T07:00:00'),
    (2, 12, '2019-08-05T07:00:00');
    go

    with
    history_cte as(
    select
    userid,
    statusvalue,
    updatedate,
    lag(statusvalue) over(partition by userid order by updatedate) sv_lag
    from
    #demo),
    changes_cte as (
    select
    *
    from
    history_cte
    where
    statusvalue=12
    and (sv_lag<>12 or sv_lag is null)),
    min_mo_cte as (
    select
    userid,
    datepart(year, updatedate) yr_dt,
    datepart(month, updatedate) mo_dt,
    min(updatedate) min_dt
    from
    changes_cte
    group by
    userid,
    datepart(year, updatedate),
    datepart(month, updatedate))
    select * from min_mo_cte;

     

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Thanks, you understood the issue perfectly.

    I did, to simplify the issue not give full details of the audit table. It contains millions of rows and the primary key is an indentifier but has non-clustered index on the userid. Believe attempting a lag against every row would severly impact performance.

    I'm thinking of check where the status is that of 12 as in the example for a month period, this is done via a mechanism of a table that stores the id start and end values for the table on a daily basis. I did this originally but tried to use lag, but this would never work as it was only looking at values within the month period.

    The with the userids I can follow the method you suggested, unfortunatley not at work so don't have the dataset to test with.

    I'm essentially checking

    Where the value has been set or changed to 12 within than month. The previous value could be either NULL (not been set previously), <> 12 (Set previously) or 12 (Set previously).

    Sorry for being a little vague in first post, was trying to keep it as simple as I could.

    Thanks.

    • This reply was modified 5 years, 2 months ago by  robtyketto.
  • select d1.UserId,
    d1.StatusValue,
    d1.UpdateDate,
    d2.StatusValue PrevValue,
    d2.UpdateDate PrevUpdateDate
    from #demo d1
    outer apply(select top(1) *
    from #Demo d2
    where d2.StatusValue <> 12
    and d2.UpdateDate < d1.UpdateDate
    and d2.UserId = d1.UserId
    order by d2.UpdateDate desc) d2
    where d1.StatusValue = 12
  • The top(1) protects from row expansion due to the inequality on date.  That was the part I wasn't getting.

     

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • robtyketto wrote:

    I'm thinking of check where the status is that of 12 as in the example for a month period, this is done via a mechanism of a table that stores the id start and end values for the table on a daily basis. I did this originally but tried to use lag, but this would never work as it was only looking at values within the month period.

    The with the userids I can follow the method you suggested, unfortunatley not at work so don't have the dataset to test with.

    I'm essentially checking

    Where the value has been set or changed to 12 within than month. The previous value could be either NULL (not been set previously), <> 12 (Set previously) or 12 (Set previously).

    So you're going to maintain a summary table of users by month?  The query here would just be to back populate that table?  Maybe you could stick the "last" (or most recent) statusvalue in the same summary table.  Then piggyback on the change insert to update the summary table too.

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Declare @startdatetime datetime = '2019-09-01T07:00:00'
    Declare @enddatetime datetime = '2019-10-01T07:00:00'

    ;with cte as
    (
    Select d.UserId,
    d.StatusValue,
    d.UpdateDate,
    LAG(d.StatusValue, 1, NULL) OVER (PARTITION BY d.UserId ORDER BY d.UpdateDate ASC) [PrevValue],
    LAG(d.UpdateDate, 1 ,NULL) OVER (PARTITION BY d.UserId ORDER BY d.UpdateDate ASC) [PrevUpdateDate]
    from #Demo as d
    )
    Select x.UserId,
    x.StatusValue,
    x.UpdateDate,
    x.[PrevValue],
    x.[PrevUpdateDate]
    from cte x
    where x.StatusValue = 12
    and x.UpdateDate between @startdatetime and @enddatetime
  • Declare @startdatetime datetime = '2019-09-01T07:00:00'
    Declare @enddatetime datetime = '2019-10-01T07:00:00'

    ;with cte as
    (
    Select d.UserId,
    d.StatusValue,
    d.UpdateDate,
    ROW_NUMBER() OVER (PARTITION BY d.UserId ORDER BY d.UpdateDate ASC) [RowNum]
    from #Demo as d
    )
    Select a.UserId,
    a.StatusValue,
    a.UpdateDate,
    b.StatusValue [PrevValue],
    b.UpdateDate [PrevUpdateDate]
    from cte a
    left join cte b
    on b.UserId = a.UserId
    and b.RowNum = a.RowNum - 1
    where a.StatusValue = 12
    and a.UpdateDate between @startdatetime and @enddatetime
  • Thanks for all the replies, will have to review them all in more detail.

    Insert into #Demo

    Values (1, 12, '2019-09-01T07:00:00'),

    (1, 12, '2019-09-11T07:00:00'),

    (1, 12, '2019-08-05T07:00:00')

    The above should return 0 rows as the value 12 was first set in August and not September.

    I will also need to do more on ranking on the candidate set , so if the value was from 11, 12, 11 and back to 12 then details are for the last entry withint the date criteria the month of September. Sorry I keep adding to the original scenerio, slipped my mind at time of posting.

    Thanks

  • what if it was the following?

     

    Values (1, 12, '2019-09-01T07:00:00')
    , (1, 12, '2019-09-11T07:00:00')
    , (1, 11, '2019-08-15T07:00:00') -- one 11 in August
    , (1, 12, '2019-08-05T07:00:00')
  • Thanks for the replies, in the case above it's where the value has changed to a 12 or been set to a 12 within the month specified.

    So I would like to return the '2019-09-01T07:00:00' time, because it was initially 12 then changed to 11 and then changed to 12 again in the specifed month of September.  I wouldnt want the '2019-09-11T07:00:00' as the value hadn't changed but another value in the table had.

     

     

     

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply