Query Help

  • I'm sure one of you gurus can help me figure this out.  I just can't get it.

    I've got a table that looks like the following:

    recnbrmynbrmydateempid
    101/1/2001bob
    2NULL1/2/2001bob
    3NULL1/7/2001bob
    4NULL1/8/2001bob
    5NULL1/21/2001bob
    608/4/2003cindy
    7NULL8/5/2003cindy
    8NULL9/1/2003cindy
    9NULL9/8/2003cindy
    10NULL9/15/2003cindy
    1103/1/2004terry
    12NULL3/3/2004terry
    13NULL3/4/2004terry
    14NULL3/4/2004terry
    15NULL3/20/2004terry

    I want to update the mynbr field to equal the difference in the number of days between the current mydate and the previous record's mydate plus the previous record's mynbr value.  I want these numbers to be specific to each empid.  The resulting table would look like the following:

    recnbrmynbrmydateempid
    101/1/2001bob
    211/2/2001bob
    361/7/2001bob
    471/8/2001bob
    5201/21/2001bob
    608/4/2003cindy
    718/5/2003cindy
    8289/1/2003cindy
    9359/8/2003cindy
    10429/15/2003cindy
    1103/1/2004terry
    1223/3/2004terry
    1333/4/2004terry
    1443/5/2004terry
    15193/20/2004

    terry

    I'm trying to find the fastest way to do this because I may have 100,000 records or more.  This is not a lot of records but it becomes a lot if you have to iterate through all of the records.  I'm trying to find a way to do this with one query, without any loops.

    Anybody got any ideas?

    Thanks,

    hawg

    ----------------------------------------------------------01010011010100010100110000100000010100110110010101110010011101100110010101110010001000000101001001101111011000110110101101110011

  • Someone may know better than I, but I think you are looking at a stored procedure for that one. You might be surprised how fast it could run.

    You would need a cursor to get the prior record (based off recnbr I assume) to be sure no records had been dropped leaving you with holes in your sequence.

    You could also consider writing the stored procedure to do one row and execute as a trigger whenever a row is inserted, updated, or deleted in your table. You would need to operate on all those events since any change in sequence would affect your whole outcome. This seems true even if your sequence was based on the date instead of recnumber.

    However, I'm guessing there might be a better way to accomplish your end goal whatever it is.

  • I don't usually recommend this, but I'd use a cursor for this. Assuming that recno is an actual field in your table, use that and run a loop. Actually you don't need to do that, but here's  a couple things to consider.

    1. Don't try this in raw sql as one batch. Doing this many records causes a lot of transactions, locking, and may take a long time to complete, or rollback. http://www.sqlservercentral.com/columnists/sjones/batching.asp. Instead, use batches of rows or a cursor and multiple transactions.

    2. I appears you can cursor on the empid. I might run a

    select * from table order by empid, mydate

    then keep track of the empid and as long as it's the same, calculate the time as

    select datediff( d, a.mydate, b.mydate)

     from mytable a

     inner join mytable b

      where a.empid = b.empid

     and b.mydate = ( select max( c.mydate) from mytable c where c.empid = a.empid and c.mydate < a.mydate)

    use that for the update.

    3. Test on small batches, not the whole table or a copy of it. And be sure to test the boundary conidtions, only one row for an emp, etc.

    Good luck

     

  • Am I missing something here?  

    what is wrong with this:

    Update o Set mynbr = datediff(d,(select min(mydate) from testt t where t.empid = o.Empid ),o.mydate)

    from testt o

    Tests:

    create table testt (recnbr int , mynbr int, mydate datetime, empid varchar(50))

    go

    insert into testt (recnbr, mynbr,mydate,empid) Values (1, 0 , '1/1/2001', 'bob')

    insert into testt (recnbr, mynbr,mydate,empid) Values (2, NULL , '1/2/2001', 'bob') 

    insert into testt (recnbr, mynbr,mydate,empid) Values (3, NULL , '1/7/2001', 'bob') 

    insert into testt (recnbr, mynbr,mydate,empid) Values (4, NULL , '1/8/2001', 'bob') 

    insert into testt (recnbr, mynbr,mydate,empid) Values (5, NULL , '1/21/2001', 'bob') 

    insert into testt (recnbr, mynbr,mydate,empid) Values (6, 0 , '8/4/2003', 'cindy') 

    insert into testt (recnbr, mynbr,mydate,empid) Values (7, NULL , '8/5/2003', 'cindy') 

    insert into testt (recnbr, mynbr,mydate,empid) Values (8, NULL , '9/1/2003', 'cindy') 

    insert into testt (recnbr, mynbr,mydate,empid) Values (9, NULL , '9/8/2003', 'cindy') 

    insert into testt (recnbr, mynbr,mydate,empid) Values (10, NULL, '9/15/2003', 'cindy') 

    insert into testt (recnbr, mynbr,mydate,empid) Values (11, 0, '3/1/2004', 'terry') 

    insert into testt (recnbr, mynbr,mydate,empid) Values (12, NULL, ' 3/3/2004', 'terry') 

    insert into testt (recnbr, mynbr,mydate,empid) Values (13, NULL, ' 3/4/2004', 'terry') 

    insert into testt (recnbr, mynbr,mydate,empid) Values (14, NULL, ' 3/4/2004', 'terry') 

    insert into testt (recnbr, mynbr,mydate,empid) Values (15, NULL, '3/20/2004', 'terry')

     

    Update o Set mynbr = datediff(d,(select min(mydate) from testt t where t.empid = o.Empid ),o.mydate)

    from testt o

    select * from testt

    recnbr mynbr       mydate                                                         empid

    ------ -----        -----------------------                                  ------

    1           0           2001-01-01 00:00:00.000                                bob

    2           1           2001-01-02 00:00:00.000                                bob

    3           6           2001-01-07 00:00:00.000                                bob

    4           7           2001-01-08 00:00:00.000                                bob

    5           20          2001-01-21 00:00:00.000                                bob

    6           0           2003-08-04 00:00:00.000                                cindy

    7           1           2003-08-05 00:00:00.000                                cindy

    8           28          2003-09-01 00:00:00.000                                cindy

    9           35          2003-09-08 00:00:00.000                                cindy

    10          42          2003-09-15 00:00:00.000                                cindy

    11          0           2004-03-01 00:00:00.000                                terry

    12          2           2004-03-03 00:00:00.000                                terry

    13          3           2004-03-04 00:00:00.000                                terry

    14          3           2004-03-04 00:00:00.000                                terry

    15          19          2004-03-20 00:00:00.000                                terry

    (15 row(s) affected)

     


    * Noel

  • noeld, your solution has produced the correct answer this time but is assuming properties of the data that we don't know for sure - with the sample data shown it is true that mynbr ends up as the number of days from the employee's first date to mydate, but that is not the rule we are asked to follow.

    (Original poster might want to clarify here! - if noeld's simpler statement about your data is true then his solution will be fine)

    A query is possible that implements the rule as stated (update the mynbr field to equal the difference in the number of days between the current mydate and the previous record's mydate plus the previous record's mynbr value), it uses a correlated subquery to find, for each row, the previous (in date order) row for that emp:

    update mytable T1
    set mynbr = 
    -- number of days since previous mydate for this emp
    -- previous mydate = highest mydate that is strictly less than this mydate
    datediff(d, (select max(mydate) from mytable T2 where T2.empid = T1.empid and T2.mydate < T1.mydate), mydate)
    +
    -- add on mynbr from previous record (in mydate order)
    (select mynbr from mytable T3 where T3.empid = T1.empid
    and T3.mydate = (select max(mydate) from mytable T4 where T4.empid = T1.empid and T4.mydate < T1.mydate) )

    (I like to alias each reference to the same table differnetly, it's a personal foible, don't know how necessary it is ) Notice how we identify the 'previous' record twice, in the same way each time - first to get its mydate, then to get its mynbr.

    Oh, and I leave as an exercise for the reader to work out what happens in boundary cases...

     

  • AKM,

    What the poster said was NOT what the numbers example was showing. I followed the numbers example and the post should have been the difference between the FIRST date of the employee entry and the current date So my example Still holds. If you run your query you are NOT going to get the same results than what the Numbers example showed, therefore is NOT a coincidence either   

    Hope is clear. 


    * Noel

  • Noeld's solution seems to be fine - there is just one question about the word "previous" in original post.

    In case the records are not always ordered by date ascending, and the "previous" means "that with nearest lower value of recnbr", Noeld's query wouldn't work. But I can hardly imagine why anyone should want to (obviously) count a number of days from first occurrence, and then mess it all up by adding some value from some "previous" record - so it is 99% OK. I hope hawg willl confirm that we understood everything well, or elaborate a bit on what the query means and how it should work.

    cheers, Vladan

  • Thank you all for your help and my apologies for not being clear.  Thank you especially, noeld.  Your answer does work in my case. 

    I had just been so close to the problem for so long that I got tunnel vision and could not see outside of the solution I was trying to create.

    In my data, it is a fact that the data will come in by empid and in mydate, ascending, order.  The whole purpose was to get the number of days from the first date for each particular employee.  I wasn't thinking in that way and I messed everyone up by giving an insight into how I was trying to solve the problem.

    The recnbr field is really of no consequence now.  It was there from an earlier attempt to solve this problem.

    Again, my apologies for the confusion but I appreciate everyone's help and time.  I know I can always get great help from this community and I'll continue to do my best to provide help when I can.

    Thanks again!

    ----------------------------------------------------------01010011010100010100110000100000010100110110010101110010011101100110010101110010001000000101001001101111011000110110101101110011

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply