Query Help

Question

Post reply

Query Help

hawg

SSCertifiable

Points: 6044

July 26, 2004 at 7:21 am

#163020

I'm sure one of you gurus can help me figure this out. I just can't get it.

I've got a table that looks like the following:

recnbr	mynbr	mydate	empid
1	0	1/1/2001	bob
2	NULL	1/2/2001	bob
3	NULL	1/7/2001	bob
4	NULL	1/8/2001	bob
5	NULL	1/21/2001	bob
6	0	8/4/2003	cindy
7	NULL	8/5/2003	cindy
8	NULL	9/1/2003	cindy
9	NULL	9/8/2003	cindy
10	NULL	9/15/2003	cindy
11	0	3/1/2004	terry
12	NULL	3/3/2004	terry
13	NULL	3/4/2004	terry
14	NULL	3/4/2004	terry
15	NULL	3/20/2004	terry

I want to update the mynbr field to equal the difference in the number of days between the current mydate and the previous record's mydate plus the previous record's mynbr value. I want these numbers to be specific to each empid. The resulting table would look like the following:

recnbr	mynbr	mydate	empid
1	0	1/1/2001	bob
2	1	1/2/2001	bob
3	6	1/7/2001	bob
4	7	1/8/2001	bob
5	20	1/21/2001	bob
6	0	8/4/2003	cindy
7	1	8/5/2003	cindy
8	28	9/1/2003	cindy
9	35	9/8/2003	cindy
10	42	9/15/2003	cindy
11	0	3/1/2004	terry
12	2	3/3/2004	terry
13	3	3/4/2004	terry
14	4	3/5/2004	terry
15	19	3/20/2004	terry

I'm trying to find the fastest way to do this because I may have 100,000 records or more. This is not a lot of records but it becomes a lot if you have to iterate through all of the records. I'm trying to find a way to do this with one query, without any loops.

Anybody got any ideas?

Thanks,

hawg

----------------------------------------------------------01010011010100010100110000100000010100110110010101110010011101100110010101110010001000000101001001101111011000110110101101110011

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Aaron Myers Ten Centuries Points: 1139 More actions · Answer 1

Someone may know better than I, but I think you are looking at a stored procedure for that one. You might be surprised how fast it could run.

You would need a cursor to get the prior record (based off recnbr I assume) to be sure no records had been dropped leaving you with holes in your sequence.

You could also consider writing the stored procedure to do one row and execute as a trigger whenever a row is inserted, updated, or deleted in your table. You would need to operate on all those events since any change in sequence would affect your whole outcome. This seems true even if your sequence was based on the date instead of recnumber.

However, I'm guessing there might be a better way to accomplish your end goal whatever it is.

Steve Jones - SSC Editor SSC Guru Points: 736494 More actions · Answer 2

I don't usually recommend this, but I'd use a cursor for this. Assuming that recno is an actual field in your table, use that and run a loop. Actually you don't need to do that, but here's a couple things to consider.

1. Don't try this in raw sql as one batch. Doing this many records causes a lot of transactions, locking, and may take a long time to complete, or rollback. http://www.sqlservercentral.com/columnists/sjones/batching.asp. Instead, use batches of rows or a cursor and multiple transactions.

2. I appears you can cursor on the empid. I might run a

select * from table order by empid, mydate

then keep track of the empid and as long as it's the same, calculate the time as

select datediff( d, a.mydate, b.mydate)

from mytable a

inner join mytable b

where a.empid = b.empid

and b.mydate = ( select max( c.mydate) from mytable c where c.empid = a.empid and c.mydate < a.mydate)

use that for the update.

3. Test on small batches, not the whole table or a copy of it. And be sure to test the boundary conidtions, only one row for an emp, etc.

Good luck

noeld SSC Guru Points: 96590 More actions · Answer 3

Am I missing something here?

what is wrong with this:

Update o Set mynbr = datediff(d,(select min(mydate) from testt t where t.empid = o.Empid ),o.mydate)

from testt o

Tests:

create table testt (recnbr int , mynbr int, mydate datetime, empid varchar(50))

go

insert into testt (recnbr, mynbr,mydate,empid) Values (1, 0 , '1/1/2001', 'bob')

insert into testt (recnbr, mynbr,mydate,empid) Values (2, NULL , '1/2/2001', 'bob')

insert into testt (recnbr, mynbr,mydate,empid) Values (3, NULL , '1/7/2001', 'bob')

insert into testt (recnbr, mynbr,mydate,empid) Values (4, NULL , '1/8/2001', 'bob')

insert into testt (recnbr, mynbr,mydate,empid) Values (5, NULL , '1/21/2001', 'bob')

insert into testt (recnbr, mynbr,mydate,empid) Values (6, 0 , '8/4/2003', 'cindy')

insert into testt (recnbr, mynbr,mydate,empid) Values (7, NULL , '8/5/2003', 'cindy')

insert into testt (recnbr, mynbr,mydate,empid) Values (8, NULL , '9/1/2003', 'cindy')

insert into testt (recnbr, mynbr,mydate,empid) Values (9, NULL , '9/8/2003', 'cindy')

insert into testt (recnbr, mynbr,mydate,empid) Values (10, NULL, '9/15/2003', 'cindy')

insert into testt (recnbr, mynbr,mydate,empid) Values (11, 0, '3/1/2004', 'terry')

insert into testt (recnbr, mynbr,mydate,empid) Values (12, NULL, ' 3/3/2004', 'terry')

insert into testt (recnbr, mynbr,mydate,empid) Values (13, NULL, ' 3/4/2004', 'terry')

insert into testt (recnbr, mynbr,mydate,empid) Values (14, NULL, ' 3/4/2004', 'terry')

insert into testt (recnbr, mynbr,mydate,empid) Values (15, NULL, '3/20/2004', 'terry')

Update o Set mynbr = datediff(d,(select min(mydate) from testt t where t.empid = o.Empid ),o.mydate)

from testt o

select * from testt

recnbr mynbr mydate empid

------ ----- ----------------------- ------

1 0 2001-01-01 00:00:00.000 bob

2 1 2001-01-02 00:00:00.000 bob

3 6 2001-01-07 00:00:00.000 bob

4 7 2001-01-08 00:00:00.000 bob

5 20 2001-01-21 00:00:00.000 bob

6 0 2003-08-04 00:00:00.000 cindy

7 1 2003-08-05 00:00:00.000 cindy

8 28 2003-09-01 00:00:00.000 cindy

9 35 2003-09-08 00:00:00.000 cindy

10 42 2003-09-15 00:00:00.000 cindy

11 0 2004-03-01 00:00:00.000 terry

12 2 2004-03-03 00:00:00.000 terry

13 3 2004-03-04 00:00:00.000 terry

14 3 2004-03-04 00:00:00.000 terry

15 19 2004-03-20 00:00:00.000 terry

(15 row(s) affected)

* Noel

AKM SSCommitted Points: 1538 More actions · Answer 4

noeld, your solution has produced the correct answer this time but is assuming properties of the data that we don't know for sure - with the sample data shown it is true that mynbr ends up as the number of days from the employee's first date to mydate, but that is not the rule we are asked to follow.

(Original poster might want to clarify here! - if noeld's simpler statement about your data is true then his solution will be fine)

A query is possible that implements the rule as stated (update the mynbr field to equal the difference in the number of days between the current mydate and the previous record's mydate plus the previous record's mynbr value), it uses a correlated subquery to find, for each row, the previous (in date order) row for that emp:

update mytable T1

set mynbr =

-- number of days since previous mydate for this emp

-- previous mydate = highest mydate that is strictly less than this mydate

datediff(d, (select max(mydate) from mytable T2 where T2.empid = T1.empid and T2.mydate < T1.mydate), mydate)

-- add on mynbr from previous record (in mydate order)

(select mynbr from mytable T3 where T3.empid = T1.empid

and T3.mydate = (select max(mydate) from mytable T4 where T4.empid = T1.empid and T4.mydate < T1.mydate) )

(I like to alias each reference to the same table differnetly, it's a personal foible, don't know how necessary it is ) Notice how we identify the 'previous' record twice, in the same way each time - first to get its mydate, then to get its mynbr.

Oh, and I leave as an exercise for the reader to work out what happens in boundary cases...

noeld SSC Guru Points: 96590 More actions · Answer 5

AKM,

What the poster said was NOT what the numbers example was showing. I followed the numbers example and the post should have been the difference between the FIRST date of the employee entry and the current date So my example Still holds. If you run your query you are NOT going to get the same results than what the Numbers example showed, therefore is NOT a coincidence either

Hope is clear.

* Noel

Vladan SSC-Insane Points: 22072 More actions · Answer 6

Noeld's solution seems to be fine - there is just one question about the word "previous" in original post.

In case the records are not always ordered by date ascending, and the "previous" means "that with nearest lower value of recnbr", Noeld's query wouldn't work. But I can hardly imagine why anyone should want to (obviously) count a number of days from first occurrence, and then mess it all up by adding some value from some "previous" record - so it is 99% OK. I hope hawg willl confirm that we understood everything well, or elaborate a bit on what the query means and how it should work.

cheers, Vladan

hawg SSCertifiable Points: 6044 More actions · Answer 7

Thank you all for your help and my apologies for not being clear. Thank you especially, noeld. Your answer does work in my case.

I had just been so close to the problem for so long that I got tunnel vision and could not see outside of the solution I was trying to create.

In my data, it is a fact that the data will come in by empid and in mydate, ascending, order. The whole purpose was to get the number of days from the first date for each particular employee. I wasn't thinking in that way and I messed everyone up by giving an insight into how I was trying to solve the problem.

The recnbr field is really of no consequence now. It was there from an earlier attempt to solve this problem.

Again, my apologies for the confusion but I appreciate everyone's help and time. I know I can always get great help from this community and I'll continue to do my best to provide help when I can.

Thanks again!

----------------------------------------------------------01010011010100010100110000100000010100110110010101110010011101100110010101110010001000000101001001101111011000110110101101110011