June 18, 2007 at 8:25 am
Hi,
I've been pondering this for the past couple of days and am getting nowhere fast.
I have a denormalised table that contains a number of rows for a number of clients. Each client has one or more agreements that may or may not form a continuous chain of events - to be included in a chain the start date of one agreement must be the day after the previous agreement ended. I have managed to generate this information without too much trouble. However, I now want to be able to identify those agreements that can be linked together by giving them an identifier. I can do this on a row by row basis but it takes a long time to run (we are talking about a table with around 500k rows. Can anyone think of a rasonably simple way of running this within a single query?
The data looks something like this:
PerId AdeId dtStart dtEnd AdeFrom AdeTo Linked
1 1 2004-01-01 2004-03-31 NULL 2 1
1 2 2004-04-01 2004-05-01 1 3 1
1 3 2004-05-02 2004-05-22 2 NULL 1
1 4 2004-06-01 2004-06-06 NULL 5 2
1 5 2004-06-07 2004-07-15 4 NULL 2
1 6 2004-06-21 NULL NULL NULL 3
I don't know if this is clear - what I have is a single PerId (ie A person) A group of agreements, the start and end dates for each agreement and the chain (so agreeement 1 is followed by agreement 2, the chain ends with Agreeement 3. A new chain starts with agreeement 4 and ends with agreement 5 and so on).
Any thoughts much appreciated.
S
June 18, 2007 at 8:44 am
Stuart... some more of your ddl would be useful... along with what you expect as a result...
Mathew J Kulangara
sqladventures.blogspot.com
June 18, 2007 at 8:44 am
From your post, I can't tell if the data you are showing is your data or the result you are looking for. What would help is the DDL for your table(s) and some sample data and sample results. I would show more than one person, however, for additional clarity of what you are looking for.
June 18, 2007 at 8:58 am
sorry,
That was the expected result set - I am trying to generate the final (Linked) column. This is very much a small result set - as I say, we are looking at 500k rows (between 10 and 200 agreements per client).
So for example to achieve this dataset, I am using the following update statement (on my existing table:
update a
set a.AdeTo = c.AdeId,
a.AdeFrom = b.AdeId
from myTable a
left join myTable b on a.perid = b.perid and datediff(dd,a.dtstart,b.dtEnd) = -1 left join myTable c on a.perid = c.perid and datediff(dd,a.dtEnd,c.dtStart) = 1
and datediff(dd,c.dtStart,ISNULL(c.dtEnd,dateadd(dd,2,c.dtstart))) > 1
where b.adeid is not null or c.adeid is not null
Here is a slightly expanded sample result set - using a second client (note that the Linked counter resets for each client
PerId AdeId dtStart dtAnd AdeFrom AdeTo Linked
1 1 2004-04-01 2004-03-31 NULL 2 1
1 2 2004-04-01 2004-05-01 1 3 1
1 3 2004-05-02 2004-05-22 2 NULL 1
1 4 2004-06-01 2004-06-06 NULL 5 2
1 5 2004-06-07 2004-07-15 4 NULL 2
1 6 2004-06-21 NULL NULL NULL 3
2 7 2004-04-01 2004-03-31 NULL 8 1
2 8 2004-04-01 2004-05-01 7 9 1
2 9 2004-05-02 2004-05-22 8 NULL 1
June 19, 2007 at 10:13 am
Hi Stewart
Peter Larsson recently posted an elegant solution to this problem, see
http://www.sqlservercentral.com/forums/shwmessage.aspx?forumid=8&messageid=367397&p=4
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
July 3, 2007 at 2:18 am
Hi,
Thanks for that, it helped me a lot. Sorry for the tardy acknowledgement.
S
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply