August 26, 2011 at 7:43 am
I have a table (example testing code below) that has a number of records grouped for a particular individual (identified by the first two fields of the table) over a number of dates. This contains the address information at that particular date.
My problem is I need to build up an address history of the start and end date that person was at a particular address. In other words, I need to look at the table in date order for an individual and look for when the address next changes. My desired result is in the address change spreadsheet attached i.e. I need a "From date" and "to date" for each distinct address, with the last record having a blank to date as that's the current address.
How would I go about this please?
create table #temp (
[RecNum] [nvarchar](6),
[RecInd] int,
[Address1] nvarchar(30),
[DateApplied] [datetime] NOT NULL
) ON [PRIMARY]
Insert #temp
select recnum = '100001', 1,'10 Acacia Avenue','01 sep 2010'
union all
select recnum = '100001', 1,'10 Acacia Avenue','01 Oct 2010'
union all
select recnum = '100001', 1,'10 Acacia Avenue','01 Nov 2010'
union all
select recnum = '100001', 1,'11 Acacia Avenue','01 dec 2010'
union all
select recnum = '100001', 1,'11 Acacia Avenue','01 jan 2011'
union all
select recnum = '100001', 1,'10 Banana Street','01 feb 2011'
union all
select recnum = '100001', 1,'10 Banana Street','01 mar 2011'
union all
select recnum = '100001', 1,'10 Banana Street','01 apr 2011'
union all
select recnum = '100025', 0,'10 Downing Street','01 jan 2011'
union all
select recnum = '100025', 0,'1 Pennsylvania Avenue','01 feb 2011'
August 26, 2011 at 8:03 am
What is the relationship between RecNum and RecInd? I've left out RecInd in the solution below because it looks redundant.
The first thing to do is to group by RecNum and Address1 - this will give you the date on which each person moved into each address:
SELECT RecNum, Address1, MIN(DateApplied) AS [From Date]
FROM #temp
GROUP BY RecNum, Address1
ORDER BY RecNum, MIN(DateApplied)
Now, since the To Date is just the From Date from the previous row where RecInd is the same, or blank (NULL) where it isn't, I think I would leave it to my presentation layer to add that column to my result set. If you insist on doing it in T-SQL, you could number your rows (partitioned by RecNum) and do a self-join on RecNum = RecNum and RowNo = RowNo+1. Have a read about ROW_NUMBER to find out how to number your rows.
John
August 26, 2011 at 8:33 am
John Mitchell-245523 (8/26/2011)
What is the relationship between RecNum and RecInd? I've left out RecInd in the solution below because it looks redundant.The first thing to do is to group by RecNum and Address1 - this will give you the date on which each person moved into each address:
SELECT RecNum, Address1, MIN(DateApplied) AS [From Date]
FROM #temp
GROUP BY RecNum, Address1
ORDER BY RecNum, MIN(DateApplied)
Now, since the To Date is just the From Date from the previous row where RecInd is the same, or blank (NULL) where it isn't, I think I would leave it to my presentation layer to add that column to my result set. If you insist on doing it in T-SQL, you could number your rows (partitioned by RecNum) and do a self-join on RecNum = RecNum and RowNo = RowNo+1. Have a read about ROW_NUMBER to find out how to number your rows.
John
If you think of RecNum as being a customer but a customer can me Mr Bloggs and Mrs Bloggs. RecInd is the identifier for the "party" i.e. RecInd 0 would be Mr Bloggs and RecInd 1 would be Mrs Bloggs (they may have split up for example).
Mmm. I'll see if I can work that out.
August 26, 2011 at 9:19 am
Griffster (8/26/2011)
John Mitchell-245523 (8/26/2011)
What is the relationship between RecNum and RecInd? I've left out RecInd in the solution below because it looks redundant.The first thing to do is to group by RecNum and Address1 - this will give you the date on which each person moved into each address:
SELECT RecNum, Address1, MIN(DateApplied) AS [From Date]
FROM #temp
GROUP BY RecNum, Address1
ORDER BY RecNum, MIN(DateApplied)
Now, since the To Date is just the From Date from the previous row where RecInd is the same, or blank (NULL) where it isn't, I think I would leave it to my presentation layer to add that column to my result set. If you insist on doing it in T-SQL, you could number your rows (partitioned by RecNum) and do a self-join on RecNum = RecNum and RowNo = RowNo+1. Have a read about ROW_NUMBER to find out how to number your rows.
John
If you think of RecNum as being a customer but a customer can me Mr Bloggs and Mrs Bloggs. RecInd is the identifier for the "party" i.e. RecInd 0 would be Mr Bloggs and RecInd 1 would be Mrs Bloggs (they may have split up for example).
Mmm. I'll see if I can work that out.
Just thought, this won't work if the person moved away from an address to a different address, and then returned to the first address a while later. See what I mean?
August 30, 2011 at 1:45 am
Possibly. It works for the sample data you supplied. Please will you supply some data for which it won't work, and I'll take a look?
John
September 1, 2011 at 5:42 am
John Mitchell-245523 (8/30/2011)
Possibly. It works for the sample data you supplied. Please will you supply some data for which it won't work, and I'll take a look?John
Here you go...
create table #temp (
[RecNum] [nvarchar](6),
[RecInd] int,
[Address1] nvarchar(30),
[DateApplied] [datetime] NOT NULL )
ON [PRIMARY]
Insert #tempselect recnum = '100001', 1,'10 Acacia Avenue','01 sep 2010'
union all
select recnum = '100001', 1,'10 Acacia Avenue','01 Oct 2010'
union all
select recnum = '100001', 1,'10 Acacia Avenue','01 Nov 2010'
union all
select recnum = '100001', 1,'11 Banana Street','01 dec 2010'
union all
select recnum = '100001', 1,'11 Banana Street','01 jan 2011'
union all
select recnum = '100001', 1,'10 Acacia Avenue','01 feb 2011'
union all
select recnum = '100001', 1,'10 Acacia Avenue','01 mar 2011'
union all
select recnum = '100001', 1,'10 Acacia Avenue','01 apr 2011'
union all
select recnum = '100025', 0,'10 Downing Street','01 jan 2011'
union all
select recnum = '100025', 0,'1 Pennsylvania Avenue','01 feb 2011'
September 1, 2011 at 8:23 am
I see what the issue is here, but I havn't got time to give it much consideration at the moment. The issue is that you have the same person living at the same addresses for two separate periods, and you therefore want them counted separately. To get you started, you could search this site for "gaps and islands" and see whether that gives you any clues. Please post back and let us know how you're getting on.
John
September 2, 2011 at 8:32 am
Hello,
I tried to solve this problem in a chain of transformations, see the code below
create table #temp (
[RecNum] [nvarchar](6),
[RecInd] int,
[Address1] nvarchar(30),
[DateApplied] [datetime] NOT NULL
) ON [PRIMARY]
Insert #temp
select recnum = '100001', 1,'10 Acacia Avenue','01 sep 2010' union all
select recnum = '100001', 1,'10 Acacia Avenue','01 Oct 2010' union all
select recnum = '100001', 1,'10 Acacia Avenue','01 Nov 2010' union all
select recnum = '100001', 1,'11 Acacia Avenue','01 dec 2010' union all
select recnum = '100001', 1,'11 Acacia Avenue','01 jan 2011' union all
select recnum = '100001', 1,'10 Banana Street','01 feb 2011' union all
select recnum = '100001', 1,'10 Banana Street','01 mar 2011' union all
select recnum = '100001', 1,'10 Banana Street','01 apr 2011' union all
select recnum = '100025', 0,'10 Downing Street','01 jan 2011' union all
select recnum = '100025', 0,'1 Pennsylvania Avenue','01 feb 2011'
;
-- First step. Add a RowNumber to be able to join the table with herself
WITH OrderedAddresses AS (
SELECT RecNum, RecInd, Address1, DateApplied
, ROW_NUMBER() OVER (PARTITION BY RecNum, RecInd ORDER BY DateApplied) AS RowNumber
FROM #Temp
)
-- Second step. Join the table with herself linking each row with the previous date;
-- then filter to avoid repeated addresses and maintain only address changes.
, FilteredAddresses AS (
SELECT A.*
FROM OrderedAddresses A
LEFT JOIN OrderedAddresses B ON B.RecNum = A.RecNum AND B.RecInd = A.RecInd AND A.RowNumber = B.Rownumber + 1
WHERE A.Address1 <> ISNULL(B.Address1, '')
)
-- Third step. Add another RowNumber, to apply another join.
, OrderAgain AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY RecNum, RecInd ORDER BY DateApplied) AS NewRowNumber
FROM FilteredAddresses
)
-- Last step. Join the table with herself to obtain the date of leaving each address.
SELECT A.*, B.DateApplied AS EndDate
FROM OrderAgain A
LEFT JOIN OrderAgain B ON B.RecNum = A.RecNum AND B.RecInd = A.RecInd AND B.NewRowNumber = A.NewRowNumber + 1
drop table #temp
Hope this helps,
Francesc
September 8, 2011 at 7:31 am
Thanks Francesc, that works a dream! Good concept to work with in the future. Many thanks.
September 13, 2011 at 3:54 am
I have a slight problem with this and I'm really struggling to amend the code to cope with it. I have some really awkward files where the first address1 field is a null. Code falls over then. How would I cater for them please?
September 13, 2011 at 4:19 am
Maybe yo can replace these null values by some not null string. I tried replacing
WITH OrderedAddresses AS (
SELECT RecNum, RecInd, Address1, DateApplied
, ROW_NUMBER() OVER (PARTITION BY RecNum, RecInd ORDER BY DateApplied) AS RowNumber
FROM #Temp
)
by
WITH OrderedAddresses AS (
SELECT RecNum, RecInd, ISNULL(Address1, 'No address') AS Address1, DateApplied
, ROW_NUMBER() OVER (PARTITION BY RecNum, RecInd ORDER BY DateApplied) AS RowNumber
FROM #Temp
)
Francesc
September 13, 2011 at 7:13 am
Aha. That seems to have worked. I (stupidly) was concentrating on the second CTE (filteredaddresses) and was testing isnull(A.Address1,'') <> ISNULL(B.Address1, '')) in my where clause. Thanks again.
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply