December 13, 2010 at 3:09 pm
I need to rollup data that looks like this:
personnofromdatetodateacctstatcdmcatcd membgroup statgroup hqfieldcd changed_bychanged_datefieldcd tempfldcd
0000061976-06-01 00:00:00.0001977-11-30 00:00:00.000ENROR0103USCONV1998-08-26 00:00:00INDNULL
0000061977-12-01 00:00:00.0001981-03-31 00:00:00.000SERVR0104USCONV1998-08-26 00:00:00INDNULL
0000061981-04-01 00:00:00.0001982-04-30 00:00:00.000FREGR0105USBCROS2008-10-21 00:00:00INDNULL
0000061983-02-01 00:00:00.0001984-11-30 00:00:00.000ENROR0307USCONV1998-08-26 00:00:00INDNULL
0000062006-06-01 00:00:00.0002006-07-26 00:00:00.000AWAIR0611USLHECK2006-05-22 00:00:00NULL NULL
0000062006-07-27 00:00:00.0002008-04-16 00:00:00.000ENROR0612USDCURT2006-07-27 00:00:00INDNULL
0000062008-04-17 00:00:00.0002009-04-13 00:00:00.000SERVR0613USKWHAT2009-06-22 00:00:00INDNULL
0000062009-04-14 00:00:00.0002009-06-28 00:00:00.000SERVR0613USKWHAT2009-06-22 00:00:00INDNULL
0000062009-06-29 00:00:00.0002009-07-29 00:00:00.000FPERR0614USSSMIT2009-06-29 00:00:00INDNULL
0000062009-07-30 00:00:00.0002009-08-31 00:00:00.000FREGR0615USSSMIT2009-06-29 00:00:00INDNULL
0000062009-09-01 00:00:00.0002010-06-30 00:00:00.000SERVR0616USSSMIT2009-06-29 00:00:00INDNULL
0000062010-07-01 00:00:00.0002010-07-31 00:00:00.000FPERR0617USSSMIT2010-11-02 00:00:00NULL NULL
0000062010-08-01 00:00:00.0002010-11-30 00:00:00.000FREGR0618USSSMIT2010-11-02 00:00:00NULL NULL
0000062010-12-01 00:00:00.000NULL INTRR0619USSSMIT2010-11-02 00:00:00NULL NULL
To this:
personnomcatcdacctstatcdfromdatetodatefieldcd hqfieldcd
000006RFREG1976-06-011982-04-30INDUS
000006RENRO1983-02-011984-11-30INDUS
000006RAWAI2006-06-012006-07-26USUS
000006RSERV2006-07-272010-06-30INDUS
000006RINTR2010-07-01NULLUSUS
Here's the code to generate sample data:
USE [tempdb]
GO
CREATE TABLE [pershist]
(
[personno] [varchar](6) NOT NULL,
[fromdate] [datetime] NOT NULL,
[todate] [datetime] NULL,
[acctstatcd] [varchar](10) NOT NULL,
[mcatcd] [varchar](10) NOT NULL,
[membgroup] [varchar](2) NOT NULL,
[statgroup] [varchar](2) NOT NULL,
[hqfieldcd] [varchar](4) NOT NULL,
[changed_by] [varchar](50) NULL,
[changed_date] [smalldatetime] NULL,
[fieldcd] [varchar](4) NULL,
[tempfldcd] [varchar](4) NULL,
CONSTRAINT [PK_pershist] PRIMARY KEY CLUSTERED ([personno] ASC, [fromdate] ASC, [acctstatcd] ASC, [mcatcd] ASC)
)
GO
INSERT INTO
pershist
SELECT
'000006' AS personsno,
'1976-06-01 00:00:00.000' AS fromdate,
'1977-11-30 00:00:00.000' AS todate,
'ENRO' AS acctstatcd,
'R' AS mcatcd,
'01' AS membgroup,
'03' AS statgroup,
'US' AS hqfieldcd,
'CONV',
'1998-08-26 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'1977-12-01 00:00:00.000',
'1981-03-31 00:00:00.000',
'SERV',
'R',
'01',
'04',
'US',
'CONV',
'1998-08-26 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'1981-04-01 00:00:00.000',
'1982-04-30 00:00:00.000',
'FREG',
'R',
'01',
'05',
'US',
'BCROS',
'2008-10-21 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'1983-02-01 00:00:00.000',
'1984-11-30 00:00:00.000',
'ENRO',
'R',
'03',
'07',
'US',
'CONV',
'1998-08-26 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'2006-06-01 00:00:00.000',
'2006-07-26 00:00:00.000',
'AWAI',
'R',
'06',
'11',
'US',
'LHECK',
'2006-05-22 00:00:00',
NULL,
NULL
UNION ALL
SELECT
'000006',
'2006-07-27 00:00:00.000',
'2008-04-16 00:00:00.000',
'ENRO',
'R',
'06',
'12',
'US',
'DCURT',
'2006-07-27 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'2008-04-17 00:00:00.000',
'2009-04-13 00:00:00.000',
'SERV',
'R',
'06',
'13',
'US',
'KWHAT',
'2009-06-22 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'2009-04-14 00:00:00.000',
'2009-06-28 00:00:00.000',
'SERV',
'R',
'06',
'13',
'US',
'KWHAT',
'2009-06-22 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'2009-06-29 00:00:00.000',
'2009-07-29 00:00:00.000',
'FPER',
'R',
'06',
'14',
'US',
'SSMIT',
'2009-06-29 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'2009-07-30 00:00:00.000',
'2009-08-31 00:00:00.000',
'FREG',
'R',
'06',
'15',
'US',
'SSMIT',
'2009-06-29 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'2009-09-01 00:00:00.000',
'2010-06-30 00:00:00.000',
'SERV',
'R',
'06',
'16',
'US',
'SSMIT',
'2009-06-29 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'2010-07-01 00:00:00.000',
'2010-07-31 00:00:00.000',
'FPER',
'R',
'06',
'17',
'US',
'SSMIT',
'2010-11-02 00:00:00',
NULL,
NULL
UNION ALL
SELECT
'000006',
'2010-08-01 00:00:00.000',
'2010-11-30 00:00:00.000',
'FREG',
'R',
'06',
'18',
'US',
'SSMIT',
'2010-11-02 00:00:00',
NULL,
NULL
UNION ALL
SELECT
'000006',
'2010-12-01 00:00:00.000',
NULL,
'INTR',
'R',
'06',
'19',
'US',
'SSMIT',
'2010-11-02 00:00:00',
NULL,
NULL
GO
IF OBJECT_ID('tempdb.dbo.pershist') IS NOT NULL
BEGIN
DROP TABLE pershist
END
Basically this data is a set of work assignment locations and statuses for an employee and I essentially need to create "header" rows for each assignment to a location that is contiguous. This is based on personno, membgroup, and fieldcd. Fieldcd is actually determined using COALESCE(tempfldcd, fieldcd, hqfieldcd).
I thought I could get RANK() (or DENSE_RANK) to work but it with this specific set of data the row with data with a fromdate of 2006-06-01 ends up in the same group as the rows beginning with the row with the fromdate of 2010-07-01 and I need these to be separate rows in the rolled up data because they are not contiguous. I'm thinking a combination of RANK() and ROW_NUMBER() (to get contiguous rows), but thought someone here might have a better way.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
December 13, 2010 at 5:49 pm
Jack,
How are you deciding which AcctStatCD to use here? Is it always the final one, can it be any, or is there a precidence order? For example, in the first rollup, there's ENRO, FREG, and SERV as options for the rollup.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
December 13, 2010 at 6:12 pm
Jack,
This is close. I *believe* it should work for you, but I didn't have a chance to expand the dataset to make sure I didn't step on anything. Once you let us know how you want to decide the other columns, we can include those into this.
R3 is superfluous, I left it in for your larger data set reviewing of the results.
One thing I am getting though is that MAX() is ignoring your NULL, so you'll notice the final entry is a GETDATE() swap via an ISNULL().
;WITH cte AS
(select
personno,
fromdate,
todate,
membgroup,
ISNULL( fieldcd, 'US') AS FieldCD,
Row_Number() OVER ( Partition BY PersonNo Order By FromDate ) AS RowNum,
Row_Number() OVER ( Partition BY PersonNo, membgroup Order By FieldCD, FromDate ) AS RowNum2,
Row_Number() OVER ( Partition BY PersonNo, membgroup, FieldCD Order By FromDate ) AS RowNum3
FROM
pershist
)
, cte2 AS
(select
*,
RowNum - RowNum2 AS r2,
RowNum - RowNum3 AS r3
FROM
cte
)
SELECT
Personno,
--MAX( mCatCD) AS mCatCD,
--MAX( AcctStatCD) AS AcctStatCD,
MIN( FromDate) AS MinFromDate,
MAX( ISNULL( ToDate, GETDATE())) AS MaxToDate,
FieldCD,
'US' AS HQFieldCD
FROM
cte2
GROUP BY
PersonNo,
FieldCD,
r2
ORDER BY
PersonNo,
MIN( FromDate)
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
December 13, 2010 at 8:17 pm
Craig,
Looks like it might work. I'm still trying to understand how/why it works but it does look good.
Here's what's needed for the rest of the columns:
acctstatcd needs to be the most recent/last for each grouping
fromdate should be the min fromdate for each group
todate should be the max todate for each group with NULL being the max in the case where there is a NULL. I can't actually use getdate() for the nulls because it is possible to have a future todate. Don't ask me why because I don't know, I just know it happens.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
December 13, 2010 at 8:46 pm
Okay, here's what I've got now:
USE [tempdb]
GO
CREATE TABLE [pershist]
(
[personno] [varchar](6) NOT NULL,
[fromdate] [datetime] NOT NULL,
[todate] [datetime] NULL,
[acctstatcd] [varchar](10) NOT NULL,
[mcatcd] [varchar](10) NOT NULL,
[membgroup] [varchar](2) NOT NULL,
[statgroup] [varchar](2) NOT NULL,
[hqfieldcd] [varchar](4) NOT NULL,
[changed_by] [varchar](50) NULL,
[changed_date] [smalldatetime] NULL,
[fieldcd] [varchar](4) NULL,
[tempfldcd] [varchar](4) NULL,
CONSTRAINT [PK_pershist] PRIMARY KEY CLUSTERED ([personno] ASC, [fromdate] ASC, [acctstatcd] ASC, [mcatcd] ASC)
)
GO
INSERT INTO
pershist
SELECT
'000006' AS personsno,
'1976-06-01 00:00:00.000' AS fromdate,
'1977-11-30 00:00:00.000' AS todate,
'ENRO' AS acctstatcd,
'R' AS mcatcd,
'01' AS membgroup,
'03' AS statgroup,
'US' AS hqfieldcd,
'CONV',
'1998-08-26 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'1977-12-01 00:00:00.000',
'1981-03-31 00:00:00.000',
'SERV',
'R',
'01',
'04',
'US',
'CONV',
'1998-08-26 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'1981-04-01 00:00:00.000',
'1982-04-30 00:00:00.000',
'FREG',
'R',
'01',
'05',
'US',
'BCROS',
'2008-10-21 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'1983-02-01 00:00:00.000',
'1984-11-30 00:00:00.000',
'ENRO',
'R',
'03',
'07',
'US',
'CONV',
'1998-08-26 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'2006-06-01 00:00:00.000',
'2006-07-26 00:00:00.000',
'AWAI',
'R',
'06',
'11',
'US',
'LHECK',
'2006-05-22 00:00:00',
NULL,
NULL
UNION ALL
SELECT
'000006',
'2006-07-27 00:00:00.000',
'2008-04-16 00:00:00.000',
'ENRO',
'R',
'06',
'12',
'US',
'DCURT',
'2006-07-27 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'2008-04-17 00:00:00.000',
'2009-04-13 00:00:00.000',
'SERV',
'R',
'06',
'13',
'US',
'KWHAT',
'2009-06-22 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'2009-04-14 00:00:00.000',
'2009-06-28 00:00:00.000',
'SERV',
'R',
'06',
'13',
'US',
'KWHAT',
'2009-06-22 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'2009-06-29 00:00:00.000',
'2009-07-29 00:00:00.000',
'FPER',
'R',
'06',
'14',
'US',
'SSMIT',
'2009-06-29 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'2009-07-30 00:00:00.000',
'2009-08-31 00:00:00.000',
'FREG',
'R',
'06',
'15',
'US',
'SSMIT',
'2009-06-29 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'2009-09-01 00:00:00.000',
'2010-06-30 00:00:00.000',
'SERV',
'R',
'06',
'16',
'US',
'SSMIT',
'2009-06-29 00:00:00',
'IND',
NULL
UNION ALL
SELECT
'000006',
'2010-07-01 00:00:00.000',
'2010-07-31 00:00:00.000',
'FPER',
'R',
'06',
'17',
'US',
'SSMIT',
'2010-11-02 00:00:00',
NULL,
NULL
UNION ALL
SELECT
'000006',
'2010-08-01 00:00:00.000',
'2010-11-30 00:00:00.000',
'FREG',
'R',
'06',
'18',
'US',
'SSMIT',
'2010-11-02 00:00:00',
NULL,
NULL
UNION ALL
SELECT
'000006',
'2010-12-01 00:00:00.000',
NULL,
'INTR',
'R',
'06',
'19',
'US',
'SSMIT',
'2010-11-02 00:00:00',
NULL,
NULL
GO
;WITH cte AS
(select
personno,
fromdate,
todate,
membgroup,
mcatcd,
acctstatcd,
COALESCE(tempfldcd, fieldcd, hqfieldcd) AS FieldCD,
Row_Number() OVER ( Partition BY PersonNo Order By FromDate ) AS RowNum,
Row_Number() OVER ( Partition BY PersonNo, membgroup Order By FieldCD, FromDate ) AS RowNum2,
Row_Number() OVER ( Partition BY PersonNo, membgroup, FieldCD Order By FromDate ) AS RowNum3
FROM
pershist
)
, cte2 AS
(select
*,
RowNum - RowNum2 AS r2,
RowNum - RowNum3 AS r3
FROM
cte
)
, cte3 AS
(
SELECT
Personno,
mCatCD,
acctstatcd,
MIN( FromDate) AS MinFromDate,
MAX( ISNULL( ToDate, GETDATE())) AS MaxToDate,
FieldCD,
'US' AS HQFieldCD,
r2 AS assignment_group,
ROW_NUMBER() OVER (PARTITION BY r2 ORDER BY MIN(Fromdate)) AS group_order,
ROW_NUMBER() OVER (PARTITION BY r2 ORDER BY MIN(Fromdate) desc) AS group_order_desc
FROM
cte2 A
GROUP BY
PersonNo,
FieldCD,
mcatcd,
acctstatcd,
r2
)
SELECT
A.personno,
A.HQFieldCD,
A.FieldCD,
A.mcatcd,
B.acctstatcd,
A.MinFromDate,
B.MaxToDate
FROM
cte3 A JOIN
cte3 B
ON A.assignment_group = B.assignment_group AND
A.personno = B.personno AND
A.HQFieldCD = B.HQFieldCD AND
A.mcatcd = B.mcatcd
WHERE
A.group_order = 1 AND
B.group_order_desc = 1
IF OBJECT_ID('tempdb.dbo.pershist') IS NOT NULL
BEGIN
DROP TABLE pershist
END
This gives me the right results for this specific case. I can check it against the rest of the data in the morning.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
December 16, 2010 at 2:16 pm
Hey Jack, sorry I fell off the earth on this, kind of got distracted with my own local insanity.
How'd this end up working out for you?
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
December 16, 2010 at 8:25 pm
Seems to be working correctly. I did have to use r3 instead of r2 for later grouping because I found instances where r2 was not grouping correctly.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply