May 6, 2013 at 11:51 am
The problem here is that the grpID field is sometimes loaded with multiple values for a single element of the set. The goal of this program is to split those occurrences into a separate row for each individual value (whenever multiple GroupId values occur in a single element, there are exactly 10 characters between them), and then to eliminate an unnecessary prefix modifier (either "." or "-"). This iterative solution works, but it is clearly non-standard and will not scale well once put into production. I was hoping someone with experience might be able to point me in the right direction as to how this solution can be transformed into a set-based approach. Thanks.
USE OPMSupportStage;
GO
DECLARE @counterValue INT
DECLARE @numberOfRecords INT
DECLARE @subStrStart INT
DECLARE @grpID VARCHAR(30)
DECLARE @x CHAR(1)

SELECT @numberOfRecords = COUNT(*) FROM dbo.Active
SET @counterValue = 1;

-- Empty the target table and reseed its identity column
DELETE
FROM dbo.ActiveFinal;
DBCC CHECKIDENT ('dbo.ActiveFinal', RESEED, 1);

-- Outer loop: visit the source rows one P_Id at a time
WHILE @counterValue <= @numberOfRecords + 25
BEGIN
    -- @grpID holds the length of MAGrp for the current row
    SELECT @grpID = LEN(d.MAGrp)
    FROM dbo.Active As D
    WHERE d.P_Id = @counterValue

    SET @subStrStart = 1;

    -- Inner loop: insert one row per 9-character value, stepping 10 characters at a time
    WHILE @subStrStart < @grpID
    BEGIN
        INSERT INTO dbo.ActiveFinal(GrpName2, MAGrp2, MOSGrp2, EffectiveDt2)
        SELECT e.GrpName, SUBSTRING(e.MAGrp, @subStrStart, 9), e.MOSGrp, e.EffectiveDt
        FROM dbo.Active As E
        WHERE e.P_Id = @counterValue

        SET @subStrStart = @subStrStart + 10;
    END

    -- Strip the prefix modifier found at character position 6
    SELECT @x = SUBSTRING(MAGrp2, 6, 1) FROM dbo.ActiveFinal WHERE P_Id = @counterValue

    IF @x = '-'
    BEGIN
        UPDATE dbo.ActiveFinal
        SET MAGrp2 = REPLACE(MaGrp2,'-','')
        WHERE P_Id = @counterValue
    END
    ELSE IF @x = '.'
    BEGIN
        UPDATE dbo.ActiveFinal
        SET MAGrp2 = REPLACE(MaGrp2,'.','')
        WHERE P_Id = @counterValue
    END

    SET @counterValue = @counterValue + 1;
END
GO
May 6, 2013 at 2:52 pm
Hi
Without some example data, this is just a guess at how you could do it. The following example depends on the IDs being 9 characters long with an arbitrary delimiter between them.
Hope this helps
with sampledata as (
    -- Guess at how the data may look
    SELECT *
    FROM (VALUES
        ('aaaaa-aaa','one')
        ,('bbbbbbbbb.ccccccccc','two')
        ,('ddddddddd-eeeee.eee-fffffffff','three')
        ,('ggggggggg.hhhhhhhhh-iiiiiiiii.jjjjjjjjj','four')
        ,('kkkk-kkkk-lllllllll.mmmmmmmmm.nnnnnnnnn-ooooooooo','five')
        ,('ppppppppp.qqqqqqqqq|rrrrrrrrr|sssssssss,ttttttttt;uuuuuuuuu','six')
        ,('vvvvvvvvv.wwwwwwwww.xxxxxxxxx.yyyyyyyyy.zzzzzzzzz.000000000.111111111','seven')
        ) AS SD(ID,VALUE)
)
-- Little tally table to split the id with
,smallTally as (
    SELECT * FROM (VALUES (0), (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) AS T(N)
)
-- Unpivot the id column
,unpivotID as (
    SELECT SUBSTRING(s.ID,(t.N * 10) + 1,9) ID -- Get exactly 9 characters starting at every 10th position
        , s.VALUE, N
    FROM sampledata s
    cross apply (
        SELECT TOP ((len(s.id) / 10) + 1) -- Determine the number of IDs to unpivot
            N
        FROM smallTally
        ORDER BY N) t
)
-- Remove the unnecessary prefix modifier at character location 6
SELECT CASE WHEN CHARINDEX('-',REPLACE(ID,'.','-')) = 6 THEN STUFF(ID,6,1,'') ELSE ID END ID
    , VALUE
FROM unpivotID
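To tie this back to the tables in the first post, here is a rough, untested sketch of the same tally idea reading from dbo.Active and writing to dbo.ActiveFinal. The table and column names come from the loop in the first post. The CTE name splitIDs is made up for illustration, and the sketch assumes that each ID is 9 characters long and starts at every 10th character of MAGrp. It removes only the single character at position 6 (using STUFF, as in the query above) rather than every "-" or "." in the value, so check that against your real data first.

```sql
-- Sketch only: assumes 9-character IDs starting at positions 1, 11, 21, ...
WITH smallTally AS (
    -- Covers up to 11 IDs per MAGrp value; extend the list if values can be longer
    SELECT N FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) AS T(N)
),
splitIDs AS (   -- hypothetical CTE name
    SELECT a.GrpName,
           SUBSTRING(a.MAGrp, (t.N * 10) + 1, 9) AS MAGrp,
           a.MOSGrp,
           a.EffectiveDt
    FROM dbo.Active AS a
    JOIN smallTally AS t
      ON (t.N * 10) + 1 <= LEN(a.MAGrp)   -- one row per ID start position inside the string
)
INSERT INTO dbo.ActiveFinal (GrpName2, MAGrp2, MOSGrp2, EffectiveDt2)
SELECT GrpName,
       CASE WHEN SUBSTRING(MAGrp, 6, 1) IN ('-', '.')
            THEN STUFF(MAGrp, 6, 1, '')   -- drop the modifier at position 6
            ELSE MAGrp
       END,
       MOSGrp,
       EffectiveDt
FROM splitIDs;
```

This does the split and the cleanup for every row in one statement, so neither WHILE loop is needed.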
May 6, 2013 at 3:23 pm
The problem is that I don't have a terminating set of distorted elements in the set. The monthly unload could produce anywhere from hundreds to thousands of instances of duplicate IDs. It's more an issue of poor input validation on the source data, but that problem is simply out of my control.
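For what it's worth, a tally-based split does not need to know how many rows are affected. The tally only has to be as long as the largest number of IDs packed into a single MAGrp value, and every row of dbo.Active is handled by the same statement. A rough, untested sketch, assuming the 10-character spacing described in the first post (the CTE names e1, e2 and tally are made up for illustration):

```sql
-- How many IDs does the longest MAGrp value hold? This is the only size that matters.
SELECT MAX(LEN(MAGrp)) / 10 + 1 AS MaxIDsPerRow
FROM dbo.Active;

-- Sketch only: a 100-number tally (0..99) covers MAGrp values up to 1000 characters
WITH e1 AS (
    SELECT x FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS v(x)   -- 10 rows
),
e2 AS (
    SELECT a.x FROM e1 AS a CROSS JOIN e1 AS b                               -- 100 rows
),
tally AS (
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS N               -- 0..99
    FROM e2
)
SELECT a.P_Id,
       t.N,
       SUBSTRING(a.MAGrp, t.N * 10 + 1, 9) AS MAGrp
FROM dbo.Active AS a
JOIN tally AS t
  ON t.N * 10 + 1 <= LEN(a.MAGrp);   -- rows with a single ID match only N = 0
```

Whether 100 or 1000 rows contain multiple IDs makes no difference to the query; the join condition simply produces more output rows for the longer values.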
May 6, 2013 at 4:40 pm
If you could post some sample data showing some of the issues and your expected results we can get a clearer idea of what you require.
Here's a link that details best practice
http://www.sqlservercentral.com/articles/Best+Practices/61537/
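As a rough idea of what is useful to post, something along these lines is usually enough. Everything here is a placeholder: the temp table name, the column types and all of the values are made up, and only the column names are taken from the first post.

```sql
-- Hypothetical repro script: column types and all values are invented for illustration
CREATE TABLE #Active (
    P_Id        INT IDENTITY(1,1) PRIMARY KEY,
    GrpName     VARCHAR(50),
    MAGrp       VARCHAR(200),
    MOSGrp      VARCHAR(30),
    EffectiveDt DATE
);

INSERT INTO #Active (GrpName, MAGrp, MOSGrp, EffectiveDt)
VALUES ('Group A', 'aaaaa-aaa',           'M1', '20130501'),   -- single ID, "-" at position 6
       ('Group B', 'bbbbbbbbb ccccc.ccc', 'M2', '20130501');   -- two IDs in one value

-- Expected result: one row per ID, with the modifier removed
-- GrpName   MAGrp2
-- Group A   aaaaaaaa
-- Group B   bbbbbbbbb
-- Group B   cccccccc
```

A script like this lets anyone run a suggestion against the same data before posting it.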