January 9, 2011 at 2:54 pm
Hi All
I have a rather easy looking yet complex sql query. Just wondering if anyone could shed some light on it? Basically i have a row which has a column with a comma seperated string... BUT the client wants this converted to expand over multiple rows. Example below.
Thanks very much in advance!
My original data
Col1 Col2 Col3
John Doe 12,34,4.5,9,10
Jane Doe 3,9,9
Desired Outcome
Col1 Col2 Col3
John Doe 12
John Doe 34
John Doe 4.5
John Doe 9
John Doe 10
Jane Doe 3
Jane Doe 9
Jane Doe 9
January 9, 2011 at 3:11 pm
Read this article : http://www.sqlservercentral.com/articles/T-SQL/62867/
Especially looking out for the section One Final "Split" Trick with the Tally Table
But really, read the article and understand it before you use the code!
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
January 9, 2011 at 4:25 pm
Thanks MM. I've just finished reading the entire article and will start working on my sql now 🙂
If anyone already has a sql already done please feel free to post it as well :hehe:
January 9, 2011 at 5:38 pm
Hey MM, I've just finished my script and it works a treat!!! Thanks so much for your help 😀
January 9, 2011 at 6:37 pm
Sichy (1/9/2011)
Hey MM, I've just finished my script and it works a treat!!! Thanks so much for your help 😀
No problem - it was the article that helped I hope - thanks for the feedback - if you feel like posting your script, others may find it and be helped in turn 😀
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
January 9, 2011 at 6:47 pm
Yes of course here is the script
Step 1 - Create the Tally Table
--=============================================================================
-- Setup. Duration - 00:02:31 Rows - 158,533,281
--=============================================================================
USE Source_Cleansed_DM
SET NOCOUNT ON --Supress the auto-display of rowcounts for appearance/speed
DECLARE @StartTime DATETIME --Timer to measure total duration
SET @StartTime = GETDATE() --Start the timer
--=============================================================================
-- Create and populate a Tally table
--=============================================================================
--===== Conditionally drop
IF OBJECT_ID('dbo.Tally') IS NOT NULL
DROP TABLE dbo.Tally
--===== Create and populate the Tally table on the fly
SELECT TOP 158533281 --158,533,281
IDENTITY(INT,1,1) AS N
INTO dbo.Tally
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
--===== Add a Primary Key to maximize performance
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_N
PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
--===== Let the public use it
GRANT SELECT, REFERENCES ON dbo.Tally TO PUBLIC
--===== Display the total duration - 221830 Milliseconds duration
SELECT STR(DATEDIFF(ms,@StartTime,GETDATE())) + ' Milliseconds duration'
select COUNT(*) from dbo.Tally
Step 2 and 3 - Create temp table and run split to normalize table. (i've commented out the create sample table as this only needs to be run once obviously)
--===== Create a sample denormalized table with a CSV column
-- CREATE TABLE #testtable
-- (PK INT IDENTITY(1,1) PRIMARY KEY CLUSTERED
-- ,Fact1 nvarchar(20)
--,Fact2 nvarchar(20)
-- ,[Values] nvarchar(max))
--insert into #testtable (Fact1, Fact2, [Values])
--VALUES ('test1','Fact2','1,2.9,4,9')
--insert into #testtable (Fact1, Fact2, [Values])
--VALUES ('test3','Fact4','2,6,7.3,5.5')
--insert into #testtable (Fact1, Fact2, [Values])
--VALUES ('test5','Fact4','4,7,9')
--insert into #testtable (Fact1, Fact2, [Values])
--VALUES ('test1','Fact9','114,514,5,343.45,8,234,5,912,345')
--insert into #testtable (Fact1, Fact2, [Values])
--VALUES ('test1','Fact2','11,4514,5,343.45,82,345,9,1,2345')
--===== Split or "Normalize" the whole table at once
SELECT tt.PK,tt.Fact1,tt.Fact2,
SUBSTRING(','+tt.[Values]+',',N+1,CHARINDEX(',',','+tt.[Values]+',',N+1)-N-1) AS Value
FROM dbo.Tally t
CROSS JOIN #testtable tt
WHERE t.N < LEN(','+tt.[Values]+',')
AND SUBSTRING(','+tt.[Values]+',',N,1) = ','
January 9, 2011 at 9:24 pm
Sichy (1/9/2011)
I'm really curious about this line of code:
SELECT TOP 158533281 --158,533,281
What does that number represent? It's a really odd number to use - why not 200 million?
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
January 9, 2011 at 11:07 pm
WayneS (1/9/2011)
Sichy (1/9/2011)
Sichy,I'm really curious about this line of code:
SELECT TOP 158533281 --158,533,281
What does that number represent? It's a really odd number to use - why not 200 million?
I'll add to that... why so big? Are you really splitting strings with that many bytes in them?
--Jeff Moden
Change is inevitable... Change for the better is not.
January 10, 2011 at 12:44 am
Good question! At first i actually didn't have access to the table and i couldn't imagine how many rows it would initialy have so i simply put 999999999999... When i ran the tally table and did a count of rows i found that it actually maxed out at 158,533,281. So when i needed to create the tally table on a different DB i just replaced my 99999999999 with 158,533,281 to save a few mins on the tally table creation.
p.s. Turns out the initial table only has about 900,000 rows so now I've changed the number to 1,000,000 to save even more time.
January 10, 2011 at 12:45 am
Oh forgot to mention, Thanks Jeff Moden for writing the article 🙂
January 10, 2011 at 7:51 am
Sichy (1/10/2011)
Good question! At first i actually didn't have access to the table and i couldn't imagine how many rows it would initialy have so i simply put 999999999999... When i ran the tally table and did a count of rows i found that it actually maxed out at 158,533,281. So when i needed to create the tally table on a different DB i just replaced my 99999999999 with 158,533,281 to save a few mins on the tally table creation.p.s. Turns out the initial table only has about 900,000 rows so now I've changed the number to 1,000,000 to save even more time.
There's a bit of a misnomer that you may have taken. The Tally Table [font="Arial Black"]doesn't [/font]need the same number of rows as the table data that you're trying to split. It only needs the same number of rows as the largest number of bytes in a column that you're trying to split. Generally speaking, the largest datatype most folks need to split is VARCHAR(8000). With that in mind, the largest your Tally Table needs to be to be able to do splits on such a column is only 8000 numbers.
And, thanks for the feedback on the article.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 13, 2011 at 3:47 pm
Ahhh yes i see. Thanks again Jeff your help is greatly appreciated!
JS
January 13, 2011 at 7:25 pm
Sichy (1/13/2011)
Ahhh yes i see. Thanks again Jeff your help is greatly appreciated!JS
You bet. Thanks for the feedback.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply