November 6, 2015 at 5:19 am
Hi,
Need help to shred pipe delimmited data into a table, cant seem to find anything that will help me do this.
for example the data looks roughly like this
234|2015-6-11|234aa|1
ID1|a|A||234|ASDFASDF|2015-06-11|A|s||NULL|||||TEST|TEST|TEST|TEST|1543211
ID2|B|A||234|ASDFASDF|2015-06-11|A|s||NULL|||||TEST|TEST|TEST|TEST|1234
I have approximatly 200 rows of these type of lines of data can anyone provide a potential awnser that can help me add each column into the same row of a temp table
November 6, 2015 at 5:49 pm
xxxy2k (11/6/2015)
Hi,Need help to shred pipe delimmited data into a table, cant seem to find anything that will help me do this.
for example the data looks roughly like this
234|2015-6-11|234aa|
ID1|a|A||234|ASDFASDF|2015-06-11|A|s||NULL|||||TEST|TEST|TEST|TEST|1543211
ID2|B|A||234|ASDFASDF|2015-06-11|A|s||NULL|||||TEST|TEST|TEST|TEST|1234
I have approximatly 200 rows of these type of lines of data can anyone provide a potential awnser that can help me add each column into the same row of a temp table
Considering that the first row only has 4 elements (one of them is a trailing empty element) and the rest of more than that, what do you want to do about such a disparity in the number of elements?
--Jeff Moden
Change is inevitable... Change for the better is not.
November 9, 2015 at 2:39 am
The sample of data im using is test data, i wouldnt get hung up about the quality of the data. I've amended the data you highlighted to explain the data further first line is a header row, 2nd row of data and onwards is the actual data.
November 9, 2015 at 6:46 am
There's two parts to this. First is getting it into SQL, which you can do like this: (props to whoever provided this sample. it didn't say)
DECLARE @FileContents VARCHAR(MAX)
SELECT @FileContents=BulkColumn
FROM OPENROWSET(BULK'PathToYourFile.sql',SINGLE_BLOB) x;
Second part is just use Moden's string splitter and insert the result into a table.
November 13, 2015 at 4:01 am
Final peice will be something like this
select
dbo.GetDomain(t.test,'|',0) as col2,
dbo.GetDomain(t.test,'|',1) as col3,
dbo.GetDomain(t.test,'|',2) as col4
from #test t
go
November 13, 2015 at 6:27 am
xxxy2k (11/13/2015)
Final peice will be something like thisselect
dbo.GetDomain(t.test,'|',0) as col2,
dbo.GetDomain(t.test,'|',1) as col3,
dbo.GetDomain(t.test,'|',2) as col4
from #test t
go
Please post your "GetDomain" function so that I can show you why you shouldn't actually be doing it that way. 😉
Also, you said...
The sample of data im using is test data, i wouldn't get hung up about the quality of the data.
That's totally obvious. I just wanted to know what you wanted to do with that first row in the data because you never explicitly stated what you want to do with it. Do you just want to ignore it? This is a fairly easy problem. I just need to know the answer to that question.
And, again, the reason why I want to see your "GetDomain" function is to show you why you don't want to do such a thing in the future especially if you ever have a large amount of data to do this to.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 13, 2015 at 6:34 am
If i could ignore it then great if not i would just filter it out in the where clause once in a table.
The getdomain is a function see below
create function dbo.GetDomain(
@source varchar(1024),
@delimiter varchar(10),
@domain int
) returns varchar(1024) as begin
declare @returnValue varchar(1024)
declare @workingOn int
declare @length int
set @workingOn=0
while @workingOn<@domain begin
set @source=substring(@source,charindex(@delimiter,@source)+1,1024)
set @workingOn+=1
end
set @length=charindex(@delimiter,@source)
set @returnValue=substring(@source,1,case when @length=0 then 1024 else @length-1 end)
return @returnValue
end
GO
November 13, 2015 at 6:50 pm
Ok... first things first. Part of the reason why you've gotten no actual code to help you for the split portion of this problem is because you haven't posted readily consumable test data. Folks like to test their stuff before posting it and I'm no exception. I've converted your test data to readily consumable data to demonstrate with because you're new. I strongly recommend that you read the article at the first link in my signature line below under "Helpful Links" for future posts. You get tested, coded replies that way.
Thank you for posting your GetDomain splitter. As I suspected, it contains a slow While Loop to do the splits, not to mention that it's a Scalar Function rather than a high performance iTVF (Inline Table Valued Function).
Without further ado, take a trip over to the following article and get yourself a copy of the "DelimitedSplit8K" function from the "Resources" section near the bottom of the article. Once you've setup that function, this problem becomes child's play and will only be beat for performance by a CLR in the pre-2012 world.
http://www.sqlservercentral.com/articles/Tally+Table/72993/
Here's the code for a solution. Change column names and the table names to suit you. The result is stored in a Temp Table that's created on-the-fly. It also assumes the "IDx" at the beginning of each row of data is going to be unique. If it's not, there's a fix we can apply for that.
And, yes, I included a test table like you should in the future. You'll get tested answers much more quickly that way.
--===== Create the test table.
-- This is NOT a part of the solution. We're just creating test data here.
IF OBJECT_ID('tempdb..#TestTable','U') IS NOT NULL
DROP TABLE #TestTable
;
SELECT *
INTO #TestTable
FROM (
SELECT '234|2015-6-11|234aa|1' UNION ALL
SELECT 'ID1|a|A||234|ASDFASDF|2015-06-11|A|s||NULL|||||TEST|TEST|TEST|TEST|1543211' UNION ALL
SELECT 'ID2|B|A||234|ASDFASDF|2015-06-11|A|s||NULL|||||TEST|TEST|TEST|TEST|1234'
) d (PipeData)
;
--===== This does the split and repivot as well as creating and populating
-- the desired Temp Table on-the-fly.
SELECT C01 = SUBSTRING(tt.PipeData,1,CHARINDEX('|',tt.PipeData)-1)
,C02 = MAX(CASE WHEN split.ItemNumber = 2 THEN split.Item ELSE '' END)
,C03 = MAX(CASE WHEN split.ItemNumber = 3 THEN split.Item ELSE '' END)
,C04 = MAX(CASE WHEN split.ItemNumber = 4 THEN split.Item ELSE '' END)
,C05 = MAX(CASE WHEN split.ItemNumber = 5 THEN split.Item ELSE '' END)
,C06 = MAX(CASE WHEN split.ItemNumber = 6 THEN split.Item ELSE '' END)
,C07 = MAX(CASE WHEN split.ItemNumber = 7 THEN split.Item ELSE '' END)
,C08 = MAX(CASE WHEN split.ItemNumber = 8 THEN split.Item ELSE '' END)
,C09 = MAX(CASE WHEN split.ItemNumber = 9 THEN split.Item ELSE '' END)
,C10 = MAX(CASE WHEN split.ItemNumber = 10 THEN split.Item ELSE '' END)
,C11 = MAX(CASE WHEN split.ItemNumber = 11 THEN split.Item ELSE '' END)
,C12 = MAX(CASE WHEN split.ItemNumber = 12 THEN split.Item ELSE '' END)
,C13 = MAX(CASE WHEN split.ItemNumber = 13 THEN split.Item ELSE '' END)
,C14 = MAX(CASE WHEN split.ItemNumber = 14 THEN split.Item ELSE '' END)
,C15 = MAX(CASE WHEN split.ItemNumber = 15 THEN split.Item ELSE '' END)
,C16 = MAX(CASE WHEN split.ItemNumber = 16 THEN split.Item ELSE '' END)
,C17 = MAX(CASE WHEN split.ItemNumber = 17 THEN split.Item ELSE '' END)
,C18 = MAX(CASE WHEN split.ItemNumber = 18 THEN split.Item ELSE '' END)
,C19 = MAX(CASE WHEN split.ItemNumber = 19 THEN split.Item ELSE '' END)
,C20 = MAX(CASE WHEN split.ItemNumber = 20 THEN split.Item ELSE '' END)
INTO #SplitResult
FROM #TestTable tt
CROSS APPLY dbo.DelimitedSplit8k(tt.PipeData,'|') split
WHERE tt.PipeData LIKE 'ID%'
GROUP BY SUBSTRING(tt.PipeData,1,CHARINDEX('|',tt.PipeData)-1)
;
--===== Show that our data has been correctly split
SELECT * FROM #SplitResult
;
The "pivot" method that uses MAX(CASE)) that I used is an ancient but very fast method known as a CROSS TAB. You can read about that at the following link.
http://www.sqlservercentral.com/articles/T-SQL/63681/
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply