October 25, 2007 at 5:57 am
any sql heads know a good way to turn a delimited list of values in a single column into a list of columns
e.g. Source Data
[font="Courier New"]
ID, Data
1, a;b;c
2, b
3,
4, e;f;g;h;j;k
5, a;g;
6, h;i;g[/font]
If I have say `10 column max possible items`, I want to turn things into
[font="Courier New"]ID, Col1, Col2, Col3, Col4, Col5, Col6, Col7, Col8, Col9, Col10
1, a , b , c , , , , , , ,
2, b , , , , , , , , ,
3, , , , , , , , , ,
4, e , f , g , h , j , k , , , ,
5, a , g , , , , , , , ,
6, h , i , g , , , , , , ,[/font]
Ugly, I know (and excuse any lost formatting, columns should line up there!)
Before anyone suggests lots of alternative methods of storing the data in the first place, im affraid this is what im stuck with, the requirements can't change.
We have optional information stored as a delimited list in a single field (as in the example Source Data above), but an external company which will process this data require it as `column seperated` to process.
The contents of options may change at any time, hence why there are no hard coded column's for particular values (would need hundreds of columns otherwise). There are a max number of values (hence in my example a possible max of 10 columns).
All processing is to be done directly on the sql server, no other application/technology processing will touch it.
Performance is not an issue as long as processing x0,000 records remains in the seconds rather than minutes ballpark.
Preferably I want things to work so that data is formed into a query result set which will be used as an export source for a data file, and possibly *shiver* for reporting.
My first test was to call a SP for each row, that splits up the data, returning (where required) the results as part of a set series of parameters (see below proof of concept code which does this job - approx 6 seconds to select 210,000 rows of non processed data, and 67 seconds when the processing is undertaken on these records)
But before this I thought id consult the experts for opinions, and (im guessing) some far more elegant suggestions as I don't like the one ive come up with.
many thanks
martin
.........
test code that actually ejects the resulting dataset at the end ok
[font="Courier New"]
DECLARE
@col0 nvarchar(32),
@col1 nvarchar(32),
@col2 nvarchar(32),
@col3 nvarchar(32),
@col4 nvarchar(32),
@col5 nvarchar(32),
@col6 nvarchar(32),
@col7 nvarchar(32),
@col8 nvarchar(32),
@col9 nvarchar(32),
@overflow bit,
@cursor CURSOR,
@pk int,
@list nvarchar(1024)
CREATE TABLE #tmptable (
pk int,
originalrequest nvarchar(1024),
col0 nvarchar(32),
col1 nvarchar(32),
col2 nvarchar(32),
col3 nvarchar(32),
col4 nvarchar(32),
col5 nvarchar(32),
col6 nvarchar(32),
col7 nvarchar(32),
col8 nvarchar(32),
col9 nvarchar(32),
overflow bit
)
SET @cursor = CURSOR
FOR
-- example field with some ; delimited values in it
SELECT LogId, Tag2 FROM [Log] WHERE [Description] = 'Tagged For Export' AND Tag2 Is Not Null
OPEN @cursor
FETCH NEXT FROM @cursor
INTO @pk, @list
WHILE @@FETCH_STATUS = 0
BEGIN
EXEC dbo.fn_List2Columns @list, ';', @col0 OUTPUT, @col1 OUTPUT, @col2 OUTPUT, @col3 OUTPUT, @col4 OUTPUT, @col5 OUTPUT, @col6 OUTPUT, @col7 OUTPUT, @col8 OUTPUT, @col9 OUTPUT, @overflow OUTPUT
INSERT INTO #tmptable
(pk, originalrequest, col0, col1, col2, col3, col4, col5, col6, col7, col8, col9, overflow)
VALUES
(@pk, @list, @col0, @col1, @col2, @col3, @col4, @col5, @col6, @col7, @col8, @col9, @overflow)
FETCH NEXT FROM @cursor
INTO @pk, @list
END
CLOSE @cursor
DEALLOCATE @cursor
SELECT * FROM #tmptable
DROP TABLE #tmptable
[/font]
my data splitting function
[font="Courier New"]CREATE PROCEDURE dbo.fn_List2Columns
@list as nvarchar(1024),
-- delimiter hardcoded as 1 character long
@delimiter as nchar(1),
@col0 nvarchar(32) OUTPUT,
@col1 nvarchar(32) OUTPUT,
@col2 nvarchar(32) OUTPUT,
@col3 nvarchar(32) OUTPUT,
@col4 nvarchar(32) OUTPUT,
@col5 nvarchar(32) OUTPUT,
@col6 nvarchar(32) OUTPUT,
@col7 nvarchar(32) OUTPUT,
@col8 nvarchar(32) OUTPUT,
@col9 nvarchar(32) OUTPUT,
@overflow bit OUTPUT
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
DECLARE
@pos int,
@numfound int,
@val nvarchar(32),
@finished bit
SET @numfound = 0
SET @finished = 0
SET @overflow = 0
SET @col0 = null
SET @col1 = null
SET @col2 = null
SET @col3 = null
SET @col4 = null
SET @col5 = null
SET @col6 = null
SET @col7 = null
SET @col8 = null
SET @col9 = null
-- get first position
SET @pos = CHARINDEX(@delimiter, @list)
WHILE (@finished = 0)
BEGIN
IF (@pos = 0)
BEGIN
-- account for no closing delimiter at end of string, use all remaining characters. For an empty string, the @val check below will trap this
SET @val = @list
SET @finished = 1
END
ELSE
BEGIN
-- grab our next value (remembering to not include any delimiter)
SET @val = LEFT(@list, @pos - 1)
-- remove previous value
SET @list = STUFF(@list, 1, @pos, Null)
END
IF (@val <> '')
BEGIN
IF (@numfound = 0) SET @col0 = @val
ELSE IF (@numfound = 1) SET @col1 = @val
ELSE IF (@numfound = 2) SET @col2 = @val
ELSE IF (@numfound = 3) SET @col3 = @val
ELSE IF (@numfound = 4) SET @col4 = @val
ELSE IF (@numfound = 5) SET @col5 = @val
ELSE IF (@numfound = 6) SET @col6 = @val
ELSE IF (@numfound = 7) SET @col7 = @val
ELSE IF (@numfound = 8) SET @col8 = @val
ELSE IF (@numfound = 9) SET @col9 = @val
ELSE SET @overflow = 1
SET @numfound = @numfound + 1
END
-- grab our next position
SET @pos = CHARINDEX(@delimiter, @list)
END
END[/font]
October 25, 2007 at 6:12 am
Such a BIGGGGGGGG batch process kind of stuff...try out this link!!!!!!!!!!
msdn2.microsoft.com/library/aa372021.aspx
October 25, 2007 at 6:23 am
HI there
Is this the kind of thing you looking for?
--http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=76033
SET NOCOUNT ON
DECLARE @VvcAgencyName VARCHAR(max)
SELECT @VvcAgencyName = 'a b c d e f'
CREATE TABLE #Numbers
(
[number] [int],
CONSTRAINT [Index_Numbers] PRIMARY KEY CLUSTERED
([number] ASC) ON [PRIMARY]
) ON [PRIMARY]
DECLARE @ii INT
SELECT @ii=1
WHILE (@ii<=1000)
BEGIN
INSERT INTO #Numbers(NUMBER)
SELECT @II
SET @II=@II+1
END;
--SET @VvcAgencyName = ',' + @VvcAgencyName
with Result(ROWID,Col)
AS
(
SELECT
ROW_NUMBER() OVER( ORDER BY Number)
,SUBSTRING(@VvcAgencyName+' ', number,
CHARINDEX(' ', @VvcAgencyName+' ', number) - number)
FROM #Numbers
WHERE number <= LEN(@VvcAgencyName)
AND SUBSTRING(' ' + @VvcAgencyName,
number, 1) = ' ')
SELECT
*
FROM Result
PIVOT(MAX(COL)
FOR ROWID in ( [1],[2],[3],[4],[5],[6] )
) as pv
--DROP THE NUMBERS TABLE
DROP TABLE #Numbers
----------------------------------------------
Try to learn something about everything and everything about something. - Thomas Henry Huxley
:w00t:
Posting Best Practices[/url]
Numbers / Tally Tables[/url]
October 25, 2007 at 6:43 am
many thanks guys, I will read and digest!
martin 🙂
October 25, 2007 at 7:16 am
HI All,
Here is another way to create a numbers table.
Prob a good Idea to have a perm one in your db.
SELECT TOP 100000
IDENTITY(INT,1,1) AS N
INTO dbo.Numbers
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
thanks
Chris
----------------------------------------------
Try to learn something about everything and everything about something. - Thomas Henry Huxley
:w00t:
Posting Best Practices[/url]
Numbers / Tally Tables[/url]
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply