February 6, 2013 at 8:34 am
Hi geniuses,
I need to catch some data from a column named AllLevels, that contains, in each line:
Analysis»Letters»Numbers»Detail
I want to be able to separate the data, in order to organize it this way:
LevelI LevelIILevelIIILevelIV
Analysis LettersNumbers Detail
Thanks in advance.
Regards
February 6, 2013 at 8:55 am
Can you post ddl and sample data. See the first link in my signature.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
February 6, 2013 at 9:06 am
I'm creating the query from scratch, can't really expose data.
Can you start based on that? This is mostly a query problem.
AllLevels
Analysis»Letters»Numbers»Detail
Description»Letters»Numbers»Detail
Planning»Letters»Symbols»Detail
Thanks in advance.
Regards
February 6, 2013 at 9:12 am
As always the case with matching/parsing tasks, it's helpful to actually see the real data to be able to offer a meaningful solution/advise.
If you cannot expose sensitive data, can you make an effort and generate a couple [or more] of bogus records that are similar to the original ones in complexity and pattern?
February 6, 2013 at 9:17 am
are there deeper than 4 levels in the "real" data? are there always exactly four levels?
the right solution depends a lot on some assumptions we can make with the data
here's just one example, using PARSENAME, which expects a maximum of four sections:
With MySampleData (AllLevels)
AS
(
SELECT 'Analysis»Letters»Numbers»Detail' UNION ALL
SELECT 'Description»Letters»Numbers»Detail' UNION ALL
SELECT 'Planning»Letters»Symbols»Detail'
)
SELECT
PARSENAME(REPLACE(AllLevels,'»','.'),4) As Level1,
PARSENAME(REPLACE(AllLevels,'»','.'),3) As Level2,
PARSENAME(REPLACE(AllLevels,'»','.'),2) As Level3,
PARSENAME(REPLACE(AllLevels,'»','.'),1) As Level4
FROM MySampleData
other possibilities are using a custom function to do the same, or the DelimitedSplit8K function and a PIVOT to get the data in the desired format.
here, a scalar function seems to be an easier implementation for me, i think.
Lowell
February 6, 2013 at 9:23 am
You know I was about to reply with a query to parse the string out but then noticed this in Sean's signature.
Need to split a string? Try Jeff Moden's splitter[/url].
I would read that and try out the function he has near the bottom. It will split your string into multiple rows. Use cross apply to join it to your table, then you can group the different rows back together to give you your columns.
Or you can write some parsing code using LEFT, SUBSTRING and CHARINDEX. Even assuming that all 4 levels will always be there it isn't exactly trivial (at least not for me.)
Here is a start to it. You can see using Jeff's splitter might be easier though.
declare @temp varchar(250)
SET @temp = 'Analysis»Letters»Numbers»Detail'
SELECT LEFT(@temp,charindex('»',@temp)-1),
SUBSTRING(@temp, charindex('»',@temp)+1, charindex('»',@temp,charindex('»',@temp)+1)-charindex('»',@temp)-1)
Kenneth FisherI was once offered a wizards hat but it got in the way of my dunce cap.--------------------------------------------------------------------------------For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/[/url]Link to my Blog Post --> www.SQLStudies.com[/url]
February 6, 2013 at 9:28 am
It is very easy to create a small script that creates the table and insert the data into the table. Since it does save some time from anyone that tries to help you, you should include in your question such a script. Here is a small script that creates the table and inserts data:
create table Demo (vc varchar(200))
go
insert into Demo (vc) values ('Analysis»Letters»Numbers»Detail')
insert into Demo (vc) values ('Description»Letters»Numbers»Detail')
insert into Demo (vc) values ('Planning»Letters»Symbols»Detail')
If you don't have dots inside the strings, you can use the function parsename. This function is intended to parse full name of objects that are referenced in the database and give the name of the server or database or schema or object. Since those parts are separated by period, the code bellow will work if there is no period inside your strings. If you can have periods inside the strings, you'll have to modify it.
select parsename(replace(vc,'»','.'),4), parsename(replace(vc,'»','.'),3), parsename(replace(vc,'»','.'),2), parsename(replace(vc,'»','.'),1)
from Demo
Adi
--------------------------------------------------------------
To know how to ask questions and increase the chances of getting asnwers:
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
February 6, 2013 at 10:06 am
Thanks Guys.
Lowell, yes there are more levels indeed.
Regards
February 6, 2013 at 10:09 am
davdam8 (2/6/2013)
Thanks Guys.Lowell, yes there are more levels indeed.
Regards
Then I would suggest you look at the article in my signature about splitting strings. Then look at the articles in my signature about cross tabs and pivots. The techniques found in those will get what you need.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
February 7, 2013 at 2:59 am
I worked it out. This is the query that would get you the results you want:
--Creating Table
Create Table Ex1
(
Id Int Identity(1,1),
AllLevels Varchar(MAX)
)
--Inserting Sample Data
Insert Into Ex1
Select 'Analysis»Letters»Numbers»Detail'
Union ALL
Select 'Description»Letters»Numbers»Detail'
Union ALL
Select 'Planning»Letters»Symbols»Detail'
--Query for your requirement
Select Id,
MAX(Case When rn = 'Level1' Then df Else '' End) As Level1,
MAX(Case When rn = 'Level2' Then df Else '' End) As Level2,
MAX(Case When rn = 'Level3' Then df Else '' End) As Level3,
MAX(Case When rn = 'Level4' Then df Else '' End) As Level4
From
(
Select Id, AllLevels, a.df, 'Level' + Cast(ROW_NUMBER() Over(Partition By Id Order By Id) As Varchar) As rn From Ex1 As b
Cross Apply
(
SELECT SUBSTRING('»' + b.AllLevels + '»',N+1,CHARINDEX('»','»' + b.AllLevels + '»',N+1)-N-1) As df
FROM dbo.Tally As a
WHERE a.N < LEN('»' + b.AllLevels + '»')
AND SUBSTRING('»' + b.AllLevels + '»',N,1) = '»'
) a
) As p
Group By Id
But, it takes a 8 looooong seconds to get the result for this teeeny sample data. So, for tables with huge amounts of data this would definitely not be a good one.
This means that your design is not good. Inserting Delimited strings and pivoting them is not what SQL Server is meant to do.
You should rather do it with a reporting tool.
This is what I could come up with...may be someone might come up with something better.
February 7, 2013 at 3:30 am
You can switch the Cross Apply in the query with an Outer Apply, that would reduce the time by more than 70%, as follows :
Select Id,
MAX(Case When rn = 'Level1' Then df Else '' End) As Level1,
MAX(Case When rn = 'Level2' Then df Else '' End) As Level2,
MAX(Case When rn = 'Level3' Then df Else '' End) As Level3,
MAX(Case When rn = 'Level4' Then df Else '' End) As Level4
From
(
Select Id, AllLevels, a.df, 'Level' + Cast(ROW_NUMBER() Over(Partition By Id Order By Id) As Varchar) As rn From Ex1 As b
Outer Apply
(
SELECT SUBSTRING('»' + b.AllLevels + '»',N+1,CHARINDEX('»','»' + b.AllLevels + '»',N+1)-N-1) As df
FROM dbo.Tally As a
WHERE a.N < LEN('»' + b.AllLevels + '»')
AND SUBSTRING('»' + b.AllLevels + '»',N,1) = '»'
) a
) As p
Group By Id
February 7, 2013 at 3:44 am
Even Better.......if you create a Non Clustered index On the Tally table then you get the results even fatser.
Create NonClustered Index NCLIX_Tally On Tally(n)
Select Id,
MAX(Case When rn = 'Level1' Then df Else '' End) As Level1,
MAX(Case When rn = 'Level2' Then df Else '' End) As Level2,
MAX(Case When rn = 'Level3' Then df Else '' End) As Level3,
MAX(Case When rn = 'Level4' Then df Else '' End) As Level4
From
(
Select Id, AllLevels, a.df, 'Level' + Cast(ROW_NUMBER() Over(Partition By Id Order By Id) As Varchar) As rn From Ex1 As b
Outer Apply
(
SELECT SUBSTRING('»' + b.AllLevels + '»',N+1,CHARINDEX('»','»' + b.AllLevels + '»',N+1)-N-1) As df
FROM dbo.Tally As a
WHERE a.N < LEN('»' + b.AllLevels + '»')
AND SUBSTRING('»' + b.AllLevels + '»',N,1) = '»'
) a
) As p
Group By Id
The time and IO of the query after adding the index is as follows:
Table 'Tally'. Scan count 3, logical reads 12, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Ex1'. Scan count 1, logical reads 1, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 119 ms.
If you don't know what a Tally table is then you need to check the following link. It shows how to create a tally Table and how to split a string using it.
Jeoff Moden's Splitter using Tally Table[/url]
Hope this helps. 🙂
February 7, 2013 at 5:43 am
ok, still sticking with a scalar function, here's a parsename enhancement I slapped together a year and a half ago; it takes an unlimited number of elements:
the original PARSENAME thinks of elements as going right to left...that is the first element is the right most;
my example is the opposite... the leftmost substring is 1, then 2 etc., which actually fits better with this requirement.
see the thread here if you want the details:
http://www.sqlservercentral.com/Forums/Topic1176642-1292-1.aspx#bm1176804
also see this thread: dwainc posted a solution using the DelimitedSplit8K solution, and his solution is roughly eight times faster on a million rows of data:
http://www.sqlservercentral.com/Forums/Topic1372793-1292-1.aspx
With MySampleData (AllLevels)
AS
(
SELECT 'Analysis»Letters»Numbers»Detail' UNION ALL
SELECT 'Description»Letters»Numbers»Detail' UNION ALL
SELECT 'Planning»Letters»Symbols»Detail' UNION ALL
SELECT 'Analysis»Letters»Numbers»Detail»SubDetail' UNION ALL
SELECT 'Description»Letters»Numbers»Detail»Version»Major' UNION ALL
SELECT 'Planning»Letters»Symbols»Detail»Version»Major»Minor»Revision»SubDetail»'
)
SELECT
dbo.fn_parsename(AllLevels,'»',1) As Level1,
dbo.fn_parsename(AllLevels,'»',2) As Level2,
dbo.fn_parsename(AllLevels,'»',3) As Level3,
dbo.fn_parsename(AllLevels,'»',4) As Level4,
dbo.fn_parsename(AllLevels,'»',5) As Level5,
dbo.fn_parsename(AllLevels,'»',6) As Level6,
dbo.fn_parsename(AllLevels,'»',7) As Level7,
dbo.fn_parsename(AllLevels,'»',8) As Level8,
dbo.fn_parsename(AllLevels,'»',9) As Level9
FROM MySampleData
and here's my version for you to enhance or change if you want:
CREATE FUNCTION dbo.fn_parsename
(
@pString VARCHAR(7999),
@pDelimiter CHAR(1),
@Occurrance int
)
RETURNS VARCHAR(8000)
AS
BEGIN
DECLARE @Results VARCHAR(8000)
--===== "Inline" CTE Driven "Tally Table” produces values up to
-- 10,000... enough to cover VARCHAR(8000)
;WITH
E1(N) AS ( --=== Create Ten 1's
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 --10
),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000
cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT N)) FROM E4) ,
--===== Do the split
InterResults
AS
(
SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,
SUBSTRING(@pString, N, CHARINDEX(@pDelimiter, @pString + @pDelimiter, N) - N) AS Item
FROM cteTally
WHERE N < LEN(@pString) + 2
AND SUBSTRING(@pDelimiter + @pString, N, 1) = @pDelimiter
)
SELECT @Results = Item FROM InterResults WHERE ItemNumber = @Occurrance
return @Results
END --FUNCTION
GO
Lowell
February 7, 2013 at 6:25 am
vinu512 (2/7/2013)
I worked it out. This is the query that would get you the results you want:
--Creating Table
Create Table Ex1
(
Id Int Identity(1,1),
AllLevels Varchar(MAX)
)
--Inserting Sample Data
Insert Into Ex1
Select 'Analysis»Letters»Numbers»Detail'
Union ALL
Select 'Description»Letters»Numbers»Detail'
Union ALL
Select 'Planning»Letters»Symbols»Detail'
--Query for your requirement
Select Id,
MAX(Case When rn = 'Level1' Then df Else '' End) As Level1,
MAX(Case When rn = 'Level2' Then df Else '' End) As Level2,
MAX(Case When rn = 'Level3' Then df Else '' End) As Level3,
MAX(Case When rn = 'Level4' Then df Else '' End) As Level4
From
(
Select Id, AllLevels, a.df, 'Level' + Cast(ROW_NUMBER() Over(Partition By Id Order By Id) As Varchar) As rn From Ex1 As b
Cross Apply
(
SELECT SUBSTRING('»' + b.AllLevels + '»',N+1,CHARINDEX('»','»' + b.AllLevels + '»',N+1)-N-1) As df
FROM dbo.Tally As a
WHERE a.N < LEN('»' + b.AllLevels + '»')
AND SUBSTRING('»' + b.AllLevels + '»',N,1) = '»'
) a
) As p
Group By Id
But, it takes a 8 looooong seconds to get the result for this teeeny sample data. So, for tables with huge amounts of data this would definitely not be a good one.
This means that your design is not good. Inserting Delimited strings and pivoting them is not what SQL Server is meant to do.
You should rather do it with a reporting tool.
This is what I could come up with...may be someone might come up with something better.
Sooooo close Vinu! Just a few little tweaks and it runs like a rocket:
--Query for your requirement
SELECT b.*, x.*
FROM #Ex1 As b
CROSS APPLY ( -- x
SELECT
MAX(Case When rn = 'Level1' Then df Else '' End) As Level1,
MAX(Case When rn = 'Level2' Then df Else '' End) As Level2,
MAX(Case When rn = 'Level3' Then df Else '' End) As Level3,
MAX(Case When rn = 'Level4' Then df Else '' End) As Level4
FROM ( -- p
SELECT
a.df,
rn = 'Level' + CAST(ROW_NUMBER() OVER(Partition By Id Order By Id) As Varchar)
FROM ( -- a
SELECT df = LEFT(b.AllLevels, CHARINDEX('»',b.AllLevels,1)-1)
UNION ALL
SELECT df = SUBSTRING(b.AllLevels,N+1,
ISNULL(NULLIF(CHARINDEX('»',b.AllLevels,n+1),0)-(n+1),8000))
FROM dbo.Tally As a
WHERE a.N < = DATALENGTH(b.AllLevels)
AND SUBSTRING(b.AllLevels,N,1) = '»'
) a
) p
) x
ORDER BY Id
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
February 7, 2013 at 7:36 am
How about using the DelimitedSplit8k? This seems to be super fast to me.
Select Id,
MAX(Case When ItemNumber = 1 Then Item Else '' End) As Level1,
MAX(Case When ItemNumber = 2 Then Item Else '' End) As Level2,
MAX(Case When ItemNumber = 3 Then Item Else '' End) As Level3,
MAX(Case When ItemNumber = 4 Then Item Else '' End) As Level4
From
(
select *
from Ex1
cross apply dbo.DelimitedSplit8K(ex1.AllLevels, '»')
) As p
Group By Id
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
Viewing 15 posts - 1 through 15 (of 21 total)
You must be logged in to reply to this topic. Login to reply