January 29, 2013 at 10:44 am
I have the following sample data (original and desired results)
with cteOriginal (CompanyName, CompName1, CompName2, CompName3)
as
(
select 'A V H S L ALQUILER Y VENTA DE HOGARES', NULL, NULL, NULL union all
select 'TAXI GIL PIERRE', NULL, NULL, NULL union all
select 'DE GOUDSMID M HEIJKOOP', NULL, NULL, NULL union all
select 'S P S S', NULL, NULL, NULL union all
select 'BOUCHERIE CHARCUTERIE ST HENRI', NULL, NULL, NULL union all
select 'A A M RODERKERKEN', NULL, NULL, NULL union all
select 'O MUNDO E A NOSSA CASA - C H E', NULL, NULL, NULL)
select * from cteOriginal
;with cteResult (CompanyName, CompName1, CompName2, CompName3)
as
(
select 'A V H S L ALQUILER Y VENTA DE HOGARES', 'AVHS', 'ALQUILER', 'Y' union all
select 'TAXI GIL PIERRE', 'TAXI', 'GIL', 'PIERRE' union all
select 'DE GOUDSMID M HEIJKOOP', 'DE', 'GOUDSMID', 'M' union all
select 'S P S S', 'SPSS', NULL, NULL union all
select 'BOUCHERIE CHARCUTERIE ST HENRI', 'BOUCHERIE', 'CHARCUTERIE', 'ST' union all
select 'A A M RODERKERKEN', 'AAM', 'RODERKERKEN', NULL union all
select 'O MUNDO E A NOSSA CASA - C H E', 'O', 'MUNDO', 'E')
select * from cteResult
I have functions that extract first and second words but this isn't working for records like number 4.
Any clever suggestions?
Thanks.
---------------------------------------------------------
It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens[/url]
Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
January 30, 2013 at 1:31 am
Not exactly clever nor pretty but this might get you close:
with cteOriginal (CompanyName, CompName1, CompName2, CompName3)
as
(
select 'A V H S L ALQUILER Y VENTA DE HOGARES', NULL, NULL, NULL union all
select 'TAXI GIL PIERRE', NULL, NULL, NULL union all
select 'DE GOUDSMID M HEIJKOOP', NULL, NULL, NULL union all
select 'S P S S', NULL, NULL, NULL union all
select 'BOUCHERIE CHARCUTERIE ST HENRI', NULL, NULL, NULL union all
select 'A A M RODERKERKEN', NULL, NULL, NULL union all
select 'O MUNDO E A NOSSA CASA - C H E', NULL, NULL, NULL)
SELECT CompanyName, CompName1, CompName2
,CompName3=LTRIM(LEFT(CompName3, CHARINDEX(' ', CompName3 + ' ')))
FROM (
SELECT CompanyName, CompName1
,CompName2=LTRIM(LEFT(CompName2, CHARINDEX(' ', CompName2 + ' ')))
,CompName3=LTRIM(RIGHT(CompName2, LEN(CompName2)-LEN(LEFT(CompName2, CHARINDEX(' ', CompName2 + ' ')))))
FROM (
SELECT CompanyName, CompName1=REPLACE(CompName1, ' ', '')
,CompName2=LTRIM(SUBSTRING(CompanyName, DATALENGTH(CompName1)+1, LEN(CompanyName)))
FROM (
SELECT CompName1=
CASE WHEN PATINDEX('[A-Z][ ][A-Z][ ][A-Z][ ][A-Z][ ]%', CompanyName + ' ') = 1
THEN LEFT(CompanyName, 8)
WHEN PATINDEX('[A-Z][ ][A-Z][ ][A-Z][ ]%', CompanyName + ' ') = 1
THEN LEFT(CompanyName, 6)
WHEN PATINDEX('[A-Z][ ][A-Z][ ]%', CompanyName + ' ') = 1
THEN LEFT(CompanyName, 4)
WHEN PATINDEX('[A-Z][ ]%', CompanyName + ' ') = 1
THEN LEFT(CompanyName, 2)
ELSE LEFT(CompanyName, CHARINDEX(' ', CompanyName)-1) END
,CompanyName
FROM cteOriginal) a) a) a
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
January 30, 2013 at 3:44 am
Thanks dwain.
You should see what I came up with lol... ii's shockingly bad compared to yours.
Mine has so many IF statments lol!
---------------------------------------------------------
It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens[/url]
Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
January 30, 2013 at 6:10 am
I should have been clearer with my problem definition!
The rules are as follows:
I have a string = 'A B C D EFG ProperWord QR S T'
I want to generate a comma delimited 3 part string with the following rules:
1) Part1 = Concatenate all single chacaters in the string until you reach part of string where two or more characters are joined
So based on the above string Part1 would be ABCD
2) Part2 = EFG
3) Part3 = ProperWord
This is some of the code I'be writing but I just can't get this to work well!
ALTER FUNCTION dbo.fn_NormaliseOrgName(@OrgName NVARCHAR(200))
RETURNS NVARCHAR(200)
AS
BEGIN
declare @OrgName NVARCHAR(200) = 'RA M I R A DI PETTI C & C SAS'
DECLARE @Part1 NVARCHAR(200) = ''
DECLARE @Part2 NVARCHAR(200) = ''
DECLARE @Part3 NVARCHAR(200) = ''
IF PATINDEX('%[A-Z][A-Z]%', @OrgName) > 1
BEGIN
SET @Part1 = LEFT(@OrgName, PATINDEX('%[A-Z][A-Z]%', @OrgName)-1)
IF LTRIM(RTRIM(REPLACE(@OrgName, @Part1, '') = ''
BEGIN
RETURN @Part1 + ',' + @Part2 + ',' + @Part3
END
ELSE
select @Part1 = case when PATINDEX('%[A-Z][A-Z]%', @OrgName) > 1
then left(@OrgName, PATINDEX('%[A-Z][A-Z]%', @OrgName)-1)
else LEFT(@OrgName, patindex('%[ ]%', @OrgName + ' ')-1) end
select @part1
select @Part2 = substring(@OrgName, len(@Part1)+1 , LEN(@OrgName))
select @Part2
--select @Part1
select @Part2 = left( ltrim(substring(@OrgName, len(@Part1)+1 , LEN(@OrgName))), charindex(' ', ltrim(substring(@OrgName, len(@Part1)+1 , LEN(@OrgName))))-1)
select @Part2
--RETURN @Part1
RETURN @Part2
--RETURN @Part3
END
select dbo.fn_NormaliseOrgName('R A M I R A DI PETTI C & C SAS')
---------------------------------------------------------
It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens[/url]
Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
January 30, 2013 at 5:23 pm
Abu,
I think my code will handle the sample you provided in your text, albeit it doesn't join the 3 parts into a comma delimited string (but that's easy).
On the other hand, I am unsure of how to handle the example in your code:
R A M I R A DI PETTI C & C SAS
If your intent is to combine to RAMIRA, you could do that by introducing two additional WHEN clauses at the top of my CASE on PATINDEX.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
January 30, 2013 at 8:00 pm
Hi
This isn't exactly pretty ... but it's another way to skin the cat
;with cteOriginal (CompanyName, CompName1, CompName2, CompName3)
as
(
select 'A V H S L ALQUILER Y VENTA DE HOGARES', NULL, NULL, NULL union all
select 'TAXI GIL PIERRE', NULL, NULL, NULL union all
select 'DE GOUDSMID M HEIJKOOP', NULL, NULL, NULL union all
select 'S P S S', NULL, NULL, NULL union all
select 'BOUCHERIE CHARCUTERIE ST HENRI', NULL, NULL, NULL union all
select 'A A M RODERKERKEN', NULL, NULL, NULL union all
select 'O MUNDO E A NOSSA CASA - C H E', NULL, NULL, NULL)
-- Compress up the first single letters in the string
,preprocess as (
select CompanyName
,stuff(
CompanyName
,1
,isnull(nullif(patindex('% [a-z][a-z]%',substring(companyname,1,8)),0),isnull(nullif(charindex(' ',CompanyName,7),0),999))
,replace(substring(companyname,1,isnull(nullif(patindex('% [a-z][a-z]%',substring(companyname,1,8)),0),isnull(nullif(charindex(' ',CompanyName,7),0),999))),' ','') + ' '
) StuffedCompanyName
from cteOriginal
)
-- extract each company name in turn.
,r1 as (
select CompanyName
,rtrim(substring(StuffedCompanyName,1,isnull(nullif(charindex(' ',StuffedCompanyName),0),999))) Company1
,ltrim(substring(StuffedCompanyName,isnull(nullif(charindex(' ',StuffedCompanyName),0),999),999)) leftover
from preprocess
)
,r2 as (
select CompanyName
,Company1
,nullif(rtrim(substring(leftover,1,isnull(nullif(charindex(' ',leftover),0),999))),'') Company2
,ltrim(substring(leftover,isnull(nullif(charindex(' ',leftover),0),999),999)) leftover
from r1
)
,r3 as (
select CompanyName
,Company1
,Company2
,nullif(rtrim(substring(leftover,1,isnull(nullif(charindex(' ',leftover),0),999))),'') Company3
from r2
)
select * from r3
Micky
January 31, 2013 at 7:13 am
mickyT thanks so much for this! I forgot to thank you in my other thread!
Your help is much appreciated!
---------------------------------------------------------
It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens[/url]
Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
January 31, 2013 at 11:10 am
No problem and you're welcome
February 1, 2013 at 9:03 pm
It took a little effort, but here's a replace function that will do the job and more. Won't swear it can't be improved upon or what kind of performance curve it may have. For this small example it's quick. But consider it nothing more than a first draft. If you have thoughts of using it in a production environment you should certainly do some stress testing since it's a scalar function. If you are going to use it (for example) to just import data once a day then it should work just fine. I just wouldn't stick it in the SELECT statement of a heavily-used query in production without testing it thoroughly.
EDIT: I read your notes above after posting this. The function should work fine as it is, but you might need to rearrange a few columns in the pivot table query to get exactly what you need.
CREATE FUNCTION dbo.svfSplitReplaceByDelimiter
(
@STR VARCHAR(8000)
,@delimiter CHAR(1)
,@subchar1 CHAR(1) = '@'
,@subchar2 CHAR(1) = '~'
,@replaceval VARCHAR(50) = '|'
)
RETURNS VARCHAR(MAX)
AS
BEGIN
/*
Parameters:
@STR VARCHAR(8000) -- the input string
,@delimiter CHAR(1) -- the delimiter to use for the split
,@subchar1 CHAR(1) = '@' -- pick a character not in the input string (internal use)
,@subchar2 CHAR(1) = '~' -- pick a character not in the input string (internal use)
,@replaceval VARCHAR(50) = '|' -- the delimiter used for the final array; can be up to 50 chars
per tag and can be useful for building XML or HTML strings
*/
DECLARE
@pos INT
,@len INT
,@strRep VARCHAR(8000)
,@strTemp VARCHAR(8000)
,@strRFrag VARCHAR(50)
,@strOut VARCHAR(8000)
,@counter INT
SET @STR = RTRIM(LTRIM(@str))
SET @strRep = REPLACE(@str,@delimiter,@subchar1)
SET @strTemp = @strRep
SET @strOut = ''
--get the last chunk for use later
SET @pos = CHARINDEX(@subchar1,REVERSE(@strTemp))-1
SET @strRFrag = RIGHT(@strTemp,@pos)
--start looking for single characters
--and if found aggregate them
SET @counter = 1
SET @pos = CHARINDEX(@subchar1,@strRep)
SET @len = LEN(@strTemp)
WHILE @counter <= @len
BEGIN
SET @strTemp = LEFT(@strRep,@pos)
IF RIGHT(@strTemp,1) = @subchar1
BEGIN
IF @pos = 2
SET @strTemp = REPLACE(@strTemp,@subchar1,@subchar2)
ELSE
SET @strTemp = REPLACE(@strTemp,@subchar1,@delimiter)
IF @pos > 2
SET @strOut = @strOut + @delimiter + @strTemp
ELSE
SET @strOut = @strOut + @strTemp
IF RIGHT(@strOut,1) = @subchar2
SET @strOut = LEFT(@strOut,LEN(@strOut)-1)
SET @pos = CHARINDEX(@subchar1,@strRep)
SET @strRep = RIGHT(@strRep,LEN(@strRep)-@pos)
END
SET @pos = CHARINDEX(@subchar1,@strRep)
SET @counter = @counter + 1
END
IF LEN(@strRFrag) = 1
SET @strOut = @strOut + @strRFrag
ELSE
SET @strOut = @strOut + @delimiter + @strRFrag
SET @strOut = REPLACE(REPLACE(@strOut,@delimiter+@delimiter,@delimiter),' ',' ')
SET @strOut = REPLACE(REPLACE(@strOut,@delimiter,@replaceval),' ',' ')
IF LEFT(@strOut,1) = @delimiter
SET @strOut = RIGHT(@strOut,LEN(@strOut)-1)
IF RIGHT(@strOut,1) = @delimiter
SET @strOut = LEFT(@strOut,LEN(@strOut)-1)
SET @strOut = RTRIM(LTRIM(@strOut))
RETURN @strOut
END
GO
Now load the sample data and test the function. (Requires the function DelimitedSplit8K which can be found by searching this site. It's been posted so many times and it's such an indispensable function that I figure most people already have it.)
IF OBJECT_ID('tempdb..#TempTable') IS NOT NULL
DROP TABLE #TempTable
CREATE TABLE #TempTable (
[ID] INT IDENTITY(1,1) NOT NULL,
[OrigCompanyName] NVARCHAR(150) NULL,
[CompanyName] NVARCHAR(150) NULL,
PRIMARY KEY (ID))
;WITH cteOriginal(CompanyName,CompName1,CompName2,CompName3)
AS (
SELECT
'A V H S L ALQUILER Y VENTA DE HOGARES'
,NULL
,NULL
,NULL
UNION ALL
SELECT
'TAXI GIL PIERRE'
,NULL
,NULL
,NULL
UNION ALL
SELECT
'DE GOUDSMID M HEIJKOOP'
,NULL
,NULL
,NULL
UNION ALL
SELECT
'S P S S'
,NULL
,NULL
,NULL
UNION ALL
SELECT
'BOUCHERIE CHARCUTERIE ST HENRI'
,NULL
,NULL
,NULL
UNION ALL
SELECT
'A A M RODERKERKEN'
,NULL
,NULL
,NULL
UNION ALL
SELECT
'O MUNDO E A NOSSA CASA - C H E'
,NULL
,NULL
,NULL
)
INSERT INTO #TempTable
SELECT
CompanyName AS OrigCompanyName
,(SELECT dbo.svfSplitReplaceByDelimiter(cte.CompanyName,' ','@','~','|')) AS CompanyName
FROM
cteOriginal AS cte
--get rid of stray leading and trailing characters
UPDATE #TempTable
SET CompanyName =
(CASE
WHEN LEFT(CompanyName,1) = '|' THEN RIGHT(CompanyName,LEN(CompanyName)-1)
WHEN RIGHT(CompanyName,1) = '|' THEN LEFT(CompanyName,LEN(CompanyName)-1)
ELSE CompanyName
END)
--create the pivot table result
SELECT
r1.ID
,(SELECT OrigCompanyName FROM #TempTable WHERE ID = r1.ID) AS CompanyName
,r1.Part1
,r1.Part2
,r1.Part3
,r1.Part4
,r1.Part5
,r1.Part6
FROM
(
SELECT
r.ID
,MAX(CASE WHEN r.ItemNumber = 1 THEN r.Item END) AS Part1
,MAX(CASE WHEN r.ItemNumber = 2 THEN r.Item END) AS Part2
,MAX(CASE WHEN r.ItemNumber = 3 THEN r.Item END) AS Part3
,MAX(CASE WHEN r.ItemNumber = 4 THEN r.Item END) AS Part4
,MAX(CASE WHEN r.ItemNumber = 5 THEN r.Item END) AS Part5
,MAX(CASE WHEN r.ItemNumber = 6 THEN r.Item END) AS Part6
FROM
(
SELECT
ID
,ItemNumber
,Item
FROM
#TempTable AS tt
CROSS APPLY
dbo.DelimitedSplit8K(tt.CompanyName,'|') AS dsk1
WHERE
dsk1.Item <> ''
) r
GROUP BY
r.ID
) r1
The output:
IDCompanyNamePart1Part2Part3Part4Part5Part6
1A V H S L ALQUILER Y VENTA DE HOGARESAVHSLALQUILERYVENTADEHOGARES
2TAXI GIL PIERRETAXIGILPIERRENULLNULLNULL
3DE GOUDSMID M HEIJKOOPDEGOUDSMIDMHEIJKOOPNULLNULL
4S P S SSPSSNULLNULLNULLNULLNULL
5BOUCHERIE CHARCUTERIE ST HENRIBOUCHERIECHARCUTERIESTHENRINULLNULL
6A A M RODERKERKENAAMRODERKERKENNULLNULLNULLNULL
7O MUNDO E A NOSSA CASA - C H EOMUNDOEANOSSACASA-CHE
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply