January 30, 2019 at 7:44 am
Hi
I have values like this in a column which basically is the exact same thing if sorted .
ORM;PS;SUP
ORM;SUP;PS
I want to have it as follows : ORM;
PS;SUP
CREATE TABLE #TEST (ID INT, Column1 VARCHAR(20))
INSERT INTO #TEST ( ID,Column1)
SELECT 1,'ORM;PS;SUP' UNION
SELECT 2,'ORM;SUP;PS'
SELECT * FROM #TEST
DROP TABLE #TEST
January 30, 2019 at 8:25 am
Use a TVF like DelimitedSplit8K to break out the pieces, sort them then reassemble using STUFF.
January 30, 2019 at 8:26 am
You can use SQL Server 2016's STRING_SPLIT function to split the column by semi-colon into multiple rows.
You can then put the string back together in alphabetic order using FOR xml.
January 30, 2019 at 8:45 am
ok. That's what I did. Changed them back to individual rows . Then back to comma separated using STUFF .
Thanks
January 30, 2019 at 9:36 am
(Shows how much I understand about FOR XML... is there a good book on the bare necessities you need to understand about XML in SQL Server?)
January 30, 2019 at 9:56 am
pietlinden - Wednesday, January 30, 2019 9:36 AM(Shows how much I understand about FOR XML... is there a good book on the bare necessities you need to understand about XML in SQL Server?)
I'm not sure about a book, but this could help for basics. https://www.red-gate.com/simple-talk/sql/database-administration/manipulating-xml-data-in-sql-server/
January 30, 2019 at 10:15 am
I've just created a function that will sort a csv column alphabetically:CREATE FUNCTION dbo.SortAlphabetic
(
@Text nvarchar(4000),
@Delimiter nvarchar(10) = ';'
)
RETURNS nvarchar(4000)
AS
BEGIN
SET @Text=@Delimiter+@Text+@Delimiter
DECLARE @ResultVar nvarchar(4000)
;WITH A(A) AS (SELECT '' FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) T(C))
,B AS (SELECT TOP(LEN(@Text)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) N FROM A A,A B,A C)
,C AS (SELECT SUBSTRING(@Text, B.N, LEN(@Delimiter)) delim, B.N FROM B WHERE SUBSTRING(@Text, B.N, LEN(@Delimiter)) = @Delimiter)
,D AS (SELECT LAG(C.N) OVER (ORDER BY C.N) LagN, C.N FROM C)
,E AS (SELECT TOP(1000000000000) SUBSTRING(@Text,D.LagN+LEN(@Delimiter),N-D.LagN-LEN(@Delimiter)) x,* FROM D WHERE D.LagN IS NOT NULL ORDER BY x ASC)
SELECT @ResultVar =STUFF((SELECT @Delimiter + x
FROM E
FOR XML PATH(''),TYPE).value('.','NVARCHAR(4000)'),1,LEN(@Delimiter),'')
RETURN @ResultVar
END
GO
So from your table you could run:SELECT dbo.SortAlphabetic(Column1,';') Sorted, *
FROM #TEST
January 30, 2019 at 11:18 am
This could be a faster alternative if we consider the measurements used on the performance testings for [DelimitedSplit8K] and [DelimitedSplit8K_Lead].
CREATE FUNCTION [dbo].[SortDelimitedStringN4K]
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table” produces values from 0 up to 10,000...
-- enough to cover VARCHAR(8000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "zero base" and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT 0 UNION ALL
SELECT TOP (DATALENGTH(ISNULL(@pString,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT t.N+1
FROM cteTally t
WHERE (SUBSTRING(@pString,t.N,1) = @pDelimiter OR t.N = 0)
)
--===== Do the actual split, sort and concatenate again. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT STUFF( (SELECT @pDelimiter + SUBSTRING(@pString,s.N1,ISNULL(NULLIF((LEAD(s.N1,1,1) OVER (ORDER BY s.N1) - 1),0)-s.N1,4000))
FROM cteStart s
ORDER BY 1
FOR XML PATH(''), TYPE).value('./text()[1]', 'varchar(max)'), 1, 1, '') AS SortedString
;
GO
Note that the way of calling it is a bit different
SELECT sds.SortedString, *
FROM #TEST
CROSS APPLY [dbo].[SortDelimitedStringN4K](Column1,';') sds
January 30, 2019 at 11:38 am
This topic comes up so much, I think MS should "fudge" a little on the relational model and create a type of "char array". Then include not just selection against it, but order by also.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
January 30, 2019 at 11:41 am
ScottPletcher - Wednesday, January 30, 2019 11:38 AMThis topic comes up so much, I think MS should "fudge" a little on the relational model and create a type of "char array". Then include not just selection against it, but order by also.
Or people should just follow 1NF and stop storing these values. Ok, I'm asking too much.
January 30, 2019 at 12:25 pm
Luis Cazares - Wednesday, January 30, 2019 11:41 AMScottPletcher - Wednesday, January 30, 2019 11:38 AMThis topic comes up so much, I think MS should "fudge" a little on the relational model and create a type of "char array". Then include not just selection against it, but order by also.Or people should just follow 1NF and stop storing these values. Ok, I'm asking too much.
It's not just storage though, people insist on using these types of lists for all sorts of tasks/processing, partly because such lists are very helpful for humans. Since that usage is never going to stop, MS might as well fully embrace it. I mean, STRING_SPLIT itself is an acknowledgment of the fact that non-normalized data such as this is used very frequently in SQL.
I keep wishing that developers would allow someone to do actual data normalization before they just slap all the columns together and slap an identity column on them, but that's not going to happen either. And overall, that's vastly more damaging than lists of string values.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
January 30, 2019 at 12:50 pm
Luis Cazares - Wednesday, January 30, 2019 11:18 AMThis could be a faster alternative if we consider the measurements used on the performance testings for [DelimitedSplit8K] and [DelimitedSplit8K_Lead].
CREATE FUNCTION [dbo].[SortDelimitedStringN4K]
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table†produces values from 0 up to 10,000...
-- enough to cover VARCHAR(8000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "zero base" and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT 0 UNION ALL
SELECT TOP (DATALENGTH(ISNULL(@pString,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT t.N+1
FROM cteTally t
WHERE (SUBSTRING(@pString,t.N,1) = @pDelimiter OR t.N = 0)
)
--===== Do the actual split, sort and concatenate again. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT STUFF( (SELECT @pDelimiter + SUBSTRING(@pString,s.N1,ISNULL(NULLIF((LEAD(s.N1,1,1) OVER (ORDER BY s.N1) - 1),0)-s.N1,4000))
FROM cteStart s
ORDER BY 1
FOR XML PATH(''), TYPE).value('./text()[1]', 'varchar(max)'), 1, 1, '') AS SortedString
;
GO
Note that the way of calling it is a bit different
SELECT sds.SortedString, *
FROM #TEST
CROSS APPLY [dbo].[SortDelimitedStringN4K](Column1,';') sds
Yes, an Inline Table-valued Function is much faster than a Scalar-valued Function for this.
February 1, 2019 at 1:46 pm
>> I have values like this in a column which basically is the exact same thing if sorted .
ORM;PS;SUP
ORM;SUP;PS
I want to have it as follows : ORM;PS;SUP <<
Please read any book on relational database theory and SQL. Pay attention to the section on First Normal Form(1NF). The whole relational model is based on scaler values! That means no lists, no arrays, no queues, etc. then on top of not understanding the very foundations of the tool you're trying to use, you post a completely incorrect DDL. Did you know that by definition, a table must have a key? You also did not do two column table correctly. Since both of your columns are nullable there is no way he could ever have a key. There is no such thing as a generic "id"; by the law of identity, it has to be the identifier of something in particular. And identifiers can never be numeric because you don't do any math on them.
CREATE TABLE Foobar
(foo_id CHAR(5) NOT NULL PRIMARY KEY,
non_relational_list VARCHAR(20) NOT NULL);
The insert statement you used at the old Sybase "select-union" syntax which nobody else uses, and it's not part of the ANSI/ISO standards. The ANSI/ISO standard has been this table constructor syntax, with a values clause:
INSERT INTO Foobar
VALUES
('001' ,'ORM;PS;SUP'),
('002', ,'ORM;SUP;PS');
But more than that, why do you think these two lists are equal? You are defining your own language, and there's no reason to call them equal. What you want to use is something like XML, a Non-First Normal Form database (NFNF) or a multivalue database.
The insert statement you used has the old Sybase "select-union" syntax which nobody else uses, and it's not part of the ANSI/ISO standards. The ANSI/ISO standard has been this table constructor syntax, with a values clause:
:INSERT INTO Foobar VALUES('001' ,'ORM;PS;PS;SUP;SUP'),('002', 'ORM;SUP;SUP;PS;PS');
But more than that, why do you think these two are equal? You are defining your own language, and there's no reason to call them equal. What you want to use is something like XML, a Non-First Normal Form database (NFNF) a multivalue database.
Please post DDL and follow ANSI/ISO standards when asking for help.
February 3, 2019 at 8:08 pm
jcelko212 32090 - Friday, February 1, 2019 1:46 PMAnd identifiers can never be numeric because you don't do any math on them.
Good lord. I can't believe that your still on that kick, Joe. I also can't believe that an intelligent person would give that as a reason to not use numerics as identifiers.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 69 total)
You must be logged in to reply to this topic. Login to reply