Sort already comma separated list

  • Hi

    I have values like this in a column which basically is the exact same thing if sorted .

    ORM;PS;SUP

    ORM;SUP;PS

    I want to have it as follows : ORM;

    PS;SUP

    CREATE TABLE #TEST (ID INT, Column1 VARCHAR(20))

    INSERT INTO #TEST ( ID,Column1)

    SELECT 1,'ORM;PS;SUP' UNION

    SELECT 2,'ORM;SUP;PS'

    SELECT * FROM #TEST

    DROP TABLE #TEST

  • Use a TVF like DelimitedSplit8K to break out the pieces, sort them then reassemble using STUFF.

  • You can use SQL Server 2016's STRING_SPLIT function to split the column by semi-colon into multiple rows.
    You can then put the string back together in alphabetic order using FOR xml.

  • ok. That's what I did. Changed them back to individual rows . Then back to comma separated using STUFF .

    Thanks

  • PSB - Wednesday, January 30, 2019 8:45 AM

    ok. That's what I did. Changed them back to individual rows . Then back to comma separated using STUFF .

    Thanks

    Just to clarify, STUFF doesn't concatenate rows. The concatenation is being done by FOR XML.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • (Shows how much I understand about FOR XML... is there a good book on the bare necessities you need to understand about XML in SQL Server?)

  • pietlinden - Wednesday, January 30, 2019 9:36 AM

    (Shows how much I understand about FOR XML... is there a good book on the bare necessities you need to understand about XML in SQL Server?)

    I'm not sure about a book, but this could help for basics. https://www.red-gate.com/simple-talk/sql/database-administration/manipulating-xml-data-in-sql-server/

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • I've just created a function that will sort a csv column alphabetically:
    CREATE FUNCTION dbo.SortAlphabetic
    (
        @Text nvarchar(4000),
        @Delimiter nvarchar(10) = ';'
    )
    RETURNS nvarchar(4000)
    AS
    BEGIN
        SET @Text=@Delimiter+@Text+@Delimiter
        DECLARE @ResultVar nvarchar(4000)
        ;WITH A(A) AS (SELECT '' FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) T(C))
        ,B AS (SELECT TOP(LEN(@Text)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) N FROM A A,A B,A C)
        ,C AS (SELECT SUBSTRING(@Text, B.N, LEN(@Delimiter)) delim, B.N FROM B WHERE SUBSTRING(@Text, B.N, LEN(@Delimiter)) = @Delimiter)
        ,D AS (SELECT LAG(C.N) OVER (ORDER BY C.N) LagN, C.N FROM C)
        ,E AS (SELECT TOP(1000000000000) SUBSTRING(@Text,D.LagN+LEN(@Delimiter),N-D.LagN-LEN(@Delimiter)) x,* FROM D WHERE D.LagN IS NOT NULL ORDER BY x ASC)
        SELECT @ResultVar =STUFF((SELECT @Delimiter + x
                                    FROM E
                                     FOR XML PATH(''),TYPE).value('.','NVARCHAR(4000)'),1,LEN(@Delimiter),'')
        RETURN @ResultVar
    END
    GO

    So from your table you could run:
    SELECT dbo.SortAlphabetic(Column1,';') Sorted, *
      FROM #TEST

  • This could be a faster alternative if we consider the measurements used on the performance testings for [DelimitedSplit8K] and [DelimitedSplit8K_Lead].

    CREATE FUNCTION [dbo].[SortDelimitedStringN4K]
    --===== Define I/O parameters
       (@pString VARCHAR(8000), @pDelimiter CHAR(1))
    RETURNS TABLE WITH SCHEMABINDING AS
    RETURN
    --===== "Inline" CTE Driven "Tally Table” produces values from 0 up to 10,000...
      -- enough to cover VARCHAR(8000)
    WITH E1(N) AS (
          SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
          SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
          SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
          ),         --10E+1 or 10 rows
       E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
       E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
    cteTally(N) AS (--==== This provides the "zero base" and limits the number of rows right up front
           -- for both a performance gain and prevention of accidental "overruns"
          SELECT 0 UNION ALL
          SELECT TOP (DATALENGTH(ISNULL(@pString,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
          ),
    cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
          SELECT t.N+1
           FROM cteTally t
          WHERE (SUBSTRING(@pString,t.N,1) = @pDelimiter OR t.N = 0)
          )
    --===== Do the actual split, sort and concatenate again. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
    SELECT STUFF( (SELECT @pDelimiter + SUBSTRING(@pString,s.N1,ISNULL(NULLIF((LEAD(s.N1,1,1) OVER (ORDER BY s.N1) - 1),0)-s.N1,4000))
         FROM cteStart s
         ORDER BY 1
         FOR XML PATH(''), TYPE).value('./text()[1]', 'varchar(max)'), 1, 1, '') AS SortedString
    ;
    GO

      

    Note that the way of calling it is a bit different

    SELECT sds.SortedString, *
    FROM #TEST
    CROSS APPLY [dbo].[SortDelimitedStringN4K](Column1,';') sds

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • This topic comes up so much, I think MS should "fudge" a little on the relational model and create a type of "char array".  Then include not just selection against it, but order by also.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher - Wednesday, January 30, 2019 11:38 AM

    This topic comes up so much, I think MS should "fudge" a little on the relational model and create a type of "char array".  Then include not just selection against it, but order by also.

    Or people should just follow 1NF and stop storing these values. Ok, I'm asking too much.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares - Wednesday, January 30, 2019 11:41 AM

    ScottPletcher - Wednesday, January 30, 2019 11:38 AM

    This topic comes up so much, I think MS should "fudge" a little on the relational model and create a type of "char array".  Then include not just selection against it, but order by also.

    Or people should just follow 1NF and stop storing these values. Ok, I'm asking too much.

    It's not just storage though, people insist on using these types of lists for all sorts of tasks/processing, partly because such lists are very helpful for humans.  Since that usage is never going to stop, MS might as well fully embrace it.  I mean, STRING_SPLIT itself is an acknowledgment of the fact that non-normalized data such as this is used very frequently in SQL.

    I keep wishing that developers would allow someone to do actual data normalization before they just slap all the columns together and slap an identity column on them, but that's not going to happen either.  And overall, that's vastly more damaging than lists of string values.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Luis Cazares - Wednesday, January 30, 2019 11:18 AM

    This could be a faster alternative if we consider the measurements used on the performance testings for [DelimitedSplit8K] and [DelimitedSplit8K_Lead].

    CREATE FUNCTION [dbo].[SortDelimitedStringN4K]
    --===== Define I/O parameters
       (@pString VARCHAR(8000), @pDelimiter CHAR(1))
    RETURNS TABLE WITH SCHEMABINDING AS
    RETURN
    --===== "Inline" CTE Driven "Tally Table†produces values from 0 up to 10,000...
      -- enough to cover VARCHAR(8000)
    WITH E1(N) AS (
          SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
          SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
          SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
          ),         --10E+1 or 10 rows
       E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
       E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
    cteTally(N) AS (--==== This provides the "zero base" and limits the number of rows right up front
           -- for both a performance gain and prevention of accidental "overruns"
          SELECT 0 UNION ALL
          SELECT TOP (DATALENGTH(ISNULL(@pString,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
          ),
    cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
          SELECT t.N+1
           FROM cteTally t
          WHERE (SUBSTRING(@pString,t.N,1) = @pDelimiter OR t.N = 0)
          )
    --===== Do the actual split, sort and concatenate again. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
    SELECT STUFF( (SELECT @pDelimiter + SUBSTRING(@pString,s.N1,ISNULL(NULLIF((LEAD(s.N1,1,1) OVER (ORDER BY s.N1) - 1),0)-s.N1,4000))
         FROM cteStart s
         ORDER BY 1
         FOR XML PATH(''), TYPE).value('./text()[1]', 'varchar(max)'), 1, 1, '') AS SortedString
    ;
    GO

      

    Note that the way of calling it is a bit different

    SELECT sds.SortedString, *
    FROM #TEST
    CROSS APPLY [dbo].[SortDelimitedStringN4K](Column1,';') sds

    Yes, an Inline Table-valued Function is much faster than a Scalar-valued Function for this.

  • >> I have values like this in a column which basically is the exact same thing if sorted .
    ORM;PS;SUP
    ORM;SUP;PS
    I want to have it as follows : ORM;PS;SUP <<

    Please read any book on relational database theory and SQL. Pay attention to the section on First Normal Form(1NF). The whole relational model is based on scaler values! That means no lists, no arrays, no queues, etc. then on top of not understanding the very foundations of the tool you're trying to use, you post a completely incorrect DDL. Did you know that by definition, a table must have a key? You also did not do two column table correctly. Since both of your columns are nullable there is no way he could ever have a key. There is no such thing as a generic "id"; by the law of identity, it has to be the identifier of something in particular. And identifiers can never be numeric because you don't do any math on them.

    CREATE TABLE Foobar
    (foo_id CHAR(5) NOT NULL PRIMARY KEY,
    non_relational_list VARCHAR(20) NOT NULL);

    The insert statement you used at the old Sybase "select-union" syntax which nobody else uses, and it's not part of the ANSI/ISO standards. The ANSI/ISO standard has been this table constructor syntax, with a values clause:

    INSERT INTO Foobar
    VALUES
    ('001' ,'ORM;PS;SUP'),
    ('002', ,'ORM;SUP;PS');

    But more than that, why do you think these two lists are equal? You are defining your own language, and there's no reason to call them equal. What you want to use is something like XML, a Non-First Normal Form database (NFNF)  or a multivalue database.

    The insert statement you used has the old Sybase "select-union" syntax which nobody else uses, and it's not part of the ANSI/ISO standards. The ANSI/ISO standard has been this table constructor syntax, with a values clause:
    :INSERT INTO Foobar VALUES('001' ,'ORM;PS;PS;SUP;SUP'),('002', 'ORM;SUP;SUP;PS;PS');

    But more than that, why do you think these two are equal? You are defining your own language, and there's no reason to call them equal. What you want to use is something like XML, a Non-First Normal Form database (NFNF) a multivalue database.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • jcelko212 32090 - Friday, February 1, 2019 1:46 PM

    And identifiers can never be numeric because you don't do any math on them.

    Good lord.  I can't believe that your still on that kick, Joe.  I also can't believe that an intelligent person would give that as a reason to not use numerics as identifiers.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 69 total)

You must be logged in to reply to this topic. Login to reply