Pulling my hair out! Multivalued column with key/value pairs to multiple rows.

  • The following SQL pulls data from a multiple-valued column, in which the data represents key/value pairs.

     

    SELECT resultid as Key_, Pos as rank, substring(C,2,charindex(',',C,2)-2)  AS Data

    FROM (SELECT resultid, substring(CSV.CriteriaData, ITER.Pos, LEN(CSV.CriteriaData)) AS C, Pos FROM

    (SELECT ',' + REPLACE(n.criteriadata, Char(10),',') + ',' AS CriteriaData,

                studentid+'-'+moduleid+'-'+n.assessid as resultid

    FROM nstuResults n where setid ='2006-07' and examid ='Exam_Name' and linenum = 0 and n.studentid = '123456' and moduleid ='Sub_ID') CSV,

    (SELECT [id] AS Pos

    FROM Tally) ITER

    WHERE ITER.Pos <= LEN(CSV.CriteriaData)) AS X

    WHERE (((Len(C))>1) AND ((substring(C,1,1))= ','))

     

    It generates a results set like this…

     

    The rank column merely ensures the correct order is maintained.  This is important because the data actually represents key/value pairs

     

    Key_

    Rank

    Data

    123456-Sub_ID-Exam_Name

    1

    330

    123456-Sub_ID-Exam_Name

    5

    3

    123456-Sub_ID-Exam_Name

    7

    415

    123456-Sub_ID-Exam_Name

    11

    U

    123456-Sub_ID-Exam_Name

    13

    423

    123456-Sub_ID-Exam_Name

    17

    32

    123456-Sub_ID-Exam_Name

    20

    424

    123456-Sub_ID-Exam_Name

    24

    44

    123456-Sub_ID-Exam_Name

    27

    420

    123456-Sub_ID-Exam_Name

    31

    0

    123456-Sub_ID-Exam_Name

    33

    416

    123456-Sub_ID-Exam_Name

    37

    BB

     

    What I actually want is something like this.

     

    Key_

    Key

    Value

    123456-Sub_ID-Exam_Name

    330

    3

    123456-Sub_ID-Exam_Name

    415

    U

    123456-Sub_ID-Exam_Name

    423

    32

    123456-Sub_ID-Exam_Name

    424

    44

    123456-Sub_ID-Exam_Name

    420

    0

    123456-Sub_ID-Exam_Name

    416

    BB

    But have been going around a circle all afternoon.

    All suggestions welcome.

     

  • I know there might be a better way to do this but try this.

    Assumtion:KV is ur resultset that u have.

    SELECT identity(int,1,1) as Iden,* into #tmp from  KV order by Rank

    Desired Result:

    select ODD.* ,(Select Top 1 Data from #tmp EVEN where EVEN.Iden % 2 = 0 and ODD.Iden < EVEN.Iden ) 

     from #tmp ODD

    where ODD.Iden % 2 = 1

    Thanks

    Sreejith

  • Nice solution! 

    DECLARE @nstuResults TABLE( Key_ varchar(25),

                                                      Rank integer,

                                                      Data varchar(10))

    INSERT INTO @nstuResults

    SELECT '123456-Sub_ID-Exam_Name', 1, '330'

    UNION ALL

    SELECT '123456-Sub_ID-Exam_Name', 5, '3'

    UNION ALL

    SELECT '123456-Sub_ID-Exam_Name', 7, '415'

    UNION ALL

    SELECT '123456-Sub_ID-Exam_Name', 11, 'U'

    UNION ALL

    SELECT '123456-Sub_ID-Exam_Name', 13, '423'

    UNION ALL

    SELECT '123456-Sub_ID-Exam_Name', 17, '32'

    UNION ALL

    SELECT '123456-Sub_ID-Exam_Name', 20, '424'

    UNION ALL

    SELECT '123456-Sub_ID-Exam_Name', 24, '44'

    UNION ALL

    SELECT '123456-Sub_ID-Exam_Name', 27, '420'

    UNION ALL

    SELECT '123456-Sub_ID-Exam_Name', 31, '0'

    UNION ALL

    SELECT '123456-Sub_ID-Exam_Name', 33, '416'

    UNION ALL

    SELECT '123456-Sub_ID-Exam_Name', 37, 'BB'

    SELECT IDENTITY(int,1,1) AS Iden,* INTO #tmp FROM @nstuResults ORDER BY Rank

    SELECT ODD.*, (SELECT TOP 1 Data FROM #tmp EVEN WHERE EVEN.Iden % 2 = 0 AND ODD.Iden < EVEN.Iden ) 

    FROM #tmp ODD

    WHERE ODD.Iden % 2 = 1

    DROP TABLE #tmp

    I wasn't born stupid - I had to study.

  • Guys, thanks.  I can't test on the live db but Farrell's 'virtual data' with the solution works perfectly, a little sluggish on the machine at home but I guess it'll run fine at work.

  • I would be inclined to:

    1. add OPTION (MAXDOP 1) to Farrell's query, as it may be safer.

    2. create the temp table/variable first, to reduce schema locking.

    Also, why not use the LineFeed as the separator instead of converting it to a comma first? Something like:

    SELECT n.studentid + '-' + n.moduleid + '-' + n.assessid as resultid

     ,T.[id] AS Rank

     ,SUBSTRING(CHAR(10) + n.criteriadata + CHAR(10)

      ,T.[id] + 1

      ,CHARINDEX(CHAR(10), CHAR(10) + n.criteriadata + CHAR(10), T.[id] + 1) - T.[id] - 1) as Data

    FROM nstuResults n

     CROSS JOIN Tally T

    WHERE  T.[id] < LEN(CHAR(10) + n.criteriadata + CHAR(10)) - 1

      and SUBSTRING(CHAR(10) + n.criteriadata + CHAR(10), T.[id], 1) = CHAR(10)

      and setid = '2006-07'

      and examid = 'Exam_Name'

      and linenum = 0

      and n.studentid = '123456'

      and moduleid = 'Sub_ID'

  • Coupla questions Ken,

    What benefits does the option at point 1 above do for me?

    And, how did you go about simplifying the underlying sql statement itself.  I'm kinda learning at this game and managed to decipher the original code from the SQL Cookbook. 

    I will not actually be able to test the code until tomorrow but it looks so muche cleaner and readable.  I suspect the removal of the call to Replace() will safe some overhead which was unnecessary, too much of strictly applying what I read without thinking it through I guess.

    Cheers

  • This is were this thread has ended for anyone who was following...

    Declare

    @SetID varchar(10), @ExamID varchar(10)

    Select @Setid = '2006-07', @ExamID = 'Y11Exam'

    SELECT

    IDENTITY(int,1,1) AS Iden,*

    INTO #tmp

    FROM (SELECT t1.studentid + '-' + t1.moduleid + '-' + t1.assessid as resultid

       , n.Number AS Rank

        ,SUBSTRING(CHAR(10) + t1.criteriadata+t2.criteriadata

         + CHAR(10) , n.Number + 1 ,CHARINDEX(CHAR(10), CHAR10) 

          + t1.criteriadata+t2.criteriadata + CHAR(10), n.Number + 1) -

           n.number - 1) as Data

    FROM nstuResults t1

         left join nsturesults t2 on

              t2.linenum = 1

              and t1.setid = t2.setid

              and t1.moduleid = t2.moduleid

              and t1.groupid = t2.groupid

              and t1.examid = t2.examid

              and t1.assessid = t2.assessid

              and t1.studentid = t2.studentid

         cross join Numbers N

    WHERE N.Number < LEN(CHAR(10) + 

         t1.criteriadata+t2.criteriadata + CHAR(10)) - 1

         and SUBSTRING(CHAR(10) + t1.criteriadata+t2.criteriadata 

              + CHAR(10), n.number, 1) = CHAR(10)

         and t1.setid = @SetID

         and t1.examid = @ExamID

         and t1.linenum = 0) x

    ORDER BY ResultID, Rank

    SELECT

    ODD.ResultID, ODD.Data AS Key_Field, 

         (SELECT TOP 1 Data 

         FROM #tmp EVEN 

         WHERE EVEN.Iden % 2 =

              AND ODD.Iden < EVEN.Iden ) AS Value_Field

    FROM #tmp ODD

    WHERE ODD.Iden % 2 = 1

    DROP TABLE #tmp

    Thanks to those who offered assistance and insight!

  • 1. As you LEFT JOIN to t2, you need an ISNULL around any mention of t2.criteriadata.

    2. Can linenum ever exceed 1?

    3. Can key/value pairs ever cross a linenum boundary?

     

  • Hi Ken

    1 - I noticed these in an earlier post on another thread, can you explain their importance?  is it a case of 'x'+NULL = NULL ???

    2 - Yes it does, but only under limited circumstances, and not in this particular scenario.  In another table it goes as high as 5.

    3 - Key/value pairs do cross boundaries, this is why I need to concatenate the criteriadata from each linenum.

     

    regards

    Carl

  • 1. Yes:

    'X' + NULL = NULL

    while

    'X' + ISNULL(NULL, '') = 'X'

    or

    'X' + COALESCE(NULL, '') = 'X'

    etc

    2. OK

    3. Ouch. Bad design on top of bad design!

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply