Transform Data -- Sequential to 3 Groups of Columns

  • I don't think this is a PIVOT but I still can't figure it out.

    I have data (varying numbers of employees in each of HI, MED, LOW performance).

    Like this

    DECLARE @a TABLE (

    Employee# int, EmployeeName varchar(50), Perf varchar(50), Predictor int)

    INSERT @a

    SELECT 1111, 'JOE', 'H', 95 UNION ALL

    SELECT 2222, 'MARY', 'M', 76 UNION ALL

    SELECT 3333, 'SAM', 'H', 96 UNION ALL

    SELECT 4444, 'BILL', 'L', 92 UNION ALL

    SELECT 5555, 'GARY', 'M', 77 UNION ALL

    SELECT 6666, 'JILL', 'M', 72 UNION ALL

    SELECT 7777, 'KERRY', 'M', 70 UNION ALL

    SELECT 8888, 'JUNE', 'L', 56

    What I need back is like this:

    HI MEDLO

    3333SAM965555GARY774444BILL92

    1111JOE952222MARY768888JUNE56

    6666JILL72

    7777KERRY70

    There are always three groups of columns, based on the H,M, L.

    I have tried doing this with CTE, Joining on RowID, etc. but where I'm stumped is if there are fewer of one group than other, I still need all the results (as in the example where the columns with the MED contain more employees.

    What should my T-SQL look like?

    I tried this and it cut off the results as you can see:

    ;

    --Define a CTE for each grouping

    WITH HighPerformers(RowID, Employee#, EmployeeName, Predictor) AS

    (SELECT TOP 5000 ROW_NUMBER() OVER (ORDER BY Predictor DESC) AS ROWID, Employee#, EmployeeName, Predictor FROM @a Where Perf= 'H' ORDER BY Predictor DESC),

    ModeratePerformers(RowID, Employee#, EmployeeName, Predictor) AS

    (SELECT TOP 5000 ROW_NUMBER() OVER (ORDER BY Predictor DESC) AS ROWID, Employee#, EmployeeName, Predictor FROM @a Where Perf= 'M' ORDER BY Predictor DESC) ,

    MarginalPerformers(RowID, Employee#, EmployeeName, Predictor) AS

    (SELECT TOP 5000 ROW_NUMBER() OVER (ORDER BY Predictor DESC) AS ROWID, Employee#, EmployeeName, Predictor FROM @a Where Perf= 'L' ORDER BY Predictor DESC)

    SELECT H.Employee# AS [HighPerf#], H.EmployeeName AS [High Performer], H.Predictor AS [H Predictor], M.Employee# AS [ModPerf#], M.EmployeeName AS [Moderate Performer], M.Predictor as [M Predictor], L.Employee# AS [LOWPERF#], L.EmployeeName AS [Marginal Performer], L.Predictor AS [L Predictor]

    FROM HighPerformers H

    LEFT JOIN ModeratePerformers M ON H.RowID = M.RowID

    LEFT JOIN MarginalPerformers L ON H.RowID = L.RowID

    And my results are cut off, and I don't like using the TOP command:

    HighPerf#High PerformerH PredictorModPerf#Moderate PerformerM PredictorLOWPERF#Marginal PerformerL Predictor

    3333SAM965555GARY774444BILL92

    1111JOE952222MARY768888JUNE56

    Thanks

  • It's kind of a PIVOT but this scenario is also sometimes referred to as CrossTab.

    To get the individual numbers grouped together it's better to use the PARTITON BY clause of the ROW_NUMBER function. Something like the following:

    (Side note if you don't like the display of Zero values you'd need to convert your numbers to character)

    ;WITH HighPerformers AS

    (

    SELECT

    ROW_NUMBER() OVER(PARTITION BY Perf ORDER BY Employee) AS ROW,

    Employee,

    EmployeeName,

    Perf,

    Predictor

    FROM @a

    )

    SELECT

    MAX(CASE WHEN Perf='H' THEN Employee ELSE 0 END) AS HighPerf,

    MAX(CASE WHEN Perf='H' THEN EmployeeName ELSE '' END) AS HighPerformer,

    MAX(CASE WHEN Perf='H' THEN Predictor ELSE 0 END) AS HPredictor,

    MAX(CASE WHEN Perf='M' THEN Employee ELSE 0 END) AS ModPerf,

    MAX(CASE WHEN Perf='M' THEN EmployeeName ELSE '' END) AS ModeratePerformer,

    MAX(CASE WHEN Perf='M' THEN Predictor ELSE 0 END) AS MPredictor,

    MAX(CASE WHEN Perf='L' THEN Employee ELSE 0 END) AS LOWPERF,

    MAX(CASE WHEN Perf='L' THEN EmployeeName ELSE '' END) AS MarginalPerformer,

    MAX(CASE WHEN Perf='L' THEN Predictor ELSE 0 END) AS LPredictor

    FROM HighPerformers

    GROUP BY ROW



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Very cool. I'll investigate the meaning of the MAX( ... it's not obivous why it's used there.

    However, using this with my REAL data (I posted only an example) I get a weird error:

    Conversion failed when converting the varchar value '70% of emplo' to data type int.

    and the line number points to the statement starting SELECT

    ROW_NUMBER() OVER(PARTITION BY

    I've investigated the data, and see no issues. No such item in the data. Ideas?

    And by the way, Thanks! This is an apparently simple solution to what I thought was a complex problem.

  • Oops! Looks like it was bad data....BRB Phil

  • PhilM99 (2/27/2010)


    Very cool. I'll investigate the meaning of the MAX( ... it's not obivous why it's used there.

    Its only purpose is to allow the GROUP BY to work well enough to put all of the data on the same row. Because of the nature of the query, MAX will always only work against 1 value in this query.

    You can see more of how Cross-Tabs and Pivots work. Although the examples in the article are decidedly numeric in nature, the same principles apply.

    http://www.sqlservercentral.com/articles/T-SQL/63681/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Yeah, thanks to Jeff as well.

    Looks to me like the MAX function works only against numeric values because I had to omit the columns that contain some screwy character data (e.g. % sign) or else I get the error noted earlier.

    Here's what works (note columns commented out to make it work) :;WITH AllPerformers AS

    (

    SELECT

    ROW_NUMBER() OVER(PARTITION BY RespPerf ORDER BY Predictor DESC) AS ROW,

    --Employee#,

    [Name] AS EmployeeName,

    RespPerf,

    CAST(Predictor As varchar(3)) as Predictor_1

    FROM @INTER

    )

    --SELECT * FROM AllPerformers

    SELECT

    --MAX(CASE WHEN RespPerf='H' THEN Employee# ELSE ' ' END) AS HighPerf

    MAX(CASE WHEN RespPerf='H' THEN EmployeeName ELSE '' END) AS HighPerformer

    ,MAX(CASE WHEN RespPerf='H' THEN Predictor_1 ELSE ' 'END) AS HPredictor

    --,MAX(CASE WHEN RespPerf='M' THEN Employee# ELSE ' ' END) AS ModPerf

    ,MAX(CASE WHEN RespPerf='M' THEN EmployeeName ELSE ' ' END) AS ModeratePerformer

    ,MAX(CASE WHEN RespPerf='M' THEN Predictor_1 ELSE ' ' END) AS MPredictor

    --,MAX(CASE WHEN RespPerf='L' THEN Employee# ELSE ' ' END) AS LOWPERF

    ,MAX(CASE WHEN RespPerf='L' THEN EmployeeName ELSE ' ' END) AS MarginalPerformer

    ,MAX(CASE WHEN RespPerf='L' THEN Predictor_1 ELSE ' ' END) AS LPredictor

    FROM AllPerformers

    GROUP BY ROW

  • I haven't seen a scenario where the MAX() function on character values did not work.

    Would you mind posting some of the data that will fail in your scenario?

    Based on the error message you posted it looks like you're facing a type conversion error. All parts of a CASE statement will be converted to the data type with the highest precedence.

    Here's an example:

    DECLARE @v-2 CHAR(1)

    SET @v-2 ='0'

    -- SET @v-2 ='a'

    SELECT CASE WHEN @v-2 = 'a' THEN 'a' ELSE 0 END

    This code will run just fine. But as soon as you set @v-2 to 'a' it will fail with a similar error like you're getting...

    But the statement will run without error if you change the query to

    SELECT CASE WHEN @v-2 = 'a' THEN 'a' ELSE '0' END



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • I was preparing some data to demonstrate the conversion error when I discovered the problem, which was not with the MAX statement, but with the CASE statement in the following:

    MAX(CASE WHEN Perf='H' THEN Employee ELSE 0 END) AS HighPerf,

    As your example indicated, the problem is quite simply that since Employee is defined as varchar, and 0 is implictly int, SS attempts to convert Employee to integer to perform the CASE statement, which works fine for empty data or numeric data but for characters in the data gives the error.

    So you were right, all I had to do was change 0 to '0'.

    Thanks to all.

  • PhilM99 (2/27/2010)


    I was preparing some data to demonstrate the conversion error when I discovered the problem, which was not with the MAX statement, but with the CASE statement in the following:

    MAX(CASE WHEN Perf='H' THEN Employee ELSE 0 END) AS HighPerf,

    As your example indicated, the problem is quite simply that since Employee is defined as varchar, and 0 is implictly int, SS attempts to convert Employee to integer to perform the CASE statement, which works fine for empty data or numeric data but for characters in the data gives the error.

    So you were right, all I had to do was change 0 to '0'.

    Thanks to all.

    Glad I could help! 🙂

    But there's one thing that makes me wondering:

    Within your sample data you defined Employee# as int. Therefore I used 0 (zero) instead of '' (blank). Was that a typo on your side?



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Yes, that was an error I made during construction of the sample for posting. Sorry!

    But it led to a useful learning exercise!

    Phil

  • PhilM99 (2/27/2010)


    Yes, that was an error I made during construction of the sample for posting. Sorry!

    But it led to a useful learning exercise!

    Phil

    Great you look at it that way!!

    And nothing to be sorry for. No harm was done. 😉



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply