Joining and Pivoting Data Question

  • Anyone able to help someone who is struggling to figure out how correctly join and pivot data from two separate tables...

    I have two tables - one containing information about people and the dates they visited our site:

    PersonID |FName |LName |Visit1 |Visit2 |Visit3

    ====================================

    1111 |Fred |Bloggs |11.01.13 |11.02.13 |11.03.13

    1112 |James |Bond |12.01.13 |12.02.13 |12.03.13

    I also have another table Containing the Information about questionnaires they were given to complete at each visit and the date they were completed:

    PersonID |Visit |Questionnaire |Completion Time

    ================================

    1111 |1 |1 |15.01.13 11:00

    1111 |1 |2 |16.01.13 12:10

    1111 |1 |3 |25.01.13 09:01

    1111 |2 |1 |16.02.13 12:15

    1111 |2 |2 |20.02.13 15:45

    1111 |2 |3 |22.02.13 17:22

    1111 |3 |1 |NULL

    1111 |3 |2 |14.03.13 18:59

    1111 |3 |3 |20.03.13 20:42

    1112 |1 |1 |17.01.13 12:45

    1112 |1 |2 |18.01.13 07:02

    1112 |1 |3 |19.01.13 09:00

    1112 |2 |1 |22.02.13 17:00

    1112 |2 |2 |25.02.13 19:10

    1112 |2 |3 |27.01.13 15:48

    I am trying to join these tables and report on the questionnaires answered in each visit in the following format:

    PersonID |FName |LName |Visit |Visit Date |Q1 Time |Q2 Time |Q3 Time

    ========================================================

    1111 |Fred |Bloggs |1 |11.01.13 |15.01.13 11:00 |16.01.13 12:10 |25.01.13 09:01

    1111 |Fred |Bloggs |2 |11.02.13 |16.02.13 12:15 |20.02.13 15:45 |22.02.13 17:22

    1111 |Fred |Bloggs |3 |11.03.13 |NULL |14.03.13 18:59 |20.03.13 20:42

    1112 |James |Bond |1 |12.01.13 |17.01.13 12:45 |18.01.13 07:02 |19.01.13 09:00

    1112 |James |Bond |2 |12.02.13 |22.02.13 17:00 |25.02.13 19:10 |27.01.13 15:48

    1113 |James |Bond |3 |12.03.13 |NULL |NULL |NULL

    Is anyone able to help with this as I seem to be getting nowhere as I am not sure which approach would actually work here. Sorry I am trying to learn but this seems to have stumped me. Any help would be greatly appreciated!

  • Having a quick look at your table, they don't appear to be in 3rd normal form. Is a visitor limited to only 3 visits? What happens if they visit a 4th time?

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Sorry my bad, the first table is just a temp table I have created listing the people and their visit dates - they are limited to 3 visits in this case.

  • Thanks. The fact the it's not actually 3rd normal was grating on me, so i've changed my tables a little for the moment (I can change in the query though). When posting questions like this, it's really useful to supply create and insert statements. If anyone wants to make a start while I'm working, feel free to use my creates:

    Create table #Person (PersonID int,

    FName varchar(20),

    LName varchar(20))

    Create table #Visit (VisitID int,

    PersonID int,

    VisitNumber int,

    VisitDateTime datetime);

    Create table #Answer ( AnswerID int identity(1,1),

    PersonID int,

    Visit int,

    Questionnaire int,

    CompletetionTime datetime);

    Insert into #Person (PersonID,

    FName,

    LName)

    Values (1111, 'Fred', 'Bloggs'),

    (1112, 'James', 'Bond');

    Insert into #Visit (VisitID,

    PersonID,

    VisitNumber,

    VisitDateTime)

    Values (1, 1111, 1, '11-Jan-2013'),

    (2, 1111, 2, '11-Feb-2013'),

    (3, 1111, 3, '11-Mar-2013'),

    (4, 1112, 1, '12-Jan-2013'),

    (5, 1112, 2, '12-Feb-2013'),

    (6, 1112, 3, '12-Mar-2013');

    Insert into #Answer (PersonID,

    Visit,

    Questionnaire,

    CompletetionTime)

    Values (1111, 1, 1, '15-Jan-2013 11:00:00.000'),

    (1111, 1, 2, '16-Jan-2013 12:10:00.000'),

    (1111, 1, 3, '25-Jan-2013 09:01:00.000'),

    (1111, 2, 1, '16-Feb-2013 12:15:00.000'),

    (1111, 2, 2, '20-Feb-2013 15:45:00.000'),

    (1111, 2, 3, '22-Feb-2013 17:22:00.000'),

    (1111, 3, 1, NULL),

    (1111, 3, 2, '14-Mar-2013 18:59:00.000'),

    (1111, 2, 3, '20-Mar-2013 20:42:00.000'),

    (1112, 1, 1, '17-Jan-2013 12:45:00.000'),

    (1112, 1, 2, '18-Jan-2013 07:02:00.000'),

    (1112, 1, 3, '19-Jan-2013 09:00:00.000'),

    (1112, 2, 1, '22-feb-2013 17:00:00.000'),

    (1112, 2, 2, '25-Feb-2013 19:10:00.000'),

    (1112, 2, 3, '27-Jan-2013 15:48:00.000')

    Select P.PersonID,

    P.FName,

    P.LName,

    V1.VisitDateTime as Visit1,

    V2.VisitDateTime as Visit2,

    V3.VisitDateTime as Visit3

    from #Person P

    left join #Visit V1 on P.PersonID = V1.PersonID and V1.VisitNumber = 1

    left join #Visit V2 on P.PersonID = V2.PersonID and V2.VisitNumber = 2

    left join #Visit V3 on P.PersonID = V3.PersonID and V3.VisitNumber = 3

    Select PersonID,

    Visit,

    Questionnaire,

    CompletetionTime

    from #Answer

    Drop table #Person

    Drop Table #Visit

    Drop table #Answer

    Edit: Added Select statement to give the above table format. Will make another post with further answer.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Design issues aside when you post these sorts of questions you should post the SQL to create the sample data - I've done it here, I've changed the dates slightly - however the datatypes aren't really important to the queries if yours are stored differently.

    CREATE TABLE #people (PersonID INT, FName VARCHAR(255), LName VARCHAR(255), Visit1 DATETIME, Visit2 DATETIME, Visit3 DATETIME);

    INSERT INTO #people

    VALUES

    (1111 ,'Fred' ,'Bloggs' ,'11-01-2013','11-02-2013' ,'11-03-2013'),

    (1112 ,'James' ,'Bond' ,'12-01-2013' ,'12-02-2013' ,'12-03-2013')

    CREATE TABLE #results (

    PersonID INT,

    Visit INT,

    Questionnaire INT,

    [Completion Time] VARCHAR(255)

    )

    INSERT INTO #results VALUES

    (1111 ,1 ,1 ,'15-01-2013 11:00'),

    (1111 ,1 ,2 ,'16-01-2013 12:10'),

    (1111 ,1 ,3 ,'25-01-2013 09:01'),

    (1111 ,2 ,1 ,'16-02-2013 12:15'),

    (1111 ,2 ,2 ,'20-02-2013 15:45'),

    (1111 ,2 ,3 ,'22-02-2013 17:22'),

    (1111 ,3 ,1 ,NULL),

    (1111 ,3 ,2 ,'14-03-2013 18:59'),

    (1111 ,3 ,3 ,'20-03-2013 20:42'),

    (1112 ,1 ,1 ,'17-01-2013 12:45'),

    (1112 ,1 ,2 ,'18-01-2013 07:02'),

    (1112 ,1 ,3 ,'19-01-2013 09:00'),

    (1112 ,2 ,1 ,'22-02-2013 17:00'),

    (1112 ,2 ,2 ,'25-02-2013 19:10'),

    (1112 ,2 ,3 ,'27-01-2013 15:48')

    What you are looking to do here is to UNPIVOT the people table and to PIVOT the results then join the two.

    The syntax would look like the following -

    WITH results AS (

    SELECT *

    FROM

    #results

    PIVOT

    (

    MAX([Completion Time]) FOR [Questionnaire] IN ([1],[2],[3])

    ) as pvt

    ), visits AS (

    SELECT

    *,

    TRY_PARSE(RIGHT(Visit,1) AS INT) AS VisitID

    FROM

    #people

    UNPIVOT

    (

    [visit_date] FOR [Visit] IN ([Visit1] , [Visit2] , [Visit3] )

    ) AS unpvt

    )

    SELECT

    *

    FROM

    visits v

    LEFT JOIN results r

    ON v.PersonID = r.PersonID AND v.VisitID = r.Visit

    I've put the two parts into a CTE so you can join them, but you can run each bit separately to understand what it's doing. The VisitID is just an expression for you to be able to join the two result sets on.

  • Seems Spiff beet me to it before I finished!

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Thank you so much, you have dug me out of a whole.

    I will bear in mind the scripts to create the dataset in future as well.

    Cheers!

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply