How do I show row data as a single field?

  • I'm sure I've seen posts similar to this before, but can't seem to find any now!

    What I'm trying to do is easily represented in two tables:

    Here's the starting data:

    pk          fk          dat        
    
    ----------- ----------- ----------
    301 1001 Greg
    302 1001 Jeff
    303 1001 Mike
    304 1002 Fred
    305 1002 Barb
    306 1002 Andy
    307 1003 Chad
    308 1004 Tina
    309 1004 Tony

    What I'd like my sql statement to end up with is:

    fk          combined_dat
    
    ----------- ----------------
    1001 Greg, Jeff, Mike
    1002 Fred, Barb, Andy
    1003 Chad
    1004 Tina, Tony

    I've done this before using cursors or temporary tables, what I'm wondering is how to do it without either of them? Seems I saw an example several months back that was quite nice and also very efficient compared to the cursor/temp table method.

    This sql will become a subquery joined back to a main query via the FK. I'm using SQL 6.5. Thanks in advance!

    Edited by - GregLyon on 10/01/2003 12:55:40 PM

  • In SQL 6.5, I'd use a cursor unless you know the maximum number of Dats to be concatenated, in which case I'd use CASE with self-joins if that number is less than eight or so.

    If you upgrade to SQL Server 2000, there's a trick you could use with a UDF (user defined function) that makes this very slick and easy.

    In any case, the best answer is to perform denormalization like this in the front-end.

    --Jonathan



    --Jonathan

  • Thanks, Jonathan.

    Currently there are 0-3 possible results per FK, so I'm thinking I'll proceed using the self joins, I just have to figure out how to suppress the 'partial' results now...

  • quote:


    Thanks, Jonathan.


    You're welcome.

    quote:


    Currently there are 0-3 possible results per FK, so I'm thinking I'll proceed using the self joins, I just have to figure out how to suppress the 'partial' results now...


    Making some obvious assumptions, how about something like:

    SELECT b.pk, ISNULL(d1.dat,'') + ISNULL(', ' + d2.dat,'') + ISNULL(', ' + d3.dat,'')

    FROM Base b

    LEFT JOIN Dats d1 ON b.pk = d1.fk AND d1.pk = (SELECT MIN(pk) FROM Dats WHERE fk = b.pk)

    LEFT JOIN Dats d2 ON b.pk = d2.fk AND d2.pk = (SELECT MIN(pk) FROM Dats WHERE fk = b.pk AND pk > d1.pk)

    LEFT JOIN Dats d3 ON b.pk = d3.fk AND d3.pk = (SELECT MIN(pk) FROM Dats WHERE fk = b.pk AND pk > d2.pk)

    --Jonathan



    --Jonathan

  • SELECTa.fk,a.dat+ISNULL(', '+b.dat,'')+ISNULL(', '+c.dat,'') as combined_dat 
    
    FROM(SELECT fk,MIN(pk) as pk FROM tablea GROUP BY fk) x
    LEFT OUTER JOIN (SELECT fk,MAX(pk) as pk FROM tablea GROUP BY fk) y
    ON y.fk = x.fk AND y.pk <> x.pk
    LEFT OUTER JOIN tablea b
    ON b.fk = x.fk AND b.pk <> x.pk AND b.pk <> y.pk
    LEFT OUTER JOIN tablea c
    ON c.pk = y.pk
    INNER JOIN tablea a
    ON a.pk = x.pk


    Edited by - davidburrows on 10/02/2003 08:07:37 AM

    Far away is close at hand in the images of elsewhere.
    Anon.

  • quote:


    SELECTa.fk,a.dat+ISNULL(', '+b.dat,'')+ISNULL(', '+c.dat,'') as combined_dat 
    
    FROM(SELECT fk,MIN(pk) as pk FROM tablea GROUP BY fk) x
    LEFT OUTER JOIN (SELECT fk,MAX(pk) as pk FROM tablea GROUP BY fk) y
    ON y.fk = x.fk AND y.pk <> x.pk
    LEFT OUTER JOIN tablea b
    ON b.fk = x.fk AND b.pk <> x.pk AND b.pk <> y.pk
    LEFT OUTER JOIN tablea c
    ON c.pk = y.pk
    INNER JOIN tablea a
    ON a.pk = x.pk



    I had posted a similar solution only to delete it after I noticed Greg wrote "0-3 possible results," so I rewrote using the "base" table...

    Your code will be faster if you avoid "not"s in the predicates, i.e.,

    SELECTa.fk,a.dat+ISNULL(', '+b.dat,'')+ISNULL(', '+c.dat,'') as combined_dat 
    
    FROM(SELECT fk,MIN(pk) as pk FROM tablea GROUP BY fk) x
    LEFT OUTER JOIN (SELECT fk,MAX(pk) as pk FROM tablea GROUP BY fk) y
    ON y.fk = x.fk AND y.pk > x.pk
    LEFT OUTER JOIN tablea b
    ON b.fk = x.fk AND b.pk > x.pk AND b.pk < y.pk
    LEFT OUTER JOIN tablea c
    ON c.pk = y.pk
    INNER JOIN tablea a
    ON a.pk = x.pk

    --Jonathan



    --Jonathan

  • I see now that I was close to it, but missed the MIN(pk) part so I was getting multiple results per FK, each with one less entry. Here's my 'bad' sql:

    
    
    select t.fk
    , t.dat
    + case when t1.dat is null then '' else ', ' + t1.dat END
    + case when t2.dat is null then '' else ', ' + t2.dat END
    + case when t3.dat is null then '' else ', ' + t3.dat END AS cdat
    FROM #test t
    LEFT OUTER JOIN (SELECT fk, pk, dat FROM #test) t1 on t.fk = t1.fk and t.pk < t1.pk
    LEFT OUTER JOIN (SELECT fk, pk, dat FROM #test) t2 on t1.fk = t2.fk and t1.pk < t2.pk
    LEFT OUTER JOIN (SELECT fk, pk, dat FROM #test) t3 on t2.fk = t3.fk and t2.pk < t3.pk

    Which gave these results:

    
    
    fk cdat
    ----------- ----------------------------------------------
    1001 Greg, Jeff, Mike
    1001 Greg, Mike
    1001 Jeff, Mike
    1001 Mike
    1002 Fred, Barb, Andy
    1002 Fred, Andy
    1002 Barb, Andy
    1002 Andy
    1003 Chad
    1004 Tina, Tony
    1004 Tony

    Obviously NOT what I was looking for!

    I see that this method requires as many left outer joins as the max number of records that a single Foriegn key would have, and I assume that's why, Jonathan, you said it's feasible up to 8 or so. I think I'll allow for a couple extra values in case my data varies in the future.

    Thank you's to both of you, Jonathan and David.

  • I thought I'd make the SQL handle MORE than 3 cases in case data changes in the future. It looks like as long as you know how many the max possible is you just keep adding left outer joins. I suppose some dynamic SQL could check for max cases and then build the appropriate statement, but That would probably be overkill in this instance, and I suppose it'd get slow if there were LOTS of foreign key values.

    Here's the one I'll use, which will allow for 5 values per FK:

    
    
    SELECT
    b.pk
    , ISNULL(d1.dat,'') + ISNULL(', ' + d2.dat,'')
    + ISNULL(', ' + d3.dat,'') + ISNULL(', ' + d4.dat,'')
    + ISNULL(', ' + d5.dat,'')
    FROM #base b
    LEFT OUTER JOIN #test d1 ON b.pk = d1.fk
    AND d1.pk = (SELECT MIN(pk) FROM #test WHERE fk = b.pk)
    LEFT OUTER JOIN #test d2 ON b.pk = d2.fk
    AND d2.pk = (SELECT MIN(pk) FROM #test WHERE fk = b.pk AND pk > d1.pk)
    LEFT OUTER JOIN #test d3 ON b.pk = d3.fk
    AND d3.pk = (SELECT MIN(pk) FROM #test WHERE fk = b.pk AND pk > d2.pk)
    LEFT OUTER JOIN #test d4 ON b.pk = d4.fk
    AND d4.pk = (SELECT MIN(pk) FROM #test WHERE fk = b.pk AND pk > d3.pk)
    LEFT OUTER JOIN #test d5 ON b.pk = d5.fk
    AND d5.pk = (SELECT MIN(pk) FROM #test WHERE fk = b.pk AND pk > d4.pk)

    Once again, thanks to both of you for the help!

    Edited by - GregLyon on 10/02/2003 09:10:22 AM

  • quote:


    I see that this method requires as many left outer joins as the max number of records that a single Foriegn key would have, and I assume that's why, Jonathan, you said it's feasible up to 8 or so. I think I'll allow for a couple extra values in case my data varies in the future.


    Exactly right. I wrote some demo code a few years ago that determined this number and then built the query using dynamic SQL to have the minimum number of self joins. But with SQL Server 6.5 that would be asking for trouble, both because of its relative inefficiency and its limit on character parameter length.

    --Jonathan



    --Jonathan

  • quote:


    ...and its limit on character parameter length.


    Yes, I ran into the parameter length issue with some particularly sticky dynamic sql I wrote a couple months ago. I figured out that the sql statement could easily run over 8000 characters (up to 20000 or so actually) and ended up storing different pieces of it in different parameters and then concatenating them at the exec line. It actually works pretty well, but then it never returns over a couple hundred rows, and the source tables are quite small by Sql server standards.

  • Here's a way to do it without a cursor or temptable and no joins, you still have the issue of the varchar(8000). Just another thought.

    DECLARE @I INT

    DECLARE @MaxCount INT

    SET @MaxCount = (SELECT MAX(SID) FROM Table1)

    PRINT @MaxCount

    SET @I = 1

    DECLARE @TextString VARCHAR(8000)

    while @I <= @MaxCount

    BEGIN

    --DECLARE @TextString VARCHAR(8000)

    SET @TextString = ISNULL(@TextString,'') + (SELECT ISNULL(SText,'') + ', ' FROM Table1 WHERE SID = @I)

    --Select @TextString

    SET @I = @I + 1

    Continue

    END

    Select LEFT(@TextString, LEN(@TextString)-1) AS TextString

    Edward M. Sokolove

    Edited by - sokolove on 10/02/2003 11:56:33 AM


    Edward M. Sokolove

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply