Challenge for SQL Gurus: Grouping and outer joins

  • Here's one challenge for you gurus of the SQL language.

    First, here are the definitions and data

    --------------------------------------------------

    declare

    @xref table

    (

    xref int,

    name nvarchar(32)

    )

    declare

    @data table

    (

    group1 nvarchar(32),

    group2 nvarchar(32),

    xref int,

    value1 int,

    value2 int

    )

    insert into @xref (xref, name) values (1, 'X1')

    insert into @xref (xref, name) values (2, 'X2')

    insert into @xref (xref, name) values (3, 'X3')

    insert into @data (group1, group2, xref, value1, value2) values ('Group1a', 'Group2a', 1, 100, 200)

    insert into @data (group1, group2, xref, value1, value2) values ('Group1a', 'Group2a', 1, 101, 201)

    insert into @data (group1, group2, xref, value1, value2) values ('Group1a', 'Group2a', 2, 102, 202)

    insert into @data (group1, group2, xref, value1, value2) values ('Group1a', 'Group2b', 1, 103, 203)

    insert into @data (group1, group2, xref, value1, value2) values ('Group1b', 'Group2a', 3, 104, 204)

    --------------------------------------------------

    Next, select the data using

    select

    group1, group2, xref, sum(value1), sum(value2)

    from

    @data

    group by

    group1,

    group2,

    xref

    group1 group2 xref value1 value2

    ------ ------- ------- ------- -------

    Group1aGroup2a1201401

    Group1aGroup2a2102202

    Group1aGroup2b1103203

    Group1bGroup2a3104204

    ----------------------------------------

    Nice grouping as you can see.

    Now, the challenge. I want the data returned to include missing xref for the groups along with 0 values for the other columns.

    group1 group2 xref value1 value2

    ------ ------- ------- ------- -------

    Group1aGroup2a1201401

    Group1aGroup2a2102202

    Group1a Group2a 3 0 0 <--- This is not in @data

    Group1aGroup2b1103203

    Group1a Group2b 2 0 0 <---
    Group1a Group2b 3 0 0 <--- Group1b Group2a 1 0 0 <---
    Group1b Group2a 2 0 0 <---
    Group1bGroup2a3104204

    ----------------------------------

    I guess the solution is to use an outer join with the @xref table but I can't seem to produce the desired results.

    Does anyone know of a way to do this in SQL without resorting to temporary tables and cursors?

    Thank you and good luck.

  • How about ?

    select

    group1, group2, xref, sum(value1), sum(value2)

    from

    (select x.Xref, d.group1, d.group2, d.value1, d.value2

     from @x x left join @data d on x.xref = d.xref) Subq

    group by

    group1,

    group2,

    xref

     


    * Noel

  • create table #xref ( xref int, name nvarchar(32))

    create table #data (group1 nvarchar(32),group2 nvarchar(32),xref int,value1 int,value2 int)

    insert into #xref (xref, name) values (1, 'X1')

    insert into #xref (xref, name) values (2, 'X2')

    insert into #xref (xref, name) values (3, 'X3')

    insert into #data (group1, group2, xref, value1, value2) values ('Group1a', 'Group2a', 1, 100, 200)

    insert into #data (group1, group2, xref, value1, value2) values ('Group1a', 'Group2a', 1, 101, 201)

    insert into #data (group1, group2, xref, value1, value2) values ('Group1a', 'Group2a', 2, 102, 202)

    insert into #data (group1, group2, xref, value1, value2) values ('Group1a', 'Group2b', 1, 103, 203)

    insert into #data (group1, group2, xref, value1, value2) values ('Group1b', 'Group2a', 3, 104, 204)

    select Groups.group1 , Groups.Group2 , #xref.xref , #xref.name

    , COALESCE( GroupSum.value1 ,0 ) as Value1 , COALESCE( GroupSum.value2,0 ) as Value2

    from (select distinct group1, group2

    from #data

    ) as Groups (group1, group2)

    cross join #xref

    left outer join

    (select group1, group2, xref, sum(value1), sum(value2)

    from#data

    group by group1, group2, xref

    ) as GroupSum ( group1, group2, xref, value1, value2 )

    on GroupSum.group1= Groups.group1

    and GroupSum.group2= Groups.group2

    and GroupSum.xref= #xref.xref

    SQL = Scarcely Qualifies as a Language

  • OOPS!

    My previous post was wrong! Didn't read the isntructions correctly

     


    * Noel

  • WOW... and I thaught I was starting to know something about t-sql.

    Where did you learn to do that?? (I know it's just a little trick but still it's not that obvious to do).

  • Wow! I am impressed!

    So soon!

    I'll start "digesting" what you submitted.

    Would you be so kind in explaining your solution?

    I knew it was possible to do this. It only took one guru like you to do it!

    Thank you very much!

  • Ah! the grouping problems... always love them.

    I also taught I knew a lot about SQL but there is always something new to learn lurking around the corner.

  • Solution :

    Select the distinct groups

    cross join to the ref so you get all possible combinaisons

    left join that to the calculated sum of your first partial solution

    isnull() the missing columns with 0 and that's your resultset.

  • Unless I am mistaken, Carl is given credit in Joe Celko's "SQL for Smarties".

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • I really got to get that book...

    and read it eventually .

  • Do this! I certainly have learned a lot from it. And I for myself, enjoy Joe's style of writing.

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Ya the least we can say is that he has a colorfull style .

  • The real stump the SQL Server guru question is to ask him the correct syntax for a cursor...

    OK, I have to stop now, starting to sound like a broken record.

  • Hi guys,

    Try this one (simplified version):

    select x.group1,x.group2,x.xref,sum(d.value1), sum(d.value2) from

    (select distinct group1,group2,x.xref from @xref x,@data d) as X

    LEFT OUTER JOIN @data d

    on x.[group1] = d.[group1] and x.[group2]=d.group2 and x.xref = d.xref

    group by x.group1,x.group2,x.xref

    order by x.group1,x.group2,x.xref

    Logic is same as (cross join on @xref and @data and a OUTER JOIN with @data).

     


    Kindest Regards,

    Hari

  • Hi guys a quick question on the posts by Carl and Hariharan. First Carl's post uses a cross join without a where statement which will produce a Cartesian product of the tables involved in the join. The size of a Cartesian product result set is the number of rows in the first table multiplied by the number of rows in the second table. On a large data set this would not seem to be desirable. Hariharan's post would seem to be the better option. I have not had a chance to test either option and will not until much later in the day. (But I will have time to check back to the forum ) so my question is which option is most efficient? I apologize for asking without giving this more thought but it will drive me crazy (a short trip) until I can resolve this and I have to leave for work? I would not mind being unemployed if the pay was not so bad.

     

    Thanks

    Mike

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply