Help with query - why am I getting 5 or 25 times the expected sums?

  • I've anonymized the data and also reduced the amount of fields, but I've included the data that is giving me issues.  I am not a DBA or a programmer.  I am more a business analyst type and this problem is giving me fits.

    I would greatly appreciate it if someone could help me out.

    The expected output is as follows:

    IDNAMEAMOUNTATRC1ATRC2TRC1TRC2AR1AR2TR1TR2
    006i000000itALDAA2ABC Client/EMEA/85000/Q38500095005750019000660009500575001900066000

    The actual output I am getting is:

    IDNAMEAMOUNTATRC1ATRC2TRC1TRC2AR1AR2TR1TR2
    006i000000itALDAA2ABC Client/EMEA/85000/Q3850009500575001900066000237500287500475000330000

    Columns AR1 and TR1 are 25 times the expected value and columns AR2 and TR2 are 5 times the expected value.

    Here is the SQL to create the tables and insert the data:

    CREATE TABLE OPP
    (
      ID nvarchar(255),
      NAME nvarchar(255),
        AMOUNT float,
        ATRC1 float,
        ATRC2 float,
        TRC1 float,
        TRC2 float
    );

    CREATE TABLE OLI
    (
      ID nvarchar(255),
        OPPID nvarchar(255),
      NAME nvarchar(255),
        RSC nvarchar(255),
        OLIAMOUNT float,
        TOTAL_AMOUNT float,
    );

    INSERT INTO OPP (ID, NAME,    AMOUNT, ATRC1, ATRC2, TRC1, TRC2)
    VALUES
    ('006i000000itALDAA2', 'ABC Client/EMEA/85000/Q3', '85000', '9500', '57500', '19000', '66000');

    INSERT INTO OLI (ID, OPPID, NAME, RSC, OLIAMOUNT, TOTAL_AMOUNT)
    VALUES
    ('00ki000000dnf1kAAA', '006i000000itALDAA2', 'Prod1', '1', '9500', '19000'),
    ('00ki000000dnf1pAAA', '006i000000itALDAA2', 'Prod2', '2', '24500',    '24500'),
    ('00ki000000dnf1zAAA', '006i000000itALDAA2', 'Prod3', '2', '14000',    '14000'),
    ('00ki000000dnf1qAAA', '006i000000itALDAA2', 'Prod4', '2', '10500',    '10500'),
    ('00k0H00000fGN4tQAG', '006i000000itALDAA2', 'Prod5', '2', '4250',    '8500'),
    ('00k0H00000fGN4uQAG', '006i000000itALDAA2', 'Prod6', '2', '4250',    '8500');

    and here is my faulty query:


    SELECT OPP.ID,
       OPP.NAME,
       OPP.AMOUNT,
       OPP.ATRC1,
       OPP.ATRC2,
       OPP.TRC1,
       OPP.TRC2,
       SUM(CASE WHEN OLI1a.RSC = '1' THEN OLI1a.OLIAMOUNT ELSE 0 END) AS 'AR1',
       SUM(CASE WHEN OLI1b.RSC = '2' THEN OLI1b.OLIAMOUNT ELSE 0 END) AS 'AR2',
       SUM(CASE WHEN OLI2a.RSC = '1' THEN OLI2a.TOTAL_AMOUNT ELSE 0 END) AS 'TR1',
       SUM(CASE WHEN OLI2b.RSC = '2' THEN OLI2b.TOTAL_AMOUNT ELSE 0 END) AS 'TR2'
    FROM OPP
    LEFT JOIN OLI OLI1a ON OLI1a.OPPID = OPP.ID AND OLI1a.RSC = '1'
    LEFT JOIN OLI OLI1b ON OLI1b.OPPID = OPP.ID AND OLI1b.RSC = '2'
    LEFT JOIN OLI OLI2a ON OLI2a.OPPID = OPP.ID AND OLI2a.RSC = '1'
    LEFT JOIN OLI OLI2b ON OLI2b.OPPID = OPP.ID AND OLI2b.RSC = '2'
    WHERE OPP.ID = '006i000000itALDAA2'
    GROUP BY OPP.ID, OPP.NAME, OPP.AMOUNT, OPP.ATRC1, OPP.ATRC2, OPP.TRC1, OPP.TRC2

    What am I doing wrong?  I've been playing with this for hours and still don't have a solution to what I thought would be simple!

    Thanks so much to anyone that can help.  For a true SQL person, this is probably easy.  Hopefully I have described the problem well and provided all the tools needed for some kind person to help me.

  • What's often helpful to debug something like this is to take the aggregation off, and look at the full resultset.

    SELECT OPP.ID,
     OPP.NAME,
     OPP.AMOUNT,
     OPP.ATRC1,
     OPP.ATRC2,
     OPP.TRC1,
     OPP.TRC2,
     CASE WHEN OLI1a.RSC = '1' THEN OLI1a.OLIAMOUNT ELSE 0 END AS 'AR1',
     CASE WHEN OLI1b.RSC = '2' THEN OLI1b.OLIAMOUNT ELSE 0 END AS 'AR2',
     CASE WHEN OLI2a.RSC = '1' THEN OLI2a.TOTAL_AMOUNT ELSE 0 END AS 'TR1',
     CASE WHEN OLI2b.RSC = '2' THEN OLI2b.TOTAL_AMOUNT ELSE 0 END AS 'TR2'
    FROM OPP
    LEFT JOIN OLI OLI1a ON OLI1a.OPPID = OPP.ID AND OLI1a.RSC = '1'
    LEFT JOIN OLI OLI1b ON OLI1b.OPPID = OPP.ID AND OLI1b.RSC = '2'
    LEFT JOIN OLI OLI2a ON OLI2a.OPPID = OPP.ID AND OLI2a.RSC = '1'
    LEFT JOIN OLI OLI2b ON OLI2b.OPPID = OPP.ID AND OLI2b.RSC = '2'
    WHERE OPP.ID = '006i000000itALDAA2'

    If you do that, you'll notice there are 25 rows in the resultingset, because your second and fourth joins both produce 4 rows, and with no other join critera you've got partial cross-joins and so more rows than you would probably have expected

    Now I can't tell what's wrong with the join, I don't know your system. Given your expected results, I suspect there should be only 4 rows before we aggregate. Is that correct?

    Edit: And what are the 3rd and 4th joins supposed to do? They're identical to the first two joins, so they're just going to create more rows.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • On closer look, it's a case of needing to aggregate before the join, not afterwards.


    SELECT OPP.ID,
       OPP.NAME,
       OPP.AMOUNT,
       OPP.ATRC1,
       OPP.ATRC2,
       OPP.TRC1,
       OPP.TRC2,
       AR1,
       AR2,
       TR1,
       TR2
    FROM  OPP
       LEFT JOIN (SELECT SUM(CASE RSC WHEN 1 THEN OLIAMOUNT ELSE 0 END) AS AR1,
              SUM(CASE RSC WHEN 2 THEN OLIAMOUNT ELSE 0 END) AS AR2,
              SUM(CASE RSC WHEN 1 THEN TOTAL_AMOUNT ELSE 0 END) AS TR1,
              SUM(CASE RSC WHEN 2 THEN TOTAL_AMOUNT ELSE 0 END) AS TR2,
              OPPID
           FROM  OLI
           GROUP BY OPPID
          ) Oli12 ON OPP.ID = Oli12.OppID
    WHERE OPP.ID = '006i000000itALDAA2';

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thank you soooooooooo much Gail!

    The output is perfect.  I will compare to what I did wrong and then recreate my original query in the replicated database I have access to and this will allow me to identify any data issues caused by new production code.

    I learned something valuable today thanks to you!

    ***********************************************************
    Edit: Further note.

    I have successfully executed the full test SQL against production data, so again, thanks so much.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply