summing data from two tables

  • My database has two tables. I need to sum two particular fields from each table where there are a number of conditions.

    That is I need to sum MOUs for all of a particular type of MSC and ECP.

    For example, there is an MSC in one table, DataNortel, and an ECP in another table, DataLucent.

    So for a particular day or days, I need to sum the MOUs for both entities. I’ve been working for several days on trying to come up with a statement.

    There should be the same hours and days for both entities so I just put one of the date fields in the select date part of the statement.

    In this statement I’m only trying to sum MSC 8 and ECP 2 - minutes of use.

    SELECT DataNortel.Date AS Date, (SUM(DataNortel.MOUs) + SUM(DataLucent.MOUs)) AS MOUs

    FROM DataNortel,

    DataLucent

    WHERE(DataNortel.Date >= '10/22/02') AND (DataNortel.Date < '10/23/02') AND (DataNortel.SID = 4384) AND (DataNortel.MSC = 8) AND (DataLucent.ECP = 2) AND (DataLucent.Date >= '10/22/02') AND (DataLucent.Date < '10/23/02') AND (DataLucent.SID = 4376)

    GROUP BY DataNortel.Date

    My various statements always return one of two conditions: either nothing or everything (all entities MOUs summed in some kind of cartesian join).

    Any suggestions would be much appreciated.

  • I'd use a different join

    from DataNortel dn

    inner join DataLucent dl

    on dl.ecp = 2

    and dl.sid=4376

    and dn.msc = 8

    and dn.sid=4384

    where dn.date ...

    Try that. Can you get the correct sums separately?

    Steve Jones

    sjones@sqlservercentral.com

    http://www.sqlservercentral.com/columnists/sjones

  • Yes, thanks, that basically works.

    One problem though:

    Using

    SELECT DISTINCT dn.[Date] AS Date, SUM(dn.MOUs) + SUM(dl.MOUs) AS MOUs, (100 * (SUM(dn.TotBlks) + SUM(dl.TotBlks))) / (SUM dn.TotalAtt) + SUM(dl.TotalAtt)) AS TotBlksPct, (100 * (SUM(dn.TotalDrops) + SUM(dl.TotalDrops))) / (SUM(dn.EstCalls) + SUM(dl.EstCalls)) AS TotalDropsPct FROM DataNortel dn INNER JOIN DataLucent dl ON dn.MSC = 8 OR dn.MSC = 34 AND dn.SID = 4384 AND dl.ECP = 6 AND dl.[Date] = '11/05/02'

    WHERE (dn.[Date] >= '11/05/02') AND (dn.[Date] < '11/06/02')GROUP BY dn.[Date]

    The problem is that is sums dl.ECP twice. So the output sums dn.msc 34 + dl.ecp =6 + dn.msc 8 + dl.ecp =6.

    dl.ecp = 6 MOUs are summed twice.

    If I replace the OR with an AND then nothing is returned. How do I get around this?

    quote:


    I'd use a different join

    from DataNortel dn

    inner join DataLucent dl

    on dl.ecp = 2

    and dl.sid=4376

    and dn.msc = 8

    and dn.sid=4384

    where dn.date ...

    Try that. Can you get the correct sums separately?

    Steve Jones

    sjones@sqlservercentral.com

    http://www.sqlservercentral.com/columnists/sjones


  • Backup. Your first query would result in a cartesian table as a result of a full join. You have no information explaining how the tables themselves relate, this is why the results of the INNER JOIN would be off as well.

    What field is common to DataNortel and DataLucent that would be the basis for the join.

    Once you decied that then you can build you query sorta like this

    SELECT .....

    FROM

    DataNortel DN

    INNER JOIN

    DataLucient DL

    ON

    DN.commonfield = DL.commonfield --The field on how they relate

    WHERE

    ....

    GROUP BY

    ....

  • Using the statement in this way still results in the cartesian problem. I added statments to sum the ecp and msc numbers for troubleshooting and with this statement the mscs add to each other (34 + 8 = 42) but the ECP adds to itself twice (6+6=12).

    SELECT dn.[Date] AS Date, SUM(dn.MOUs) + SUM(dl.MOUs) AS MOUs, (100 * (SUM(dn.TotBlks) + SUM(dl.TotBlks))) / (SUM(dn.TotalAtt) + SUM(dl.TotalAtt)) AS TotBlksPct, (100 * (SUM(dn.TotalDrops) + SUM(dl.TotalDrops))) / (SUM(dn.EstCalls) + SUM(dl.EstCalls)) AS TotalDropsPct, SUM(dn.MSC)AS SUMMSC, SUM(dl.ECP) AS sumecp

    FROM DataNortel dn INNER JOIN DataLucent dl ON dl.[Date] = dn.[Date]

    WHERE (dn.[Date] >= '11/05/02') AND (dn.[Date] < '11/06/02') AND (dn.MSC = 8 OR

    dn.MSC = 34) AND (dl.ECP = 6)

    GROUP BY dn.[Date]

  • Well, I figured out one way to do it. Divide the second return by two, sum it and subtract from the total.

    I can't help but think there's a better way. ..

    SELECT dn.[Date], SUM(dn.MOUs) + SUM(dl.MOUs) - SUM(dl.MOUs / 2) AS div2

    FROM DataNortel dn INNER JOIN

    DataLucent dl ON dn.[Date] = dl.[Date]

    WHERE(dn.[Date] >= '11/05/02') AND (dn.[Date] < '11/06/02') AND (dl.ECP = 6)

    GROUP BY dn.[Date]

  • Hey wireless, I got your email, I don't think the problem is in the queries directly, I believe your missing a key to the join. You must have multiple records for the dates and it is a matter of find the right key to join. Can you post the DDL (Create Table Stmt) for the table to give us a better idea of what we are working against?

  • I've simplified the issue down to a couple of tables:

    CREATE TABLE dbo.Dc(c datetime, c1 int, c2 int)

    INSERT

    INTO dbo.Dc (c, c1, c2)

    VALUES ('11 / 01 / 02', 1, 3)

    INSERT

    INTO dbo.Dc(c, c1, c2)

    VALUES ('11 / 02 / 02', 1, 4)

    INSERT

    INTO dbo.Dc(c, c1, c2)

    VALUES ('11 / 03 / 02', 1, 2)

    INSERT

    INTO dbo.Dc(c, c1, c2)

    VALUES ('11 / 04 / 02', 1, 5)

    INSERT

    INTO dbo.Dc(c, c1, c2)

    VALUES ('11 / 01 / 02', 2, 2)

    INSERT

    INTO dbo.Dc(c, c1, c2)

    VALUES ('11 / 02 / 02', 2, 6)

    INSERT

    INTO dbo.Dc(c, c1, c2)

    VALUES ('11 / 03 / 02', 2, 4)

    INSERT

    INTO dbo.Dc(c, c1, c2)

    VALUES ('11 / 04 / 02', 2, 5)

    continued...

  • This is the second table and select statement:

    CREATE TABLE dbo.Dc2(c datetime, c1 int, c2 int)

    INSERT

    INTO dbo.Dc2 (c, c1, c2)

    VALUES ('11 / 01 / 02', 12, 3)

    INSERT

    INTO dbo.Dc2(c, c1, c2)

    VALUES ('11 / 02 / 02', 12, 4)

    INSERT

    INTO dbo.Dc2(c, c1, c2)

    VALUES ('11 / 03 / 02', 12, 2)

    INSERT

    INTO dbo.Dc2(c, c1, c2)

    VALUES ('11 / 04 / 02', 12, 5)

    INSERT

    INTO dbo.Dc2(c, c1, c2)

    VALUES ('11 / 01 / 02', 4, 2)

    INSERT

    INTO dbo.Dc2(c, c1, c2)

    VALUES ('11 / 02 / 02', 4, 6)

    INSERT

    INTO dbo.Dc2(c, c1, c2)

    VALUES ('11 / 03 / 02', 4, 4)

    INSERT

    INTO dbo.Dc2(c, c1, c2)

    VALUES ('11 / 04 / 02', 4, 5)

    INSERT

    INTO dbo.Dc2(c, c1, c2)

    VALUES ('11 / 01 / 02', 6, 2)

    INSERT

    INTO dbo.Dc2(c, c1, c2)

    VALUES ('11 / 02 / 02', 6, 6)

    INSERT

    INTO dbo.Dc2(c, c1, c2)

    VALUES ('11 / 03 / 02', 6, 4)

    INSERT

    INTO dbo.Dc2(c, c1, c2)

    VALUES ('11 / 04 / 02', 6, 5)

    SELECT *

    FROM dbo.Dc2

    SELECT Dc.c, SUM(Dc.c2) + SUM(Dc2.c2) - SUM(Dc2.c2 / 2) AS c2

    FROM Dc INNER JOIN

    Dc2 ON Dc.c = Dc2.c

    WHERE (Dc.c1 = 1 OR

    Dc.c1 = 2) AND (Dc.c = '11/03/2002') AND (Dc2.c1

    = 4)

    GROUP BY Dc.c

    continued...

  • The above should create these tables:

    Dc1

    cc1c2

    11/0113

    11/0214

    11/0312

    11/0415

    11/0122

    11/0226

    11/0324

    11/0425

    Dc2

    cc1c2

    11/01123

    11/02124

    11/03122

    11/04125

    11/0142

    11/0246

    11/0344

    11/0445

    11/0162

    11/0266

    11/0364

    11/0465

    The select statement produces 10.

    You see how I'm subtracting off the second result from Dc2 because it

    is joined twice with the values in Dc.

    Would a union be better to use on this instead of the inner join?

    The only advantage I can see is, I think, that the union would return

    data whether one table or the other were missing data. That is, the inner join equivalent condition Dc.c = Dc2.c would not need to be satisfied to return values.

    -David

  • Would something like the following statement work in this case…

    SELECT fields, SUM(fields)

    FROM

    (SELECT fields FROM table_1 WHERE condition

    UNION ALL

    SELECT fields FROM table_2 WHERE condition)

    GROUP BY fields

  • From the example, it seems like the dates have no time included.

    If you are just interested in a single date:

    declare @NortelTot INTEGER, @LucentTot INTEGER (check datatype)

    SELECT @NortelTot = SUM(MOUs)

    FROM DataNortel (NOLOCK)

    WHERE(Date >= '10/22/02') AND (Date < '10/23/02') AND (SID = 4384) AND (MSC = 8)

    SELECT @LucentTot = SUM(MOUs)

    FROM DataLucent (NOLOCK)

    WHERE (ECP = 2) AND (Date >= '10/22/02') AND (Date < '10/23/02') AND (SID = 4376)

    SELECT @NortelTot + @LucentTot 'TablesTotal'

    For multiple dates, I would probably create a temporary table

    CREATE TABLE #dateTotals (

    TotDate SMALLDATETIME,

    DayTot INTEGER

    )

    and insert from each table

    INSERT INTO #dateTotals

    SELECT Date, SUM(MOUs)

    FROM DataNortel (NOLOCK)

    WHERE(Date >= '10/22/02') AND (Date < '10/23/02') AND (SID = 4384) AND (MSC = 8)

    GROUP BY DATE

    INSERT INTO #dateTotals

    SELECT Date, SUM(MOUs)

    FROM DataLucent (NOLOCK)

    WHERE (ECP = 2) AND (Date >= '10/22/02') AND (Date < '10/23/02') AND (SID = 4376)

    GROUP BY Date

    SELECT TotDate, SUM( DayTot ) 'DayTotal'

    FROM #dateTotals

    GROUP BY TotDate

    Obviously, there are ramifications if times are stored with the dates and also if the output crossed extreme time ranges...OR if there is a real key that joins the tables.

  • You are on the right track. Here is general idea of the solution I implemented.

    This was provided by Erland Sommarskog and I'll just quote his answer:

    " declare @d datetime

    set @d = '20021103'

    select c, sum(Hours)

    from (

    select c, sum(c2) as Hours from Dc

    where Dc.c = @d

    and Dc.c1 IN (1, 2)

    group by Dc.c

    union all

    select c, sum(c2)

    from Dc2

    where Dc2.c = @d

    and Dc2.c1 = 4

    group by Dc2.c

    ) T

    group by c

    Not that we need to use a derived table to sum the values from the

    involved tables. Note also that we must use UNION ALL, as UNION alone

    discards duplicates, so we would get incorrect result if the

    two tables yielded the same sum."

    quote:


    Would something like the following statement work in this case…

    SELECT fields, SUM(fields)

    FROM

    (SELECT fields FROM table_1 WHERE condition

    UNION ALL

    SELECT fields FROM table_2 WHERE condition)

    GROUP BY fields


Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply