Can I improve this procedure with a join?

  • Hi All,

    I have two tables, one which defines reporting periods (with a start and end date for each period), and one with production dates and volumes. I want to output a table which sums the production volumes within each reporting period.

    As per the code below, at the moment I create an intermediate table which associates a report date with the production volume using a subquery, and them I use the SUM function on this intermediate table to get the data that I want. This two step process seems like one step too many, and I thought it could be done in one step using a JOIN, but I can't work it out. My understanding of a join is that it matches values between different tables, but I want to match a value (the production date) to a range of values (the start and end reporting period dates). So is there a way to get the result I want in a single query? I guess the other consideration is that at the moment the two queries are potentially more readable than one large complicated query. I'm all for efficiency and elegance in code, but I also like to balance that with readability and maintainability.

    --table to hold sample reporting period dates.

    DECLARE @reportPeriods TABLE(startDate datetime, endDate datetime)

    INSERT INTO @reportPeriods VALUES('20090101','20090630')

    INSERT INTO @reportPeriods VALUES('20090701','20091231')

    INSERT INTO @reportPeriods VALUES('20100101','20100630')

    --table to hold sample production dates and values.

    DECLARE @productionData TABLE(productionDate datetime, productionVolume int)

    INSERT INTO @productionData VALUES('20090105',100)

    INSERT INTO @productionData VALUES('20090201',150)

    INSERT INTO @productionData VALUES('20090301',100)

    INSERT INTO @productionData VALUES('20090410',100)

    INSERT INTO @productionData VALUES('20090801',100)

    INSERT INTO @productionData VALUES('20091008',150)

    INSERT INTO @productionData VALUES('20100108',120)

    INSERT INTO @productionData VALUES('20100201',50)

    INSERT INTO @productionData VALUES('20100402',80)

    INSERT INTO @productionData VALUES('20100501',100)

    INSERT INTO @productionData VALUES('20100602',80)

    INSERT INTO @productionData VALUES('20100610',100)

    --Step 1: create a table that determines which reporting period the production date falls in,

    -- and associate the production volume with that reporting period

    DECLARE @productionWithReportingDates TABLE (reportPeriodStartDate datetime, productionVolume float)

    INSERT INTO @productionWithReportingDates

    SELECT (SELECT StartDate

    FROM @reportPeriods

    WHERE productionDate >= StartDate AND

    productionDate < EndDate),

    productionVolume

    FROM @productionData

    SELECT * FROM @productionWithReportingDates

    --Step 2: sum the production values within each reporting period

    DECLARE @productionTotals TABLE(startDate datetime, totalVolume int)

    INSERT INTO @productionTotals

    SELECT min(reportPeriodStartDate), sum(productionVolume)

    FROM @productionWithReportingDates

    GROUP BY reportPeriodStartDate

    ORDER BY min(reportPeriodStartDate)

    SELECT * FROM @productionTotals

  • Hi,

    try this statement. I joint both tables on the dates.

    SELECT startdate, enddate, SUM(productionVolume) AS Volume

    FROM @reportPeriods

    INNER JOIN @productionData ON startdate <= productionDate AND enddate >= productionDate

    GROUP BY startdate, enddate

    Regards,

    Thomas

  • Slightly different to Thomas' solution:

    SELECT rp.StartDate, SUM(productionVolume) AS totalVolume

    FROM @productionData pd

    INNER JOIN @reportPeriods rp

    ON pd.productionDate >= rp.StartDate

    AND pd.productionDate < rp.EndDate

    GROUP BY rp.StartDate

    ORDER BY rp.StartDate

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thomas and Chris,

    Thanks for your replies. I didn't know you could use inequalities and logical operators in a join statement, although now that I think about it some more it seems perfectly obvious.

    John.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply