Correlated Query

  • I have a table called PROJECTITEMISEDGRANT which would look something like this

    ID ProjectPeriodID, ItemisedGrantID, Amount

    01 1660 123 700

    02 1660 456 150

    03 1600 789 230

    04 1661 123 400

    05 1661 456 150

    06 1601 789 230

    07 1663 123 200

    08 1663 456 150

    09 1603 789 230

    SELECT ID, projectPeriodID, ItemisedGrantID, Amount FROM PROJECTITEMISEDGRANT

    SELECT * FROM PROJECTITEMISEDGRANT where itemisedGrantID = 901 and ProjectPeriodID < 1661

    I am trying to construct a query that will allow me to see the Sales amount for each itemedgrantid prior to the record i am on

    ie if I was on record 7, i want to be able to see id, projectPeriodid, sales to date (the sum Amount for the current Itemisedgrant '123' with an id less than 7) then the current amount

    so it would return something like this

    ID ProjectPeriodID, ItemisedGrantID, Amount to date (not including this row) Amount

    id 1663 123 1100 200

    i think It is a corrolated sub-query that I need, but i have been trying for hours and cant seem to get my head round it, I would appreciate any help or pointers.

    Cheers

  • [EDIT]

    This was wrong. I was half asleep when I initially wrote it, and I get lazy when people don't provide DDL / Sample data. The following should work fine and is a set based solution using a derived table. If your data set is extremely small, you may not notice any difference between this and a correlated subquery, but as the data size grows, a correlated subquery is going to have more and more problems.

    [font="Courier New"]CREATE TABLE #temp(

    ID             INT,

    ProjectPeriodID    INT,

    ItemisedGrantID    INT,

    Amount         INT)

    INSERT INTO #temp(ID, ProjectPeriodID, ItemisedGrantID, Amount)

    SELECT 01, 1660,                123,                  700 UNION ALL

    SELECT 02, 1660,                456,                  150 UNION ALL

    SELECT 03, 1600,                789,                  230 UNION ALL

    SELECT 04, 1661,                123,                  400 UNION ALL

    SELECT 05, 1661,                456,                  150 UNION ALL

    SELECT 06, 1601,                789,                  230 UNION ALL

    SELECT 07, 1663,                123,                  200 UNION ALL

    SELECT 08, 1663,                456,                  150 UNION ALL

    SELECT 09, 1603,                789,                  230

    SELECT  P1.ProjectPeriodID, P1.ItemisedGrantID, P2.Total [Amount to Date], P1.Amount

    FROM #temp P1  

       LEFT JOIN (SELECT ItemisedGrantID, SUM(Amount) Total FROM #temp WHERE ID < 7 GROUP BY ItemisedGrantID) P2 ON P1.ItemisedGrantID = P2.ItemisedGrantID

    WHERE P1.ItemisedGrantID = '123' AND P1.ID = 7

    -- OR --

    SELECT  P1.ID, P1.ProjectPeriodID, P1.ItemisedGrantID, P2.Total [Amount to Date], P1.Amount

    FROM #temp P1  

       LEFT JOIN (SELECT ItemisedGrantID, SUM(Amount) Total FROM #temp WHERE ProjectPeriodID < 1663 GROUP BY ItemisedGrantID) P2 ON P1.ItemisedGrantID = P2.ItemisedGrantID

    WHERE P1.ItemisedGrantID = '123' AND P1.ProjectPeriodID = 1663

    [/font]

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data[/url] :: Running Totals[/url] :: Tally Table[/url] :: Cross Tabs/Pivots[/url] :: String Concatenation[/url]

  • A correlated subquery is the way to go. They can be very useful.

    select

    ID

    , ProjectPeriodID

    , ItemisedGrantID

    , Amount

    , ISNULL (

    (

    -- This does the correlated sub query

    SELECT sum (amount)

    from ProjectPeriod c

    where c.ItemisedGrantID = p.ItemisedGrantID

    --This gets lower ids only

    and c.id < p.id

    )

    ,0)

    from ProjectPeriod p

  • Jim Nesbitt (10/24/2008)


    A correlated subquery is the way to go. They can be very useful.

    select

    ID

    , ProjectPeriodID

    , ItemisedGrantID

    , Amount

    , ISNULL (

    (

    -- This does the correlated sub query

    SELECT sum (amount)

    from ProjectPeriod c

    where c.ItemisedGrantID = p.ItemisedGrantID

    --This gets lower ids only

    and c.id < p.id

    )

    ,0)

    from ProjectPeriod p

    Correlated subqueries with inequalities can be a form of "Death by SQL". See the following for why...

    http://www.sqlservercentral.com/articles/T-SQL/61539/

    Here's an article on a very high speed alternative for doing grouped running totals...

    http://www.sqlservercentral.com/articles/Advanced+Querying/61716/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The article referred to by Jeff is extremely interesting, and I will definitely make use of this technique.

    However, I don't think a subquery solution is always bad. It would very much depend on how often the query was required and also how often the table was updated.

    If the information was accessed occasionally for specific ids or periods, then the sub query wouldn't read the entire table and response should be fine. So no Death by SQL.

    The alternate solution requires a permanent field and an update to the entire table. It would also be necessary to update the totals each time a row was changed or modified, and the development effort is higher compared to using the correlated subquery, so this also has to be considered.

    I agree that a correlated subquery could be a dog if the information is required for all rows of a very large table, and I'm now hoping I'll get a chance to use the alternative some time - thanks for the heads up.

  • The alternate solution requires a permanent field and an update to the entire table. It would also be necessary to update the totals each time a row was changed or modified, and the development effort is higher compared to using the correlated subquery, so this also has to be considered.

    Using a derived tables/cte's instead of the correlated subquery gets you the best of both worlds. See my above example. The difference is that a derived table is evaluated once, where as a correlated subquery is evaluated once... per row. That said, if she is only running one ID at a time, the difference is going to be negligible.

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data[/url] :: Running Totals[/url] :: Tally Table[/url] :: Cross Tabs/Pivots[/url] :: String Concatenation[/url]

  • The problem with even allowing such code to exist is that people will take any shortcut they can without regard to future performance. If they run across such code, they are likely to copy it for use in an application that will scale out a lot more and BOOM! Death by SQL.

    If you want people to stop writing performance limited, non-scalable code in your database, stop giving them examples of bad code in your database to copy from. 😉 Justifying bad code by saying it's for a "limited number of rows" isn't the right thing to do... ever. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply