October 23, 2008 at 6:08 am
I have a table called PROJECTITEMISEDGRANT which would look something like this
ID ProjectPeriodID, ItemisedGrantID, Amount
01 1660 123 700
02 1660 456 150
03 1600 789 230
04 1661 123 400
05 1661 456 150
06 1601 789 230
07 1663 123 200
08 1663 456 150
09 1603 789 230
SELECT ID, projectPeriodID, ItemisedGrantID, Amount FROM PROJECTITEMISEDGRANT
SELECT * FROM PROJECTITEMISEDGRANT where itemisedGrantID = 901 and ProjectPeriodID < 1661
I am trying to construct a query that will allow me to see the Sales amount for each itemedgrantid prior to the record i am on
ie if I was on record 7, i want to be able to see id, projectPeriodid, sales to date (the sum Amount for the current Itemisedgrant '123' with an id less than 7) then the current amount
so it would return something like this
ID ProjectPeriodID, ItemisedGrantID, Amount to date (not including this row) Amount
id 1663 123 1100 200
i think It is a corrolated sub-query that I need, but i have been trying for hours and cant seem to get my head round it, I would appreciate any help or pointers.
Cheers
October 23, 2008 at 6:21 am
[EDIT]
This was wrong. I was half asleep when I initially wrote it, and I get lazy when people don't provide DDL / Sample data. The following should work fine and is a set based solution using a derived table. If your data set is extremely small, you may not notice any difference between this and a correlated subquery, but as the data size grows, a correlated subquery is going to have more and more problems.
[font="Courier New"]CREATE TABLE #temp(
ID INT,
ProjectPeriodID INT,
ItemisedGrantID INT,
Amount INT)
INSERT INTO #temp(ID, ProjectPeriodID, ItemisedGrantID, Amount)
SELECT 01, 1660, 123, 700 UNION ALL
SELECT 02, 1660, 456, 150 UNION ALL
SELECT 03, 1600, 789, 230 UNION ALL
SELECT 04, 1661, 123, 400 UNION ALL
SELECT 05, 1661, 456, 150 UNION ALL
SELECT 06, 1601, 789, 230 UNION ALL
SELECT 07, 1663, 123, 200 UNION ALL
SELECT 08, 1663, 456, 150 UNION ALL
SELECT 09, 1603, 789, 230
SELECT P1.ProjectPeriodID, P1.ItemisedGrantID, P2.Total [Amount to Date], P1.Amount
FROM #temp P1
LEFT JOIN (SELECT ItemisedGrantID, SUM(Amount) Total FROM #temp WHERE ID < 7 GROUP BY ItemisedGrantID) P2 ON P1.ItemisedGrantID = P2.ItemisedGrantID
WHERE P1.ItemisedGrantID = '123' AND P1.ID = 7
-- OR --
SELECT P1.ID, P1.ProjectPeriodID, P1.ItemisedGrantID, P2.Total [Amount to Date], P1.Amount
FROM #temp P1
LEFT JOIN (SELECT ItemisedGrantID, SUM(Amount) Total FROM #temp WHERE ProjectPeriodID < 1663 GROUP BY ItemisedGrantID) P2 ON P1.ItemisedGrantID = P2.ItemisedGrantID
WHERE P1.ItemisedGrantID = '123' AND P1.ProjectPeriodID = 1663
[/font]
October 24, 2008 at 8:59 am
A correlated subquery is the way to go. They can be very useful.
select
ID
, ProjectPeriodID
, ItemisedGrantID
, Amount
, ISNULL (
(
-- This does the correlated sub query
SELECT sum (amount)
from ProjectPeriod c
where c.ItemisedGrantID = p.ItemisedGrantID
--This gets lower ids only
and c.id < p.id
)
,0)
from ProjectPeriod p
October 25, 2008 at 8:08 pm
Jim Nesbitt (10/24/2008)
A correlated subquery is the way to go. They can be very useful.select
ID
, ProjectPeriodID
, ItemisedGrantID
, Amount
, ISNULL (
(
-- This does the correlated sub query
SELECT sum (amount)
from ProjectPeriod c
where c.ItemisedGrantID = p.ItemisedGrantID
--This gets lower ids only
and c.id < p.id
)
,0)
from ProjectPeriod p
Correlated subqueries with inequalities can be a form of "Death by SQL". See the following for why...
http://www.sqlservercentral.com/articles/T-SQL/61539/
Here's an article on a very high speed alternative for doing grouped running totals...
http://www.sqlservercentral.com/articles/Advanced+Querying/61716/
--Jeff Moden
Change is inevitable... Change for the better is not.
October 27, 2008 at 8:46 am
The article referred to by Jeff is extremely interesting, and I will definitely make use of this technique.
However, I don't think a subquery solution is always bad. It would very much depend on how often the query was required and also how often the table was updated.
If the information was accessed occasionally for specific ids or periods, then the sub query wouldn't read the entire table and response should be fine. So no Death by SQL.
The alternate solution requires a permanent field and an update to the entire table. It would also be necessary to update the totals each time a row was changed or modified, and the development effort is higher compared to using the correlated subquery, so this also has to be considered.
I agree that a correlated subquery could be a dog if the information is required for all rows of a very large table, and I'm now hoping I'll get a chance to use the alternative some time - thanks for the heads up.
October 27, 2008 at 9:03 am
The alternate solution requires a permanent field and an update to the entire table. It would also be necessary to update the totals each time a row was changed or modified, and the development effort is higher compared to using the correlated subquery, so this also has to be considered.
Using a derived tables/cte's instead of the correlated subquery gets you the best of both worlds. See my above example. The difference is that a derived table is evaluated once, where as a correlated subquery is evaluated once... per row. That said, if she is only running one ID at a time, the difference is going to be negligible.
October 27, 2008 at 6:52 pm
The problem with even allowing such code to exist is that people will take any shortcut they can without regard to future performance. If they run across such code, they are likely to copy it for use in an application that will scale out a lot more and BOOM! Death by SQL.
If you want people to stop writing performance limited, non-scalable code in your database, stop giving them examples of bad code in your database to copy from. 😉 Justifying bad code by saying it's for a "limited number of rows" isn't the right thing to do... ever. 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply