May 23, 2009 at 2:37 am
I have table
id
10
20
30
40
I want the result like
id
10
30
60
100
through query.
Pls help me to get result.
Regards,
Sunil
Sql DBA
May 23, 2009 at 2:53 am
You can use a inline query:
DECLARE @t TABLE (Id INT)
INSERT INTO @t
SELECT 10
UNION ALL SELECT 20
UNION ALL SELECT 30
UNION ALL SELECT 40
SELECT
(SELECT SUM(Id) FROM @t WHERE Id <= t.Id)
FROM @t t
May 23, 2009 at 3:02 am
Thanks a lot.
May 23, 2009 at 1:22 pm
That's a "running total" problem. Inline queries are ok for only the smallest of collections. See the following for why...
http://www.sqlservercentral.com/articles/T-SQL/61539/
If you have more than small collections of data to produce the running total for, post back and I'll introduce you to the quirky update method.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 23, 2009 at 1:26 pm
Good point Jeff. If there are more than four rows 😀 an inline query gets a bad performance.
One question:
Did you figure out if a cross join performs better than an inner join in this case?
May 23, 2009 at 2:35 pm
Florian Reischl (5/23/2009)
Good point Jeff. If there are more than four rows 😀 an inline query gets a bad performance.One question:
Did you figure out if a cross join performs better than an inner join in this case?
Heh... the problem is that we just don't know how many rows the OP will use this for nor how many some other person who reads this thread may use it for. 😉
Actually, if you look at the execution plan, it gets comparatively bad performance even on 4 rows. It has to "touch" ((N2+N)/2)+N rows or 14 rows just to process 4. Neither an INNER JOIN nor a CROSS JOIN nor even a CROSS APPLY is capable of processing this type of query with any worth while real performance. It must necessarily be procedural code in order to process the running total with any real performance.
There are two ways to process this code with performance and both are procedural in nature... either a Cursor/While Loop (same as just using a While Loop if the Cursor is properly declared) or the quirky update. Let's try an experiment. First, a small table of only 10,000 rows of test data...
[font="Courier New"]--DROP TABLE #JBMTest
GO
--===== Create and populate a 10,000 row test table.
-- Column "RowNum" has a range of 1 to 100,000 unique numbers
-- Column "SomeInt" has a range of 1 to 50,000 non-unique numbers
-- Jeff Moden
SELECT TOP 10000
SomeID = IDENTITY(INT,1,1),
SomeInt = ABS(CHECKSUM(NEWID()))%50000+1
INTO #JBMTest
FROM Master.dbo.SysColumns t1,
Master.dbo.SysColumns t2 --Lack of join criteria makes this a CROSS-JOIN
--===== A table is not properly formed unless a Primary Key has been assigned
-- Takes about 1 second to execute.
ALTER TABLE #JBMTest
ADD PRIMARY KEY CLUSTERED (SomeID)[/font]
Now, the inline code to solve the running total problem...
[font="Courier New"]--===== Solve the running total problem using inline code.
-- This is what is known as a "Triangular Join"
SELECT SomeID, SomeInt,
(SELECT SUM(t2.SomeInt) FROM #JBMTest t2 WHERE t2.SomeID <= t1.SomeID) AS RunningTotal
FROM #JBMTest t1
ORDER BY t1.SomeID[/font]
That relatively simple code produces 50,015,000 internal rows ((10,0002+10,000)/2)+10,000 and that's why it takes more than 41 seconds to run on my humble desktop computer. Worse yet, it took more than 37 CPU seconds and more than 130 thousand Reads.
Let's do it with the quirky update:
[font="Courier New"]--===== Conditionally drop the temporary work table
IF OBJECT_ID('TempDB..#MyWork') IS NOT NULL
DROP TABLE #MyWork
--===== Create and populate the work table on the fly
SELECT SomeID,
CAST(SomeInt AS BIGINT) AS SomeInt,
CAST(0 AS BIGINT) AS RunningTotal
INTO #MyWork
FROM #JBMTest
ORDER BY SomeID
--===== Add the correct clustered index for this problem
ALTER TABLE #MyWork
ADD PRIMARY KEY CLUSTERED (SomeID)
--===== Declare a couple of obviously named variables
DECLARE @RunningTotal BIGINT,
@Anchor INT
SELECT @RunningTotal = 0
--===== The procedural code in the form of the quirky update
-- to calculate the running total in the work table.
UPDATE #MyWork
SET @RunningTotal = RunningTotal = SomeInt + @RunningTotal,
@Anchor = SomeID
FROM #MyWork WITH(INDEX(0),TABLOCKX)
--===== Display the results in the correct order
SELECT *
FROM #MyWork
ORDER BY SomeID
[/font]
That little slice of computational heaven took only 629 milliseconds duration, 172 milliseconds of CPU time, and only 702 Reads.
If you up the ante to a million rows, it'll take hours, maybe even days for the inline code to run because it will generate 500,001,500,000 (500 BILLION) internal rows where the quirky update runs in only 54.5 seconds [font="Arial Black"]including [/font]the time it takes to display the million row result. Without the display, it only takes 31 seconds to execute.
Since the inline query increases the number of rows it has to touch internally in a logrithmic fashion, the inline code will only get much worse as the rowcounts climb. For example, at the 10,000 row level, the inline code took about 41 seconds. At the 20,000 row level, it takes ... (still waiting... I'll get back to this in a couple of minutes...)
--Jeff Moden
Change is inevitable... Change for the better is not.
May 23, 2009 at 2:55 pm
The inline code jumped from 41 seconds at 10,000 rows to 170 seconds for 20,000 rows. In other words, the amount of time increases by a factor of 4 everytime the rowcount doubles.
The quirky update is virtually linear until it reaches the "tipping point" (somewhere in the millions of rows) of each computer.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 23, 2009 at 3:11 pm
Great job Jeff!! I didn't think about a quirky update in this case. Also very clean to use an tab-lock and an index hint to ensure the order!
Maybe it's time for your 5 bugs 😀
Just in respect to the completeness, here the INNER JOIN and CROSS JOIN solutions I spoke about. Performance of both is about 5 seconds duration, 17 seconds CPU and 130,132 logical reads.
[font="Courier New"]
--SET STATISTICS IO ON
--SET STATISTICS TIME ON
---============================================
-- Using an INNER JOIN to calculate the totals
SELECT
t1.SomeID,
t1.SomeInt,
SUM(t2.SomeInt)
FROM #JBMTest t1
JOIN #JBMTest t2 ON t1.SomeID >= t2.SomeID
GROUP BY
t1.SomeID,
t1.SomeInt
--SET STATISTICS TIME OFF
--SET STATISTICS IO OFF
GO
--SET STATISTICS IO ON
--SET STATISTICS TIME ON
---============================================
-- Using a CROSS JOIN to calculate the totals
SELECT
t1.SomeID,
t1.SomeInt,
SUM(t2.SomeInt)
FROM #JBMTest t1
CROSS JOIN #JBMTest t2
WHERE
t1.SomeID >= t2.SomeID
GROUP BY
t1.SomeID,
t1.SomeInt
--SET STATISTICS TIME OFF
--SET STATISTICS IO OFF
[/font]
By the way
The inline query took 41 seconds on your system?? Do you use an old Atari? :laugh:
10,000 rows took about 12 seconds on my computer.
Greets
Flo
May 23, 2009 at 7:18 pm
Very cool, Flo. We're getting all of the running total methods out in the open. Care to write a CLR for this bad boy?
The INNER JOIN method, which is still a triangular join, took 72 seconds on 10,000 rows. It spawned an identical number of internal rows as the inline method. The part that took even longer was that all those rows made it to loop that combines the rows from the SUM with the base rows.
Because of the criteria used, the CROSS JOIN method has an actual execution plan identical to that of the Inner Join method except that it took 80 seconds on 10,000 rows.
Both methods will suffer the same exponential growth of internal rows as the Inline Method which is really a correlated sub-query.
Speaking of correlated sub-queries, I also tried the CROSS APPLY method with the same results as the Inline Method because there's virtually no difference between the two methods... they're both correlated sub-queries. The execution plans are quite differenent but the number of internal rows spawned are identical.
Here's the code for the CROSS APPLY...
[font="Courier New"] SELECT t1.SomeID,
t1.SomeInt,
ca.RunningTotal
FROM #JBMTest t1
CROSS APPLY (SELECT SUM(t2.SomeInt) AS RunningTotal
FROM #JBMTest t2
WHERE t1.SomeID >= t2.SomeID) ca[/font]
And, heh... yeah... all of these prove that it's getting to be time to replace my tried and trusted 7 year old war club. It's a P4 single 1.8Ghz CPU with 7 year old disk technology and only a Gig of RAM. The really cool part about it is, if I write code that's fast on my machine, it absolutely flies on any servers my customers may have.
I might as well write the cursor solution for this 10k row problem to round out all the solutions, as well. I have to find my mouth wash, first. 😛
--Jeff Moden
Change is inevitable... Change for the better is not.
May 23, 2009 at 7:31 pm
Florian Reischl (5/23/2009)
Also very clean to use an tab-lock and an index hint to ensure the order!
Just so long as no one thinks they can do that with a non-clustered index, we'll be all set. It's mostly for looks because updates always take the path of the clustered index even if the columns in the update have nothing to do with the clustered index.
Maybe it's time for your 5 bugs 😀
[/quote]
Ok... I'll bite. What do you mean here? I blame it on failing memory and I'm not talking about my 7 year old computer. 😀
--Jeff Moden
Change is inevitable... Change for the better is not.
May 24, 2009 at 9:03 am
Jeff Moden (5/23/2009)
Very cool, Flo. We're getting all of the running total methods out in the open.
Yep. You know I always like to compare all solutions. Just like you.
Care to write a CLR for this bad boy?
I don't think that CLR has any chance in this context. There is currently another thread where CLR beats T-SQL here. Join if you are interested 🙂
Yeah, all the other solutions (INNER JOIN/CROSS JOIN/CROSS APPLY/Inline query) seem to have the same performance problem with exponential resource requirements. But a CURSOR (well a WHILE loop, I almost never use a native SQL cursor) seems to work linear and quiet fast in this case. Sure it's still much slower as your quirky update, but it beats out all the others.
Here the cursor approach I tried:
[font="Courier New"]---============ Cursor ======================
SET NOCOUNT ON
---====================================================
-- Create a result table
IF (OBJECT_ID('tempdb..#CursorResult') IS NOT NULL)
DROP TABLE #CursorResult
CREATE TABLE #CursorResult
(
SomeInt INT NOT NULL
PRIMARY KEY CLUSTERED,
RunningTotal BIGINT
)
---=====================================================
-- Some variables for the current Id and the running totals
DECLARE @SomeId INT
DECLARE @RunningTotal BIGINT
SELECT
@SomeId = 0,
@RunningTotal = 0
---======================================================
-- Loop through all rows of the source table to calculate the totals
WHILE (1 = 1) -- Break will be done inside the loop
BEGIN
SELECT TOP(1)
@SomeId = SomeId,
@RunningTotal = @RunningTotal + SomeInt
FROM #JBMTest
WHERE SomeId > @SomeId
ORDER BY SomeId
IF (@@ROWCOUNT = 0)
BREAK
INSERT INTO #CursorResult
SELECT @SomeId, @RunningTotal
END
---====================================================
-- Client side output just like you, Jeff
SELECT * FROM #CursorResult
[/font]
And, heh... yeah... all of these prove that it's getting to be time to replace my tried and trusted 7 year old war club. It's a P4 single 1.8Ghz CPU with 7 year old disk technology and only a Gig of RAM.
7 years old PC/laptop? I think it's time to say goodbye. I'm sure you both had good times and bad times but sometimes you have to let loose :-D. I currently ride a 2,4 GHz Quad with 8 GB RAM which was really cheap. It's a desktop, no laptop.
The really cool part about it is, if I write code that's fast on my machine, it absolutely flies on any servers my customers may have.
Er.. I know that statement - from my previous chief - what makes me a bit scary :w00t:
Best wishes
Flo
May 24, 2009 at 9:08 am
Jeff Moden (5/23/2009)
Florian Reischl (5/23/2009)
Maybe it's time for your 5 bugs 😀Ok... I'll bite. What do you mean here? I blame it on failing memory and I'm not talking about my 7 year old computer. 😀
Start at your post and follow the answers:
http://www.sqlservercentral.com/Forums/FindPost718718.aspx
😉
May 24, 2009 at 11:02 am
Florian Reischl (5/24/2009)
Jeff Moden (5/23/2009)
Florian Reischl (5/23/2009)
Maybe it's time for your 5 bugs 😀Ok... I'll bite. What do you mean here? I blame it on failing memory and I'm not talking about my 7 year old computer. 😀
Start at your post and follow the answers:
http://www.sqlservercentral.com/Forums/FindPost718718.aspx
😉
BWAA-HAAA!!! NOT 5 Bugs, 5 BUCKS! Guess we're even, huh? 😛
--Jeff Moden
Change is inevitable... Change for the better is not.
May 24, 2009 at 11:10 am
Jeff Moden (5/24/2009)
Florian Reischl (5/24/2009)
Jeff Moden (5/23/2009)
Florian Reischl (5/23/2009)
Maybe it's time for your 5 bugs 😀Ok... I'll bite. What do you mean here? I blame it on failing memory and I'm not talking about my 7 year old computer. 😀
Start at your post and follow the answers:
http://www.sqlservercentral.com/Forums/FindPost718718.aspx
😉
BWAA-HAAA!!! NOT 5 Bugs, 5 BUCKS! Guess we're even, huh? 😛
ROFL!!!
I have to increase my English!
:laugh:
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply