October 18, 2011 at 7:34 am
I have code to create cummulative percents. Unfortunately, it is running extrememly slowly. Here is the code:
declare @table as table (name varchar(15), sales numeric (5,0))
Insert into @table (name, sales)
Values ('John', 10),
('Jennifer', 15),
('Stella', 20),
('Sophia', 40),
('Greg',50),
('Jeff', 20);
SELECT a1.name, a1.sales, SUM(a2.sales)/(SELECT SUM(sales) from @table) as PctToTotal
FROM @table a1, @table a2
WHERE a1.sales <= a2.sales or (a1.sales=a2.sales and a1.name =a2.name)
group by a1.name, a1.sales
order by a1.sales desc, a1.name desc
This runs fine for small tables, but when I switched to actual work tables (~ 500,000 rows) it is way too slow. Going on several hours now. Anyone know a better/faster formula?
Thank you, Amy
October 18, 2011 at 7:54 am
Test your code, with this variation. Instead of using a TABLE VARIABLE (@Table) use a Temp Table (#Table)
Look at the QOD posted on 10/03/2011 at
http://www.sqlservercentral.com/Forums/Post.aspx?SessionID=mqk3w345ndswzpnnc0a4njj4
October 18, 2011 at 8:01 am
bitbucket-25253 (10/18/2011)
Test your code, with this variation. Instead of using a TABLE VARIABLE (@Table) use a Temp Table (#Table)
Ron, That code is just for example on the forum. The actual code is using a temp table (#table).
October 18, 2011 at 8:15 am
October 18, 2011 at 12:29 pm
Read the article and re-wrote the code. I now know I was trying to do a "triangular join". The performance went from nearly a full day (it never did finish, I put it out of its misery) to 10 seconds. 😀 I'm not sure if you can beat that for performance. Thank you!!!
October 18, 2011 at 12:43 pm
Amy.G (10/18/2011)
Read the article and re-wrote the code. I now know I was trying to do a "triangular join". The performance went from nearly a full day (it never did finish, I put it out of its misery) to 10 seconds. 😀 I'm not sure if you can beat that for performance. Thank you!!!
Did you read the discussion as well? I think there may have been some code changes implemented during the discussion. I haven't read the article lately, so I'm not sure if the rewrite has been rewritten again.
October 18, 2011 at 9:53 pm
Amy.G (10/18/2011)
Read the article and re-wrote the code. I now know I was trying to do a "triangular join". The performance went from nearly a full day (it never did finish, I put it out of its misery) to 10 seconds. 😀 I'm not sure if you can beat that for performance. Thank you!!!
Amy... Lynn is correct. There have been some improvements to the Quirky Update code mostly for added safety. If you'd post the code you ended up with, I'd be happy to do a double check on it for you.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 19, 2011 at 8:37 am
I did see the changes, and hopefully incorporated them. I should go back and read it again though; I was so excited it worked, but realize I don't fully understand it, like what does MAXDOP mean?
But here is the code I used:
--create clustered index for cum counts
CREATE CLUSTERED INDEX IXC_LED_Data
ON #led (ordering_value, instance_count);
--===== Supress the auto-display of rowcounts for speed an appearance
SET NOCOUNT ON
--===== Declare the working variables
DECLARE @PrevInstance int
DECLARE @RunningPercent numeric(20,15)
--===== Update the running total and running percent for this row using the "Quirky
-- Update" and a "Pseudo-cursor". The order of the UPDATE is controlled by the
-- order of the clustered index.
UPDATE #led
SET @RunningPercent = RunningPercent = CASE
WHEN Instance_count = THEN @RunningPercent + Percents
ELSE Percents
END,
@PrevInstance = Instance_count
FROM #led WITH (TABLOCKX)
OPTION (MAXDOP 1)
GO
November 24, 2011 at 2:49 pm
Amy.G (10/19/2011)
I did see the changes, and hopefully incorporated them. I should go back and read it again though; I was so excited it worked, but realize I don't fully understand it, like what does MAXDOP mean?But here is the code I used:
--create clustered index for cum counts
CREATE CLUSTERED INDEX IXC_LED_Data
ON #led (ordering_value, instance_count);
--===== Supress the auto-display of rowcounts for speed an appearance
SET NOCOUNT ON
--===== Declare the working variables
DECLARE @PrevInstance int
DECLARE @RunningPercent numeric(20,15)
--===== Update the running total and running percent for this row using the "Quirky
-- Update" and a "Pseudo-cursor". The order of the UPDATE is controlled by the
-- order of the clustered index.
UPDATE #led
SET @RunningPercent = RunningPercent = CASE
WHEN Instance_count = THEN @RunningPercent + Percents
ELSE Percents
END,
@PrevInstance = Instance_count
FROM #led WITH (TABLOCKX)
OPTION (MAXDOP 1)
GO
Apologies for the very late reply. I lost track of this thread.
You've not incorporated the new "safety counter" but, since you're doing this in a Temp Table, you won't have a problem with this code because you've followed all the other rules.
I invite you to lookup OPTION(MAXDOP 1) in Books Online for additional clarification but, as stated in the article, it prevents parallelism from occuring which could destroy the serial nature of the update. MAXDOP stands for "MAX Degree Of Parallelism".
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply