July 20, 2010 at 4:22 am
The cursor below runs within a dts package and obviously takes quite a while to run over approx 6 million records. Would it be possible to replace the cursor with update scripts? Any ideas to replace the cursor would be appreciated. This is an inherited package running within SQL2000.
SET NOCOUNT ON
DECLARE @income money,
@cost money,
@costAdd money,
@order_id bigint,
@order_value bigint,
@order_line_no bigint,
@promoID int,
@PromoCheck int,
@entCode bigint
DECLARE hhrsales_cursor CURSOR FORWARD_ONLY FOR
SELECT net_income, cost, PL_Order, PL_Txno, PL_Order_Value, PromotionID, PromoCheckID, Entity_Code
FROM tmp_hhrsales2
FOR UPDATE of [Cost]
OPEN hhrsales_cursor
FETCH NEXT FROM hhrsales_cursor
INTO @income, @cost, @order_id, @order_line_no, @order_value, @promoID, @promoCheck, @entCode
set @costAdd = 0
WHILE @@FETCH_STATUS = 0
BEGIN
IF (@order_value <> 0) and (@promoCheck <> 60)
BEGIN
IF @entCode <> 10000001054
BEGIN
IF @income = 0 and @cost <> 0
BEGIN
set @costAdd = @costAdd + @cost
UPDATE tmp_hhrsales2
set cost = 0
WHERE CURRENT OF hhrsales_cursor
insert into log_hhrsales_cos_change
select @order_id, @order_line_no, @cost, 0
END
IF @income <> 0 and @costAdd <> 0
BEGIN
UPDATE tmp_hhrsales2
SET cost = cost + @costAdd
WHERE CURRENT OF hhrsales_cursor
insert into log_hhrsales_cos_change
select @order_id, @order_line_no, @cost, @cost+@costAdd
set @costAdd = 0
END
END
END
-- Get the next record.
FETCH NEXT FROM hhrsales_cursor
INTO @income, @cost, @order_id, @order_line_no, @order_value, @promoID, @promoCheck, @entCode
END
CLOSE hhrsales_cursor
DEALLOCATE hhrsales_cursor
GO
July 20, 2010 at 4:57 am
Have a look at the link below:
July 20, 2010 at 7:18 am
At a glance it looks like the cursor is there to enforce an insert into a logging table. Is there any reason why this could not be done with a SQL trigger?
If you can use a trigger then it would just be a standard update statement - obviously you would need to include all the IF statements into a WHERE clause - which incidently should have been done in the SELECT statement to populate the cursor.
Rich
July 20, 2010 at 7:28 am
Don't know why a trigger or any other option was not used, just inherited the package.
Could you give an example script of your idea pls?
July 20, 2010 at 7:51 am
The UPDATE of the table is a special kind called a Running Totals Update. Jeff Moden has written some excellent and easily consumable articles about them for this site, well worth a read, and you will need to invest a little time on them. Your whole batch could be replaced by an UPDATE and an INSERT ... FROM, something like this:
UPDATE tmp_hhrsales2 SET
@costAdd = CASE
WHEN net_income = 0 and cost <> 0 THEN @costAdd + @cost
WHEN net_income <> 0 and @costAdd <> 0 THEN 0
ELSE @costAdd END,
cost = CASE
WHEN net_income = 0 and cost <> 0 THEN 0
WHEN net_income <> 0 and @costAdd <> 0 THEN cost + @costAdd
ELSE cost END
-- extra columns for log_hhrsales_cos_change: oldcost and newcost
WHERE PL_Order_Value <> 0 and PromoCheckID <> 60 AND Entity_Code <> 10000001054
INSERT INTO log_hhrsales_cos_change
SELECT PL_Order, PL_Txno, oldcost, newcost
FROM tmp_hhrsales2
WHERE PL_Order_Value <> 0 and PromoCheckID <> 60 AND Entity_Code <> 10000001054
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
July 20, 2010 at 10:07 pm
Denesh Naidoo (7/20/2010)
Sorry but absolutely not. It's not the cursor that makes things slow... it's the WHILE LOOP. Converting a cursor to anything that has a WHILE LOOP is a futal effort because a WHILE LOOP is no faster than a well written Forward_Only, Read_Only, Static cursor.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 20, 2010 at 10:17 pm
Chris Morris-439714 (7/20/2010)
The UPDATE of the table is a special kind called a Running Totals Update. Jeff Moden has written some excellent and easily consumable articles about them for this site, well worth a read, and you will need to invest a little time on them. Your whole batch could be replaced by an UPDATE and an INSERT ... FROM, something like this:
UPDATE tmp_hhrsales2 SET
@costAdd = CASE
WHEN net_income = 0 and cost <> 0 THEN @costAdd + @cost
WHEN net_income <> 0 and @costAdd <> 0 THEN 0
ELSE @costAdd END,
cost = CASE
WHEN net_income = 0 and cost <> 0 THEN 0
WHEN net_income <> 0 and @costAdd <> 0 THEN cost + @costAdd
ELSE cost END
-- extra columns for log_hhrsales_cos_change: oldcost and newcost
WHERE PL_Order_Value <> 0 and PromoCheckID <> 60 AND Entity_Code <> 10000001054
INSERT INTO log_hhrsales_cos_change
SELECT PL_Order, PL_Txno, oldcost, newcost
FROM tmp_hhrsales2
WHERE PL_Order_Value <> 0 and PromoCheckID <> 60 AND Entity_Code <> 10000001054
Oh, be careful, Chris...
I might be missing something but I believe you need to make two minor changes to the code to guarantee that nothing interferes with the calculations including parallelism...
UPDATE tmp_hhrsales2 SET
@costAdd = CASE
WHEN net_income = 0 and cost <> 0 THEN @costAdd + @cost
WHEN net_income <> 0 and @costAdd <> 0 THEN 0
ELSE @costAdd END,
cost = CASE
WHEN net_income = 0 and cost <> 0 THEN 0
WHEN net_income <> 0 and @costAdd <> 0 THEN cost + @costAdd
ELSE cost END
-- extra columns for log_hhrsales_cos_change: oldcost and newcost
FROM tmp_hhrsales2 WITH(TABLOCKX) --<<< HERE
WHERE PL_Order_Value <> 0 and PromoCheckID <> 60 AND Entity_Code <> 10000001054
OPTION (MAXDOP 1) --<<< AND HERE
INSERT INTO log_hhrsales_cos_change
SELECT PL_Order, PL_Txno, oldcost, newcost
FROM tmp_hhrsales2
WHERE PL_Order_Value <> 0 and PromoCheckID <> 60 AND Entity_Code <> 10000001054
Also, there needs to be a clustered index on that table in the order that the update is supposed to be executed.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 21, 2010 at 1:56 am
HAve attached the table attributes which includes indexes. Pls let me know your thoughts.
What do the two changes you've made do, how do they affect the data??
July 21, 2010 at 2:24 am
Jeff Moden (7/20/2010)
Chris Morris-439714 (7/20/2010)
The UPDATE of the table is a special kind called a Running Totals Update. Jeff Moden has written some excellent and easily consumable articles about them for this site, well worth a read, and you will need to invest a little time on them. Your whole batch could be replaced by an UPDATE and an INSERT ... FROM, something like this:Oh, be careful, Chris...
I might be missing something but I believe you need to make two minor changes to the code to guarantee that nothing interferes with the calculations including parallelism...
Also, there needs to be a clustered index on that table in the order that the update is supposed to be executed.
Thanks Jeff, you're missing nothing, those are important points and it does no harm to reiterate them here.
The OP's query used to build the result set on which the cursor operates has no WHERE clause and yet it's a running totals update with partitions where variables are reset to zero. Even if there is a clustered index on the table in the order that the update is supposed to be executed, there's no guarantee that the rows will be FETCHed in the correct order...
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
July 21, 2010 at 5:37 am
LadyG (7/21/2010)
HAve attached the table attributes which includes indexes. Pls let me know your thoughts.What do the two changes you've made do, how do they affect the data??
The changes I've proposed will do the same as the existing code but considerably faster. The code is unlikely to work correctly as it stands - there's no sample data to test against - you will need to amend and test it yourself. Jeff's article[/url] will guide you in the right direction. Notice where he's put comments into my code in his post - these are important, you will find a full explanation in the article.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
July 22, 2010 at 1:36 am
Hi Chris,
Does this means I will have to introduce a new column, newcost, to the tmp_hhrsales2 table?
July 22, 2010 at 2:06 am
LadyG (7/22/2010)
Hi Chris,Does this means I will have to introduce a new column, newcost, to the tmp_hhrsales2 table?
You will need at least one new column, probably oldcost. If you can describe the table as being offline from the production tables for the purposes of performing this update, then it shouldn't be an issue.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
July 22, 2010 at 4:24 am
Hi Chris,
Script doesn't quite work. I'm not getting the values expects, getting 0s and nulls. What is wrong?? Have attached attributes for both tables.
SET NOCOUNT ON
DECLARE @income money,
@cost money,
@costAdd money,
@order_id bigint,
@order_value bigint,
@order_line_no bigint,
@promoID int,
@PromoCheck int,
@entCode bigint,
@new_cost money
UPDATE tmp_hhrsales2 SET
@costAdd = CASE
WHEN net_income = 0 and cost <> 0 THEN @costAdd + @cost
WHEN net_income <> 0 and @costAdd <> 0 THEN 0
ELSE @costAdd END,
cost = CASE
WHEN net_income = 0 and cost <> 0 THEN 0
WHEN net_income <> 0 and @costAdd <> 0 THEN cost + @costAdd
ELSE cost END
-- extra columns for log_hhrsales_cos_change: oldcost and newcost
FROM tmp_hhrsales2 WITH(TABLOCKX)
WHERE PL_Order_Value <> 0 and PromoCheckID <> 60 AND Entity_Code <> 10000001054
OPTION (MAXDOP 1)
INSERT INTO log_hhrsales_cos_change
SELECT PL_Order, PL_Txno, oldcost, new_cost
FROM tmp_hhrsales2
WHERE PL_Order_Value <> 0 and PromoCheckID <> 60 AND Entity_Code <> 10000001054
July 22, 2010 at 4:30 am
In the absence of scripts to create the tables and populate them with data, it's very difficult to tell.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
July 22, 2010 at 4:39 am
The problem is with the last part of the script, insert & select. Obvioulsy, as new_cost doesn't exist in the sales table, the calculated data needs to go somewhere in order to extract into the log table. Or am I completely on the wrong train of thought?
Viewing 15 posts - 1 through 15 (of 32 total)
You must be logged in to reply to this topic. Login to reply