March 15, 2005 at 8:16 pm
Hi all,
I'm rewriting legacy code (DataFlex) as stored procedures for SQL Server. I just wrote the following procedure in a straight forward fashion, which requires several updates to the same table. My primary goal is speed of execution, not elegance. Can this be optimized? If so, can you offer some tips on how to rewrite it? I'm still new to T-SQL, and haven't yet figured out how to control a 'SET' statement with an IF or CASE.
Any suggestions are welcome.
==============================================
CREATE PROCEDURE rp_SOD_Update_Media
AS
-- ~~~~ Calculate & Store avg_sale ~~~~~~~~~~~~~
UPDATE MASMEDIA
SET avg_sale = (ytd_sales / num_orders)
WHERE inactive = 'N'
AND num_orders > 0
AND (ytd_sales / num_orders) < 10000 ;
-- -- -- -- -- -- -- -- -- -- -- -- -
UPDATE MASMEDIA
SET avg_sale = 9999.99
WHERE inactive = 'N'
AND num_orders > 0
AND (ytd_sales / num_orders) >= 10000 ;
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-- ~~~~ Calculate & Store avg_profit ~~~~~~~~~~~
UPDATE MASMEDIA
SET avg_profit = (ytd_profit / num_orders)
WHERE inactive = 'N'
AND num_orders > 0
AND (ytd_profit / num_orders) < 10000 ;
-- -- -- -- -- -- -- -- -- -- -- -- -
UPDATE MASMEDIA
SET avg_profit = 9999.99
WHERE inactive = 'N'
AND num_orders > 0
AND (ytd_profit / num_orders) >= 10000 ;
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-- ~~~~ Calculate & Store percent_sales ~~~~~~~~
UPDATE MASMEDIA
SET percent_sales = ((num_orders / qty_mailed) * 100)
WHERE inactive = 'N'
AND media_type = 'P'
AND qty_mailed != 0 --QUESTION: Possible qty_mailed != 0 & still need calc percent_sales?
AND num_orders != qty_mailed
AND ((num_orders / qty_mailed) * 100) < 10000 ;
-- -- -- -- -- -- -- -- -- -- -- -- -
UPDATE MASMEDIA
SET percent_sales = 9999.99
WHERE inactive = 'N'
AND media_type = 'P'
AND qty_mailed != 0
AND num_orders != qty_mailed
AND (num_orders / qty_mailed) >= 10000 ;
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-- ~~~~ Calculate & Store percent_inqu ~~~~~~~~
UPDATE MASMEDIA
SET percent_inqu = ((num_inqu / qty_mailed) * 100)
WHERE inactive = 'N'
AND media_type = 'P'
AND qty_mailed != 0
AND num_inqu != qty_mailed
AND ((num_orders / qty_mailed) * 100) < 10000 ;
-- -- -- -- -- -- -- -- -- -- -- -- -
UPDATE MASMEDIA
SET percent_inqu = 9999.99
WHERE inactive = 'N'
AND media_type = 'P'
AND qty_mailed != 0
AND percent_inqu != qty_mailed
AND (num_orders / qty_mailed) >= 10000 ;
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--EOF
Any suggestions are welcome,
Greg
March 15, 2005 at 8:53 pm
You can easily join updates that have similar where clauses except for a single condition such as the four update statements for avg_sale and avg_profit:
UPDATE MASMEDIA
SET avg_sale = CASE
WHEN (ytd_sales / num_orders) < 10000 THEN (ytd_sales / num_orders)
ELSE 9999.99
END
,avg_profile = CASE
WHEN (ytd_profit / num_orders) < 10000 THEN (ytd_profit / num_orders)
ELSE 9999.99
END
WHERE inactive = 'N'
AND num_orders > 0
It looks like the percent_sales and percent_inqu updates can similarly be combined into a single UPDATE.
The CASE statement is very useful for combining multiple queries in this way. Check out Books Online for the full syntax.
Hope this helps,
Scott Thornburg
March 16, 2005 at 12:35 am
The CASE statement is very useful, indeed, although I don't think it will increase the performance in this particular case.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 16, 2005 at 2:32 am
Jeff,
Given the information available, it's not possible to say whether or not performance will improve. This depends on the method of access of the table MASMEDIA. If the UPDATEs are doing table scans, then the performance should scale by the number of update statements. Cutting these by a factor of 4 should improve the performance by approximately that same factor. I've seen this type of situation many times.
However, without more knowledge of the table counts and indexes, it's difficult to say so for certain. I pose my suggestion as something for Greg to investigate rather than a guaranteed enhancement.
On a more basic note, there may be benefit in indexing num_orders, if non-zero orders are rare (less than ~10%). Also on qty_mailed if non-zero is rare, but you may need to change the != 0 to a >0 as well (unless qty_mailed can be negative).
Scott Thornburg
March 16, 2005 at 12:44 pm
Strange, I posted a reply, where did it go? Here goes again...
**********************************************************
Wow, that's exactly the kind of information I'm looking for. I didn't know a CASE statament could be used that way. I tried a SELECT, but went about it backwards, trying to embed SET inside of IF/ELSE clauses.
I'm learning as I go, of course, and many of the examples I'm fnding in books & online tend to be extremely simplistic. BooksOnline has lots of good info, but their examples would never have led me to the kind of solution that Scott presents here. Can anyone recommend a source for more realistic/advanced examples than the typical 1+1=2 kind?
Also, are there any rules of thumb about what operations tend to be costly and which tend not to be?
I really appreciate all the help,
Greg Norris
March 16, 2005 at 7:13 pm
Scott,
No offense was meant and I agree except that I've seen the optimizer give the non-Case method the win 2 out of 3 times when testing with and without proper indexes on 3 similar updates. Use of the CASE statement in this instance may or may not help depending on a lot of conditions that neither of us know. I guess what I really meant to say in my previous post was that there is no substitute for load testing when performance is a must.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 16, 2005 at 7:43 pm
And no offense taken. It looks like we both agree that when it comes to performance improvements, empirical testing is a must. There is far more that can impact performance that is outside the simple statement of a query.
My hope is to provide some options for Greg so he can test them in his environment.
Thanks much for the clarification. I do like this community of SQL Server professionals.
March 17, 2005 at 2:51 am
We've seen a lot of performance gain by using a temporary tables
You could use one in place of MASMEDIA, select the data you want to update into it at the start, update the temporary table and then do a single update to MASMEDIA.
However, (and it is a big however!), it depends on the size of the dataset that you will be updating compared to the number of rows in the whole table. This method probably only works where the number of rows being updated is small.
In our application, we typically need to update a few thousand rows where the table contains millions, so it may not work for you if you are updating a large number of rows.
March 17, 2005 at 4:14 pm
I'm surprised to see no mention of indexes, I don't know how an equation in a WHERE clause would affect things, but I would think that making sure the inactive and num_orders fields were indexed would help.
I like the concept of the temporary tables if your record sets are large enough to benefit from such a technique. I'll have to remember that one.
-----
[font="Arial"]Knowledge is of two kinds. We know a subject ourselves or we know where we can find information upon it. --Samuel Johnson[/font]
March 17, 2005 at 4:15 pm
And my sympathy for having been stuck in DataFlex, I've been there. But I'm a weirdo, I liked Wang PACE and Cobol. 🙂
-----
[font="Arial"]Knowledge is of two kinds. We know a subject ourselves or we know where we can find information upon it. --Samuel Johnson[/font]
March 17, 2005 at 8:23 pm
I mentioned indexes (rather, the lack of) in my last reply to Scott.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply