January 2, 2009 at 5:47 am
Excellent article on cursors, Wagner.
Here is a weird problem that made me choose a cursor. The incoming data was in a header / detail format. The header had 5 columns: ICN Code, Officer, Department, Function, and Message. Essentially, it is a Money Transfer by Wire application running on a Tandem system. The data provisioning was in pure TAB Delimited Text, so I used OPENROWSET and a format file to load the daily files of 50,000 rows.
But the rows were really weird. The header had data in 4 out of five columns, and the detail had data in one out of five columns. The rows between headers were the details for the prior header. Where the last column (Message) was NULL on the header marked the end of the last header - the Message being NULL was a sign that the it was a header row. Where the Message was not NULL there were all kinds of "field codes", where each field code has very odd and specific flags and options. Each row of the message had space for 2000 VARCHARS. Row counts varied between headers too - not all the headers had the same number of Message rows.
The problem was to pick up the Currency, Amount, Customer ID, and Originating institution from each of the Message Rows for each of the headers. I used a cursor, much to my dismay. It is slow, but it works and gives me fine-grained control.
I would like to try a set based approach too. If you pick up this comment, let me know, and I can send you more details.
Bob
January 2, 2009 at 6:08 am
I like the article. Showing common cursor practices and a nice set-based approach to replace them is absolutely the right way to go. Unfortunately, as Jeff already pointed out, you then try to slip out of the concept that cursors are, to say the least, problematic by this, "But when replacing cursors, we are letting go some important features that only cursors can provide to your code. You should not take the avoid-cursors advice as rule, but as a hint. Each situation deserves attention, as you can have a fast and well-designed code using cursors. And, in the other hand, you can also find poor-performance codes specially created to avoid cursors."
Adding it to, what was otherwise a really well done article, brings up issues. Beginners shouldn't be advised that avoiding TSQL cursors is a "hint." Far too much horrifically bad code has been written by beginners because they're not thinking in sets. By suggesting that they don't need to, you, as the expert who wrote the article, are saying that it's OK to write the cursor. There are situations where cursors work better. Itzik Ben-Gan shows several during his advanced TSQL course. But they're rather arcane & odd situations, not the normal, day-to-day, street level code that most people have to deal with. There are also situations where, because of other aspects of the code, cursors don't matter, such as running backups or consistency checks against a list of databases on the server. A cursor works fine there.
Mr. Roughgarden's comments can show the validly complex situation where a cursor makes sense, but, again, I suspect that's the exception, not the rule, in his code as well.
Making these types of broad, sweeping statements, especially after you've established, quite well, exactly why cursors are so problematic, pretty much demands that you back them up. So I assume you've got another article in the works?
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 2, 2009 at 6:23 am
Well, I understand the fuzz, although I didn't mean to create it.
When I wrote this article I had in mind that not all T-SQL beginner are completely unaware of database design.
I have seen lots of bad code written by developers who simply were more familiar to other RDBMS's and took for granted that the truth for other RDBMS's is valid for SQL SERVER too.
Besides, I didn't mean novice database developers should not be scared about cursors. I tried to compare the performance of the (simple) alternatives I showed just to let them feel where they would be stepping on when using cursors.
But I also believe cursors are not so bad they shouldn't exist.
I've used cursors myself several times.
My point in this article is "Whenever you suspect your code is not running as expected, no matter it uses cursors or not, you should check your code's execution plan. This is the ultimate information you have to assess your code's performance."
January 2, 2009 at 6:34 am
wagner crivelini (1/2/2009)
Well, I understand the fuzz, although I didn't mean to create it.When I wrote this article I had in mind that not all T-SQL beginner are completely unaware of database design.
I have seen lots of bad code written by developers who simply were more familiar to other RDBMS's and took for granted that the truth for other RDBMS's is valid for SQL SERVER too.
Besides, I didn't mean novice database developers should not be scared about cursors. I tried to compare the performance of the (simple) alternatives I showed just to let them feel where they would be stepping on when using cursors.
But I also believe cursors are not so bad they shouldn't exist.
I've used cursors myself several times.
My point in this article is "Whenever you suspect your code is not running as expected, no matter it uses cursors or not, you should check your code's execution plan. This is the ultimate information you have to assess your code's performance."
Let me reiterate before I whup on you some more 😉 I think this is is a very well done article.
Unfortunately, another point you're bringing up now:
"Whenever you suspect your code is not running as expected, no matter it uses cursors or not, you should check your code's execution plan. This is the ultimate information you have to assess your code's performance."
Actually, no, it's not the ultimate. Unfortunately, the execution plan can present you with bad information. It can show you that an operation is costly when it is not, or cheap when the operation is expensive. I'm not saying you shouldn't use them. Hell, I wrote a book on them. I'm just saying that they're only one piece of the puzzle. You have to combine them with other information like I/O and execution times. Then, through repitition on the tests, you have the information you need.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 2, 2009 at 11:01 am
But when replacing cursors, we are letting go some important features that only cursors can provide to your code.
Assuming that we are talking about SQL Server 2005/2008, I know of no feature that "only cursors can provide", other than intentionally slowing down your code (which believe it or not, can be desirable). I am aware of some capabilities that only procedure-based loops (of which cursors are the least desirable) can provide, but they are vanishingly obscure and obtuse in real application.
All of the usual excuses for using cursors are based on misunderstandings of T-SQL capabilities and how SQL code should be designed and managed.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
January 2, 2009 at 11:13 am
JRoughgarden (1/1/2009)
OK, I'll take up the case for using cursors for some types of processing. I've posted below a copy of a stored procedure I just finished. It uses a read-only cursor to loop through a set of records indicating when each lot starts being processed in each of a facility's processing lines. For each row as appropriate, the code ultimately calls two other stored procedures that write 'Attribution' and 'Association' events to another database of queued events. Anyway, long story short, there is no way you could do this processing in a set-based manner. Period.
I see no functional requirement in this statement or its accompanying code that would require the use of a cursor. Do you have some proof, formal or informal, to the contrary?
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
January 2, 2009 at 11:40 am
...you should check your code's execution plan. This is the ultimate information you have to assess your code's performance
That is VERY VERY true when UDFs are involved!!! Some of them (the worst offenders) don't show up in the execution plan at all.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
January 2, 2009 at 11:48 am
In response to sudhanvag, who stated:
But i want a result like this
---------------------------------
codName (no column name)
abc 1 281 444 5555 , 55 11 4582 2752 , *********** ,
def 1 XXX XXX XXXXX ,
-----------------------------------
Let me offer the following solution:
select u.codName,
stuff((select ', ' + PhoneNumber
from tblPhone p1
where p1.codUser = p2.codUser
for xml path ('')), 1, 2, '')
from tblPhone p2
inner join tblUser u on p2.codUser = u.codUser
group by u.codName, p2.codUser
I'm open to any other suggestions on this. I was going to pose the very same question but sudhanvag beat me to it.
Lisa
P.S. Please see sudhanvag's post on page 2 where tblUser was added.
January 2, 2009 at 12:36 pm
Nice Article,
I don't think that useful and necessary cursors are that rare.
Can you loop through dm_db_index_physical_stats and reindex all the databases that require reindexing without an iterative structure? It's beyond my meager skills if it is possible.
It probably wouldn't be advisable to do it as a set-based op. because they'd all get reindexed at the same time causing upheaval and turmoil in the DB.
These sorts of things are the bread and butter of a scripting DBA. If there's a better way, someone show me.
~BOT
Craig Outcalt
January 2, 2009 at 12:41 pm
SQLBOT (1/2/2009)
Can you loop through dm_db_index_physical_stats and reindex all the databases that require reindexing without an iterative structure? It's beyond my meager skills if it is possible.It probably wouldn't be advisable to do it as a set-based op. because they'd all get reindexed at the same time causing upheaval and turmoil in the DB.
So, you're saying that you actually want it to run slower?
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
January 2, 2009 at 1:23 pm
RBarryYoung (1/2/2009)
SQLBOT (1/2/2009)
Can you loop through dm_db_index_physical_stats and reindex all the databases that require reindexing without an iterative structure? It's beyond my meager skills if it is possible.It probably wouldn't be advisable to do it as a set-based op. because they'd all get reindexed at the same time causing upheaval and turmoil in the DB.
So, you're saying that you actually want it to run slower?
LOL...Nope,
Just to see if it's possible. I imagine dozens of set-based advocates furiously coding this very minute to come up with a good way :hehe:
I make no assumptions that it will run slower because I have no real Idea of how it would execute.
Can Someone write a psuedo-iterative set-based query using cross-applied table valued functions that would reindex... maybe. It might even act iterative too and be faster than a cursor. It would be the logical absurdity of !(RBAR), but I'd like to see it.
~BOT
Craig Outcalt
January 2, 2009 at 1:37 pm
JRoughgarden (1/1/2009)
OK, I'll take up the case for using cursors for some types of processing. I've posted below a copy of a stored procedure I just finished. It uses a read-only cursor to loop through a set of records indicating when each lot starts being processed in each of a facility's processing lines. For each row as appropriate, the code ultimately calls two other stored procedures that write 'Attribution' and 'Association' events to another database of queued events. Anyway, long story short, there is no way you could do this processing in a set-based manner. Period.
It could also be argued that you simply wrote it all incorrectly from the git. 😉 Writing stored procedures like that are a guarantee that you won't be able to do anything with them other than using a cursor.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 2, 2009 at 1:40 pm
battelofhalfwits (1/1/2009)
Wow...Two things
1) Never use Cursors! Looping through a temp table takes less overhead than a Cursor; especially when you are working on a 24/7/365 server where you really need to watch your resources.
Not true... if you use Forward Only, Read Only (or sometimes Static), the cursor is just as "effecient" as the WHILE loop... and make no doubt about it, neither is effecient. If you use a WHILE loop, you've not done any better than using a cursor. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
January 2, 2009 at 1:43 pm
slippers (1/1/2009)
HiOne question: if considering whether to use Cursors or alternatives, how does trigger firing influence the decision?
I mean, when using a cursor to perform an insert statement with each loop, triggers on the target table will fire for each row being inserted.
When using alternatives, is there any way to ensure that triggers will process each insert?
Thanks
Heh... yes... like all the other code, you should write set based triggers. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
January 2, 2009 at 1:50 pm
sudhanvag (1/2/2009)
Extending the table structure given in the articlehttp://www.sqlservercentral.com/articles/cursors/65136/]I have added an extra table for UserListing.
CREATE TABLE dbo.tblUser ( codUser INT, codName VARCHAR(20),PRIMARY KEY (codUser));
insert into dbo.tblUser values (1, 'abc');
insert into dbo.tblUser values (2, 'def');
When i execute the following query
SELECT TU.codName ,-- @AllPhones = @AllPhones +
CASE WHEN P.ListThisNumber = 1
THEN P.PhoneNumber ELSE '***********'END
+ ' , '
FROM dbo.tblPhone P
INNER JOIN dbo.tblUser TU ON P.codUser = TU.codUser
INNER JOIN dbo.tblPhoneType PT ON P.codPhoneType = PT.codPhoneType
I will get result like this
---------------------------------
codName (no column name)
abc1 281 444 5555 ,
abc55 11 4582 2752 ,
abc*********** ,
def1 XXX XXX XXXXX ,
-----------------------------------
But i want a result like this
---------------------------------
codName (no column name)
abc1 281 444 5555 , 55 11 4582 2752 , *********** ,
def1 XXX XXX XXXXX ,
-----------------------------------
I know we can achieve the above result using cursors or CTE.
But i am expecting the above result with a query something like below:
DECLARE @AllPhones VARCHAR(1000)
SET @AllPhones = ''
SELECT TU.codName , @AllPhones = @AllPhones +
CASE WHEN P.ListThisNumber = 1
THEN P.PhoneNumber ELSE '***********'END
+ ' , '
FROM dbo.tblPhone P
INNER JOIN dbo.tblUser TU ON P.codUser = TU.codUser
INNER JOIN dbo.tblPhoneType PT ON P.codPhoneType = PT.codPhoneType
GO
But we will get an error
A SELECT statement that assigns a value to a variable must not be combined with data-retrieval operations.
My question is, Is there any work around for this? (Without using cursors or CTE)
First of all, this is one of the worst things you can do in a database. Basically, you 're building a denormalized output and that kind of stuff should be done in the GUI where you can waste the clock cycles of the client instead of the database server to do this type of formatting. 😛
However, YES, there is a way to do it in SQL Server 2005 without a cursor and certainly without a WHILE loop. There's even a way to do it in SQL Server 2000 without a WHILE loop. Please see the following article for not only the "How to" for both a UDF and the XML method, but some caveats to avoid that will affect performance...
Performance Tuning: Concatenation Functions and Some Tuning Myths[/font][/url]
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 16 through 30 (of 87 total)
You must be logged in to reply to this topic. Login to reply