January 6, 2003 at 1:36 pm
Hi Guys.
Question for you Gurus.
I'm working on a database for a ticketing system. I have to write a stored proc that, based on financial transactions, updates some customer information. (Pretty standard stuff.) This proc will be scheduled and run nightly; the server is quite powerful and at that time there won't be any users on, so resources aren't a big issue, but still - I'm curious.
I'm trying to decide the best way to do this so I did some benchmarking. I don't understand the results, and I ran the test several times. (BTW I'm using SQL 2k no sp on Win 2k sp3)
For both methods I created a global temp table with stripped down transaction information. I made it a global table so that both methods would use the same data.
I deleted all of the rows from the table that the methods insert into before running each time (868 rows). Both methods are included below, and both code snippets were run from my computer, in separate Query Analyzer windows.
The cursor took 9 seconds and the insert statement took 1 minute 16 seconds!!!! Most documentation/papers I've read say to avoid cursors like the plague (due to their slowness and resource usage).
Can anyone tell me why the cursor is faster, and maybe how one decides whether to use a cursor or a couple of insert statements? This is only a small snippet from the proc.
Thanks, Chris.
**** Method 1 was a cursor:
Declare @Const int, @Cat int
Select @Const = 4
Select @Cat = 2
Declare Cursor1 CURSOR FAST_FORWARD for
Select distinct
trn.customer_no,
(select Rank from tr_constituency where id = @Const) Rank
From ##tblTRN trn inner join t_customer c on trn.customer_no = c.customer_no
Where trn.id = @Cat and
trn.customer_no not in (select customer_no from tx_const_cust where Constituency = @const)
Declare @Customer_No int, @Rank int, @Err int
Open Cursor1
Fetch Next From Cursor1 into @Customer_No, @Rank
While @@Fetch_Status = 0
Begin
Insert into TX_CONST_CUST
Values(@Const, @Customer_No, 'N', @Rank, null, 'ConUpdt', getdate(), null, null)
Select @Err = @@Error --capture immediately; @@ERROR is reset by the very next statement
If @Err != 0
Begin
Print @Customer_No
Print @Err
Print 'error'
CLOSE Cursor1
DEALLOCATE Cursor1
return 1
End
FETCH NEXT FROM Cursor1 into @Customer_No, @Rank
END
CLOSE Cursor1
DEALLOCATE Cursor1
**** Method 2 was a straight insert from a select statement:
/*insert method */
Declare @Const int, @Cat int
Select @Const = 4
Select @Cat = 2
Insert into TX_CONST_CUST
Select distinct
@Const const, --Constituency
trn.customer_no,
'N', --Default_Ind
(select Rank from tr_constituency where id = @Const) as Rank,
null,
'ConUpdt',
getdate(),
null,
null
From ##tblTRN trn inner join t_customer c on trn.customer_no = c.customer_no
Where trn.id = @Cat and
trn.customer_no not in (select customer_no from tx_const_cust where id = @const)
Print 'Complete'
Chris.
January 6, 2003 at 2:48 pm
I'm not going to answer your question directly, but I will say a few things. Firstly, your use of a NOT IN will not take advantage of any indexes, because NOTs are not optimisable, so that's a bottleneck in both cases. You should fully qualify all objects (in this case table names) to speed things up. SELECT INTOs are always resource intensive because of the huge number of locks they create. What sort of number of rows are you processing with this code?
Regards
Simon
January 6, 2003 at 2:49 pm
Actually answered my own question (868 rows), sorry. Why are you using FAST FORWARD and not READ ONLY as the cursor type?
January 6, 2003 at 8:44 pm
quote:
Actually answered my own question (868 rows), sorry. Why are you using FAST FORWARD and not READ ONLY as the cursor type?
According to Books Online, a FAST_FORWARD cursor is READ_ONLY by default, in addition to being forward-only. Since I'm not updating any data in the cursor and am only moving forward within it, I thought it would be faster.
What do you normally do instead of a NOT IN? What alternatives are there?
Chris.
January 7, 2003 at 1:08 am
In your NOT IN subquery, you are comparing the values against the table you are inserting in.
The cursor (recordset) is built once, and you loop through the records. In the direct statement, I think this subquery will be run again with different table contents, because you are inserting rows into that same table as you go.
Not sure about the effect this has on performance.
January 7, 2003 at 4:09 am
**** Method 2 updated, try this.
/* insert method */
Declare @Const int, @Cat int
Select @Const = 4
Select @Cat = 2
Insert into
TX_CONST_CUST
(use a column list here of only the columns you actually insert; that makes the nulls unneeded in the select)
Select distinct
@Const const, --Constituency
trn.customer_no,
'N', --Default_Ind
(select Rank from tr_constituency where id = @Const) as Rank,
'ConUpdt',
getdate()
From
##tblTRN trn
inner join
t_customer c
on
trn.customer_no = c.customer_no
left join
tx_const_cust tx
ON
tx.customer_no = trn.customer_no and
tx.[id] = @const
Where
trn.[id] = @Cat and
tx.customer_no IS NULL
One note: id is a keyword and will parse better if you wrap it in []. Also, I did not understand the logic of ##tblTrn, nor did I understand tr_constituency and its id field as related to the rest of the data. It may make better sense to join it in instead of using a subquery, for best performance. The left join in place of the subquery keeps all rows from the previous joins and matches up only where there is a match in the other table; when done, the rows with NULL values from the other table are the items that are NOT IN it.
January 13, 2003 at 2:52 pm
NOT IN will cause a lot of the time issues.
A left outer join on an indexed field will work better.
January 16, 2003 at 8:43 am
or a NOT EXISTS
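For the insert above, a NOT EXISTS version might look like this. A sketch only, reusing the column names from Chris's snippets and assuming TX_CONST_CUST is keyed by Constituency plus customer_no:

```sql
/* NOT EXISTS variant of the set-based insert (sketch; assumes the
   same tables and columns as the snippets earlier in the thread) */
Declare @Const int, @Cat int
Select @Const = 4
Select @Cat = 2

Insert into TX_CONST_CUST
Select distinct
    @Const,                                                  --Constituency
    trn.customer_no,
    'N',                                                     --Default_Ind
    (select Rank from tr_constituency where [id] = @Const),  --Rank
    null, 'ConUpdt', getdate(), null, null
From ##tblTRN trn
    inner join t_customer c on trn.customer_no = c.customer_no
Where trn.[id] = @Cat
  and not exists (select 1 from tx_const_cust tx
                  where tx.customer_no = trn.customer_no
                    and tx.Constituency = @Const)
```

Like the left-join rewrite, the correlated NOT EXISTS can use an index on tx_const_cust (customer_no, Constituency), which the NOT IN subquery was defeating.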
January 16, 2003 at 10:04 pm
Under normal circumstances, a set-based statement (such as method two) will perform better than a cursor. I believe the answers provided above give the reason for the insert being slower: with only a single row involved (in the cursor), the inefficiencies noted by our peers are not enough to hurt badly. Multiply them across the whole recordset in the set-based version, and you get the extended time. I would say that if the set-based statement were corrected according to the prior suggestions, it would perform much better than the cursor.
Just an FYI: declaring a cursor generally takes between 8 and 12 transactions on the server. Then, for each fetch, 5 to 7 or more transactions are required. Taking into account the 800 or so records mentioned, and doing the math, you can see that internal to the SQL server, the cursor will create thousands of transactions to accomplish what the set-based statement will perform in no more than a double handful. You can demonstrate this by profiling while running both methods.
The locks generated by thousands of transactions would be significant, though short lived, and SQL wouldn't have the option of escalating to page or table locks. If you consider that each lock is 96 bytes, you can easily see that the memory used by the cursor method is quite large, not to mention the overhead of keeping track of those locks, acquiring them, releasing them, etc. In a low-transaction system this may not cause any issues, but there comes a break point where such behavior can eat up resources like candy. Even with a read-only or fast-forward cursor, the inserts will generate locks. If the inserts are being done on a clustered indexed table, they will start as shared update locks (pretty lightweight) until the server locates the insertion point, then escalate to exclusive locks for the actual data modification.
When you compare this to the locks generated by the set-based method, and the fact that SQL will avoid the large number of locks by escalating them to page or maybe even table locks, it's very apparent which direction to take. So, the answer to your question, in my opinion, is to use a cursor only when you simply cannot do it set-based, or when you need to break up a long-running transaction into smaller chunks for concurrency; and even in the second case, I would still attempt to find a method other than a cursor. Thinking that resources are not an issue because you have plenty today is not a good approach, as in a year it most likely will not be the case. Approach every issue with the idea that every resource is solid gold, and in a year or so you'll still have your hair, and be a hero to your company.
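For the "smaller chunks" case, one SQL 2000 idiom that avoids a cursor is SET ROWCOUNT in a loop. A minimal sketch, assuming the same tables as the snippets above, with an anti-join so each pass skips rows already inserted:

```sql
/* Chunked set-based insert using SET ROWCOUNT (SQL 2000 idiom; sketch).
   Each pass inserts at most 200 rows; the NOT EXISTS filter means the
   next pass picks up where the last one stopped. */
Declare @Const int, @Cat int
Select @Const = 4, @Cat = 2

Set Rowcount 200             --cap each insert at 200 rows
While 1 = 1
Begin
    Insert into TX_CONST_CUST
    Select distinct @Const, trn.customer_no, 'N',
        (select Rank from tr_constituency where [id] = @Const),
        null, 'ConUpdt', getdate(), null, null
    From ##tblTRN trn
        inner join t_customer c on trn.customer_no = c.customer_no
    Where trn.[id] = @Cat
      and not exists (select 1 from tx_const_cust tx
                      where tx.customer_no = trn.customer_no
                        and tx.Constituency = @Const)
    If @@Rowcount = 0 Break  --nothing left to insert
End
Set Rowcount 0               --restore the default
```

Each pass commits its own short transaction, so locks are held briefly, while the work itself stays set-based.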
January 17, 2003 at 4:53 am
quote:
Approach every issue with the idea that every resource is solid gold, and in a year or so, you'll still have your hair, and be a hero to your company.
Sorry, can't help myself on this one. But by this statement I must be really good at what I do, as my hair is more than a foot long, no one argues when I say their code sucks, and when I say something will fix a problem they generally do what I suggest.
January 17, 2003 at 7:07 am
LMAO....gotta love it.....
Based on what I've seen in the forums, you do WAY better than most, in my opinion, and I simply couldn't picture you having any problems anyway.
Had to cut my hair since the drop in the economy. One of the downfalls of being contract mostly.
Edited by - scorpion_66 on 01/17/2003 07:09:40 AM
January 17, 2003 at 9:45 am
Thanks for your help guys.
Chris.