September 8, 2004 at 11:10 am
Hi All,
This time I need few clarifications and suggestions about using Cursors.
Me working in a Finance Domain for an ERP called Great Plains and most of our work will be in SQL.
Very often we need to write various stored procedures especailly for:
1) extracting data from different datasources (mainly tables and views) to show in the reports by implementing the required business logic. Especially we need to convert most of the values based on multiple currencies, units, etc. Most of our reports are complex with multiple sub reports according to the client requirements.
2) for integrating data from different external sources into our ERP tables by following all necessary validations for the data consistency.
3) For validating different kinds of process routines (Especially posting routines) and also writes for other scenarios also
In all the above mentioned kind of scenarios we are using Cursors (Read only) and also feel comfortable to implement our logic with them.
But for improving the performance issues, how can we eliminate these cursors and what other alternatives can be used to impose our logic easily and comfortably.
Can we completely eliminate these cursors with any other alternatives,
Please suggest some better solutions for improving the performance.
Thanks in advance,
Subhash
September 8, 2004 at 3:51 pm
Dear Subhash:
In my case, I avoid cursors like they are a curse...
I have tables with tens of millions of rows (daily security prices in multiple currencies) and row-by-row processing is out of the question.
Please correct me if I'm wrong but I guess your cursors are being used in data validation (column by column ) or in reports where you do some column formatting row-by-row.
My suggestion is for you to look at what your procedure or job is doing as a whole and try to group together similar occurences with the same processing consequences.
For example, I use SELECT CASE a lot to do a single pass process...and combine it with nested subqueries...
UPDATE MARKET_TABLE
SET fxrate = IsNull((SELECT f.rate /CASE f.basis
WHEN 'L' then 100000
WHEN 'K' then 10000
WHEN 'N' then 1000
WHEN 'Y' THEN 100
ELSE 1
END
FROM forexchange f
WHERE f.currency_code=MARKET_TABLE.currency_code
AND f.fx_date=(SELECT MAX(f1.fx_date)
FROM forexchange f1
WHERE f1.currency_code=f.currency_code AND
f1.fx_date<= MARKET_TABLE.balancedt)),1)
WHERE currency_code 'USD'
In order to execute a repetitive process, I use WHILE LOOPS
with conditions that can be shared by multiple rows.
When validating certain occurences of specific data on certain columns, I use TEMP TABLES to hold the specific data and use equijoins with my base table.
I replaced a cursor within a cursor situation in one developer procedure by using a join with a derived table:
SELECT BR.broker_id, BR.broker_name,SA.sec_count
FROM BROKER AS BR,
(SELECT broker_id, COUNT( security_id) AS sec_count
FROM sales
GROUP BY broker_id
) AS SA
WHERE BR.broker_id = SA.broker_id
The original had a cursor on the sales table within a loop on the cursor of the broker table.
I hope this helps a bit...
---melvynpatrick
September 9, 2004 at 12:08 am
Hi Subhash,
Great Plains huh? I've worked with them since version 3.15 and most of their legacy code is still written like they are using Btrieve. All of their examples use cursors. Yuck.
To give you an example of the performance benefit from NOT using cursors, run this query against the TWO company database.
The first output will give your the begin and end time using a cursor and the second output will give you the begin and end time using set based queries. The awesome difference is that the second query (Set based actually went through 41 million records, while the first query (Cursor based) had to be limited to 1 million in order to prevent taking hours to run.
Once you go set, you never go back. I've not knowingly used a cursor in 5 years. Sure, Dexterity uses cursors in the background, but I didn't knowingly use one.
Part of the problem with getting away from cursors is that our brains can wrap around the 1 record, one update concept. The whole concept of grabbing 3 tables and then comparing the resultant dataset against 2 other tables as a whole just blows our minds. I still can't truly grasp why it works, it still makes me kind of starry eyed.
Here's the code I promised:
A quick rundown on what I'm doing. I'm using a cross join on the GL20000 table to generate enough records to make it really obvious which works best. I have to clear the buffers so this test can be run multiple times without the benefit of SQL Server caching the data and skewing the results. You shouldn't see more than a few milliseconds difference as long as you aren't doing anything else on the computer while it's running.
I limit the Cursor test to 1,000,000 rows(if you don't want to wait 4 minutes, limit it to 100,000 rows) because it takes 4 minutes to run 1 Million, it would take almost 3 hours to run 40,000,000 rows. I don't have the same problem with the Set based script. It takes 330 ms to run 1000 records and 660ms to run 41,000,000! The incredible speed of Set based operations strikes again. I'm never underwhelmed by the difference.
--START CODE
USE TWO
DBCC DROPCLEANBUFFERS
go
DECLARE @Reference char(30)
DECLARE @NameCountA int
DECLARE @NameCountB int
DECLARE @NameCountC int
DECLARE @RecordCount int
DECLARE @BeginTime datetime
DECLARE @EndTime datetime
DECLARE @CursorTime int
DECLARE @SETTIme int
SET @BeginTime = getdate()
PRINT convert(varchar(50),@BeginTime,9)
SET @NameCountA = 0
SET @NameCountB = 0
SET @NameCountC = 0
SET @RecordCount = 0
DECLARE test_Reference cursor
FOR
SELECT TOP 1000000 GL20000.REFRENCE
FROM GL20000,GL20000 as CrossJoined
OPEN test_Reference
FETCH NEXT FROM test_Reference
INTO @Reference
WHILE @@FETCH_STATUS = 0
BEGIN
SET @RecordCount = @RecordCount + 1
IF LEFT(@Reference,1) = 'A'
SET @NameCountA= @NameCountA + 1
IF LEFT(@Reference,1) = 'B'
SET @NameCountB= @NameCountB + 1
IF LEFT(@Reference,1) = 'C'
SET @NameCountC = @NameCountC + 1
FETCH NEXT FROM test_Reference
INTO @Reference
END
CLOSE test_Reference
DEALLOCATE test_Reference
PRINT 'Number of Reference columns beginning with A = ' + cast(@NameCountA as varchar(16))
PRINT 'Number of Reference columns beginning with B = ' + cast(@NameCountB as varchar(16))
PRINT 'Number of Reference columns beginning with C = ' + cast(@NameCountC as varchar(16))
PRINT 'Number of Total Records ' + cast(@RecordCount as varchar(16))
SET @EndTime = getdate()
PRINT convert(varchar(50), @EndTime,9)
SET @CursorTime = DATEDIFF(ms,@BeginTime,@EndTime)
USE TWO
DBCC DROPCLEANBUFFERS
SET @BeginTime = getdate()
PRINT convert(varchar(50), @BeginTime,9)
SET @NameCountA = 0
SET @NameCountB = 0
SET @NameCountC = 0
SET @RecordCount = 0
SELECT @NameCountA = sum(CASE LEFT(GL20000.REFRENCE,1)
WHEN 'A' THEN 1
ELSE 0
END),
@NameCountB = sum(CASE LEFT(GL20000.REFRENCE,1)
WHEN 'B' THEN 1
ELSE 0
END),
@NameCountC = sum(CASE LEFT(GL20000.REFRENCE,1)
WHEN 'C' THEN 1
ELSE 0
END),
@RecordCount = count(GL20000.REFRENCE)
FROM GL20000,GL20000 as CrossJoined
/*SELECT @NameCountA = COUNT(REFRENCE)
FROM GL20000
WHERE left(REFRENCE,1) = 'A'
SELECT @NameCountB = COUNT(REFRENCE)
FROM GL20000
WHERE left(REFRENCE,1) = 'B'
SELECT @NameCountC = COUNT(REFRENCE)
FROM GL20000
WHERE left(REFRENCE,1) = 'C'
SELECT @RecordCount = count(REFRENCE)
FROM GL20000
*/
PRINT 'Number of Reference columns beginning with A = ' + cast(@NameCountA as varchar(16))
PRINT 'Number of Reference columns beginning with B = ' + cast(@NameCountB as varchar(16))
PRINT 'Number of Reference columns beginning with C = ' + cast(@NameCountC as varchar(16))
PRINT 'Number of Total Records ' + cast(@RecordCount as varchar(16))
SET @EndTime = getdate()
PRINT convert(varchar(50), @EndTime,9)
SET @SetTime = DATEDIFF(ms,@BeginTime,@EndTime)
PRINT 'Number of milliseconds to run using cursor: ' + cast(@CursorTime as varchar(16))
PRINT 'Number of milliseconds to run using Set: ' + cast(@SetTime as varchar(16))
--END CODE
September 9, 2004 at 3:56 am
I use SERVERFILTER in my Access (project) client:
http://www.winnetmag.com/Articles/Print.cfm?ArticleID=9037
All the best,
Dale
Author: An Introduction to SQL Server 2005 Management Studio
September 9, 2004 at 7:27 am
Hi All,
Thanks for all your suggestions.
I will try all of them and get back to you.
Subhash
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply