Your Recommendations on Query / Cursor in this case?

  • I have been asked to do a comparison of various columns in two tables with the same table structure, for example

    Table Person1 with the following structure

    FirstName

    LastName

    PhoneNumber

    Address

    City

    PostalCode

    Birthdate

    PersonID

    and Table Person2 with the exact same structure

    In this case, would a cursor be the appropriate choice, or would writing queries comparing the different fields be better in terms of performance? The number of rows will be close to 900 thousand records per table.

  • Out of personal preference I try not to EVER write cursors unless they are absolutely necessary. I read somewhere that SQL is a set-based language and you need to think of everything in result sets, never a record at a time. Don't get me wrong, sometimes a cursor is the only way to accomplish what you want to do, but in your case I would compare the two tables column by column.
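
    For instance, a set-based check for rows that exist in one table but not the other might look like this (assuming, as in the original post, PersonID is the key):

    -- Rows in Person1 with no matching PersonID in Person2
    SELECT p1.PersonID
    FROM Person1 p1
    WHERE NOT EXISTS (SELECT 1 FROM Person2 p2 WHERE p2.PersonID = p1.PersonID)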

  • I agree with cheddar. Even with the great performance improvements made for cursors, this looks like a job that is much better done as a set-based operation.

    The exception that comes to mind (and even then it might not be a good idea) is if you are writing a generic utility and don't know the table structure beforehand.

    My 2 cents.

    Guarddata-

    Since you have 900 thousand records, a cursor-based query would cause a lot of memory thrashing. It should be faster if you can write a set-based query.

    Gregory Larsen, DBA

    If you are looking for SQL Server examples, check out my website at http://www.geocities.com/sqlserverexamples

    Gregory A. Larsen, MVP


  • I think you can do it without a cursor, in one table scan, like this (just replace NULL values with, say, '' using the ISNULL function):

    SELECT t1.PersonID
    , CASE t1.FirstName WHEN t2.FirstName THEN 0 ELSE 1 END AS FirstNameDifferent
    , CASE t1.LastName WHEN t2.LastName THEN 0 ELSE 1 END AS LastNameDifferent
    .............
    FROM Person1 t1
    JOIN Person2 t2 ON t1.PersonID = t2.PersonID
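
    With the ISNULL guard applied, one of those CASE expressions would read something like this (assuming '' never appears as real data):

    SELECT t1.PersonID
         , CASE ISNULL(t1.FirstName, '') WHEN ISNULL(t2.FirstName, '')
                THEN 0 ELSE 1 END AS FirstNameDifferent
    FROM Person1 t1
    JOIN Person2 t2 ON t1.PersonID = t2.PersonID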


  • You can normally do away with cursors by using temp tables. Populate a temp table (either a table variable or a normal one) and loop through it. I have used this many times with large record sets and not had any problems.

    Could you not solve the problem by doing a normal select with a join?

    SELECT * FROM Person1 T1 JOIN Person2 T2 ON T1.PersonID = T2.PersonID
    WHERE [all the fields do not match]

    That would give instant results and only return mismatched data.
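
    Spelled out for the columns in the original post, that WHERE clause might look something like this (the ISNULL defaults are an assumption so that NULLs compare as equal; Birthdate gets a sentinel date):

    SELECT *
    FROM Person1 T1
    JOIN Person2 T2 ON T1.PersonID = T2.PersonID
    WHERE ISNULL(T1.FirstName, '')   <> ISNULL(T2.FirstName, '')
       OR ISNULL(T1.LastName, '')    <> ISNULL(T2.LastName, '')
       OR ISNULL(T1.PhoneNumber, '') <> ISNULL(T2.PhoneNumber, '')
       OR ISNULL(T1.Address, '')     <> ISNULL(T2.Address, '')
       OR ISNULL(T1.City, '')        <> ISNULL(T2.City, '')
       OR ISNULL(T1.PostalCode, '')  <> ISNULL(T2.PostalCode, '')
       OR ISNULL(T1.Birthdate, '19000101') <> ISNULL(T2.Birthdate, '19000101')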

    Cheers,

    Crispin

    I can't die, there are too many people who still have to meet me!

    It's not a bug, SQL just misunderstood me!

  • Crispin,

    Forgive my ignorance, but how do you loop through a table if you don't use a cursor?

    Jeremy

  • You can do it one of two ways. Both require a temp table.

    I prefer using a table variable instead of a normal table in tempdb:

    -----------------------------------------------
    SET NOCOUNT ON

    -- Declare your temp table (can use a normal temp table instead).
    DECLARE @vQuotes TABLE (
        Ident    INT NOT NULL IDENTITY(1, 1),
        QuoteID  INT NOT NULL,
        QuoteRef VARCHAR(20) NOT NULL
    )

    -- Declare local variables.
    -- @vCount is the number of rows in the temp table.
    -- @vQuoteID and @vQuoteRef are the columns I want to work with.
    DECLARE
        @vCount    INT,
        @vQuoteID  INT,
        @vQuoteRef VARCHAR(20)

    -- Fill the temp table.
    INSERT INTO @vQuotes
    SELECT QuoteID, QuoteRef FROM Quotes

    -- Set @vCount to the row count.
    -- (Could also use: SELECT @vCount = MAX(Ident) FROM @vQuotes)
    SELECT @vCount = @@ROWCOUNT

    WHILE @vCount > 0
    BEGIN
        -- Loop through the table, pulling one row at a time into the variables.
        SELECT
            @vQuoteID  = QuoteID,
            @vQuoteRef = QuoteRef
        FROM @vQuotes
        WHERE Ident = @vCount

        -- Do what you want here. Can exec another proc, passing the variables.
        PRINT STR(@vQuoteID) + ' ' + @vQuoteRef

        -- Decrement the count.
        SET @vCount = @vCount - 1
    END

    -- Go and have a smoke.
    ----------------------------------

    If you want to use a "normal" temp table for this, create it as usual and use the #TableName instead of @vTableName.

    I have had no problems with this method and prefer it over cursors. I prefer table variables over normal temp tables because there is less disk IO and you don't have to clean up the table afterwards.

    People have warned about using them with large record sets due to high memory consumption, but I have yet to hit a problem - even with 10,000 rows in one.

    Cheers,

    Crispin


  • One more hint though: table variables can cause problems of their own.

    SQL does not keep statistics on them, and you cannot add an index to one.

    If you are going to have large tables where an index would be beneficial, rather use a normal temp table.
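
    For example, a minimal sketch of that alternative (the table and index names are just illustrative):

    -- A #temp table lives in tempdb and, unlike a table variable,
    -- can carry indexes and statistics
    CREATE TABLE #Quotes (
        Ident    INT NOT NULL IDENTITY(1, 1),
        QuoteID  INT NOT NULL,
        QuoteRef VARCHAR(20) NOT NULL
    )

    CREATE INDEX IX_Quotes_Ident ON #Quotes (Ident)

    -- ...populate and loop exactly as with @vQuotes above...

    DROP TABLE #Quotes  -- unlike a table variable, you clean up yourself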

    Cheers,

    Crispin


  • Thanks for the info.

    I have a csv feed from another system and I need to do inserts/updates in my system. Some of these feed files are large (they can be several million rows), so what I do is this (sketched in code after the list):

    - load into table in tempdb through bulk insert

    - declare cursor

    - fetch row from cursor

    - determine if row exists in table

    - insert/update depending on whether it exists already

    - fetch next

    - tidy up
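
    In code, the middle steps look roughly like this (the table and column names are made up - the real feed has many more columns):

    -- Assumes the bulk insert has already populated #Feed
    DECLARE @ID INT, @Value VARCHAR(100)

    DECLARE FeedCursor CURSOR FAST_FORWARD FOR
        SELECT ID, Value FROM #Feed

    OPEN FeedCursor
    FETCH NEXT FROM FeedCursor INTO @ID, @Value

    WHILE @@FETCH_STATUS = 0
    BEGIN
        IF EXISTS (SELECT 1 FROM MyTable WHERE ID = @ID)
            UPDATE MyTable SET Value = @Value WHERE ID = @ID
        ELSE
            INSERT INTO MyTable (ID, Value) VALUES (@ID, @Value)

        FETCH NEXT FROM FeedCursor INTO @ID, @Value
    END

    CLOSE FeedCursor
    DEALLOCATE FeedCursor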

    In ideal conditions (only one job running and nothing else running on the server) I can process up to 2 million rows per hour. I often run multiple updates concurrently (anything up to 10) and can get a total of 5 or 6 million rows processed per hour - I guess I start running into I/O constraints rather than anything else, which is why I don't get a straight multiple of the max throughput.

    I'll give it a go sometime and see if I get a performance boost. There will be the extra overhead of creating the index on the identity column, which I don't have at the moment, but I might save time by not having the overhead of a cursor.

    Thanks

    Jeremy

  • Maybe I misunderstand you but -

    Solution 1)

    Dump the contents of the csv file into a temp table.

    You're not gonna get around that.

    Could you not do something along these lines?

    INSERT INTO MainTable
    SELECT * FROM TempTable
    WHERE SomeKey NOT IN (SELECT SomeKey FROM MainTable)

    This will do a "differential" insert.

    No row-by-row comparisons to do.

    I have a couple of imports that work like this. Much faster.
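
    For the rows that do already exist, the matching differential update could be something along these lines (SomeKey / SomeData are placeholder names):

    -- Update rows already in the main table from the staging copy
    UPDATE m
    SET    m.SomeData = t.SomeData
    FROM   MainTable m
    JOIN   TempTable t ON t.SomeKey = m.SomeKey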

    Solution 2)

    Haven't actually tried it, but it should work.

    When you import data, you can specify a column copy OR a query.

    Could you not select the columns from the CSV file where the data is not in the main table?

    This might be extremely slow, as SQL cannot do an efficient search on a text file (not 100% sure of this).

    This, assuming the select works, would be even faster, as SQL will only be pulling relevant data from the file.

    Let us know what happens. Would be interesting.

    Crispin

