November 17, 2013 at 8:58 am
I am having a cursor which I use to update a 2nd table containing unique IDs from an identity column.
The cursor typically returns 100-300 records and the code is looping through the cursor to update the column.
The code is similar to the below.
This has been working fine for over 2 years. Now suddenly, while my cursor still only returns about 150 rows, the loop runs indefinitely, i.e. it adds tens of thousands of records into the "lpn" table, and it does update the "addresses" table. When it reaches the last record of the cursor, it starts again at the beginning basically, updating the record in the "addresses" table which corresponds to the first record in my cursor and so on.
So the loop never ends, and it just keeps updating the "addresses" table.
I can of course add a statement with an integer increment within the WHILE loop counting how many records it has inserted and check it against the total number of records in my cursor. If I do that, it properly ends.
But the code has been working for years. Why would it suddenly stop working? What could cause this?
I know that concurrent fetch operations can mess things up with the @@fetch_status variable, but
there are no queries and no other stored procedures running on that instance which do any sort of fetch.
declare @newLPN int;
declare @currentID int;
set @newLPN = 0;
set @currentID = 0;
DECLARE cursorlpn CURSOR FOR
select AccountID from Addresses
where country = 'HR';
fetch next from cursorlpn into @currentID
while (@@fetch_status = 0)
begin
insert into lpn (deliveryID) values (0);
set @newLPN = scope_identity();
update Addresses set lpn = @newLPN where AccountID = @currentID;
fetch next from cursorlpn into @currentID
end
November 17, 2013 at 9:07 am
The best solution would be to get rid of c.u.r.s.o.r. *cough* in the first place.
Insert all data into lpn at once and use the OUTPUT clause to get the returned identity values. Then perform a singel update to the Addresses table for all AccountID's at once.
November 17, 2013 at 9:54 am
Well, I guess you're right, and I agree. I inherited the code and yes, it won't stay this way.
But my question is rather why does something like this occur - it did work for 2 years and it's perfectly valid T-SQL, so why does SQL Server suddenly decide it has to go into a spin turn-around?
It was consistent and reproducable, I tried it with various cursor filters, about 20 times.
November 17, 2013 at 9:57 am
Peter Schulz-485500 (11/17/2013)
Well, I guess you're right, and I agree. I inherited the code and yes, it won't stay this way.But my question is rather why does something like this occur - it did work for 2 years and it's perfectly valid T-SQL, so why does SQL Server suddenly decide it has to go into a spin turn-around?
It was consistent and reproducable, I tried it with various cursor filters, about 20 times.
Based on just the code for the cursor, I have no idea as it looks correct. Since we don't have access to your system and the data there really isn't much we can tell you. A guess would be something changed recently. Could be the structure of the data, could be something in the data.
November 17, 2013 at 10:14 am
Can you provide a test scenarion so we can reproduce it on our machines?
I'm confident the c.u.r.s.o.r. *cough* doesn't restart "all of a sudden, all by itself".
My guess would be there's a code change "outside" that fires the code where the loop is being called. Maybe a trigger added to the Addresses table or something like that.
If you run a profiler trace, do you spot anything unexpected?
November 17, 2013 at 12:33 pm
Generally speaking, in most programming languages it is a bad idea to "loop" through a data set that you are updating.
I could imagine that in this case you could have hit a situation where those updates are affecting the underlying data in some way that is messing with the logic of the loop, which would be avoided either by working with a temporary copy of the list of Accounts you are going to update or by using an INSENSITIVE cursor.
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
November 17, 2013 at 2:34 pm
Here is some test code.
I am afraid you won't find any problem on the test code, as I could not do so myself. It occurs only in my production database.
Seems that the general consent is that there couldn't be anything wrong really in SQL Server itself? Has no-one every seen a cursor behaving this way?
If I can rule this out as a possibility, then well, alright, I'll have to dig into the rest of this application. I just want to know if this is a possibility.
The application I am having here is a mix of Visual Basic with some direct SQL queries and a handful of stored procedures. The cursor I am talking about is in a stored procedure. This isn't so pretty obviously.
I am not sure if Visual Basic isn't doing cursor fetching behind the scenes so this could mess up my cursor possibly also.
-- test database and tables --
-- create new test data base
create database tmp4051
go
--
use tmp4051
go
IF OBJECT_ID('dbo.Addresses','U') IS NOT NULL
DROP TABLE Addresses
IF OBJECT_ID('dbo.lpn','U') IS NOT NULL
DROP TABLE lpn
create table Addresses (AccountID int PRIMARY KEY,
Name1 varchar(50),
Name2 varchar(50),
Address1 varchar(50),
country char(2),
lpn int NULL)
create table lpn (deliveryID int,
lpn int IDENTITY(1,1))
-- test data --
insert into Addresses (AccountID, Name1, Name2, Address1, country)
select 1, 'Blow', 'Joe', 'Smithson Street 3540', 'HR' union all
select 2, 'Blow', 'Joe', 'Smithson Street 3540', 'HR' union all
select 3, 'Blow', 'Joe', 'Smithson Street 3540', 'D' union all
select 4, 'Blow', 'Joe', 'Smithson Street 3540', 'D' union all
select 5, 'Blow', 'Joe', 'Smithson Street 3540', 'D' union all
select 6, 'Blow', 'Joe', 'Smithson Street 3540', 'HR' union all
select 7, 'Blow', 'Joe', 'Smithson Street 3540', 'F' union all
select 8, 'Blow', 'Joe', 'Smithson Street 3540', 'F' union all
select 9, 'Blow', 'Joe', 'Smithson Street 3540', 'F' union all
select 10, 'Blow', 'Joe', 'Smithson Street 3540', 'HR' union all
select 11, 'Blow', 'Joe', 'Smithson Street 3540', 'HR' union all
select 12, 'Blow', 'Joe', 'Smithson Street 3540', 'HR' union all
select 13, 'Blow', 'Joe', 'Smithson Street 3540', 'HR' union all
select 14, 'Blow', 'Joe', 'Smithson Street 3540', 'HR' union all
select 15, 'Blow', 'Joe', 'Smithson Street 3540', 'A' union all
select 16, 'Blow', 'Joe', 'Smithson Street 3540', 'A' union all
select 17, 'Blow', 'Joe', 'Smithson Street 3540', 'US' union all
select 18, 'Blow', 'Joe', 'Smithson Street 3540', 'HR' union all
select 19, 'Blow', 'Joe', 'Smithson Street 3540', 'GB' union all
select 20, 'Blow', 'Joe', 'Smithson Street 3540', 'GB' union all
select 21, 'Blow', 'Joe', 'Smithson Street 3540', 'GB'
-- the offending code is here: ---------------
declare @newLPN int;
declare @currentID int;
set @newLPN = 0;
set @currentID = 0;
DECLARE cursorlpn CURSOR FOR
select AccountID from Addresses
where country = 'HR';
open cursorlpn
fetch next from cursorlpn into @currentID
while (@@fetch_status = 0)
begin
insert into lpn (deliveryID) values (0);
set @newLPN = scope_identity();
update Addresses set lpn = @newLPN where AccountID = @currentID;
fetch next from cursorlpn into @currentID
end
close cursorlpn
deallocate cursorlpn
November 17, 2013 at 3:34 pm
I can't reproduce the scenario you described.
Based on your sample data the code runs as expected (I'm purposely not saying it runs "fine" 😉 ).
If you already tried several filters (including LOCAL STATIC FORWARD_ONLY) and you still don't want to rewrite it as a set-based solution, you could try to narrow-down the root-cause by assigning a NEWID() to the outer sproc and include that column into the lpn table. Therewith you'd be able to verify that the sproc itself is going crazy or if it's called multiple times.
My question reagrding the process in general:
Why isn't there any WHERE condition to ensure that only rows without an already assigned lpn value will be selected?
What I've seen when runnig your code: If you call it twice, then it will update all rows with country = 'HR' even though those rows already have an lpn value assigned. Is that intentionally?
November 17, 2013 at 3:59 pm
Based on a "gut" feeling and previous experience with "run away updates", try replacing the UPDATE statement in your code with the following and let us know what happens.
UPDATE addr
SET lpn = @newLPN
FROM dbo.Addresses addr
WHERE AccountID = @currentID
;
--Jeff Moden
Change is inevitable... Change for the better is not.
November 17, 2013 at 4:13 pm
Wait a minute, Jeff!
You suggest to "simply" add a FROM clause to the update statement? (The alias isn't mandatory, is it?)
Interesting, what kind of "previous experience" you can refer to... (I'm not sure if I'm supposed to feel glad not to have faced such a situation. Yet!?! ;-))
November 17, 2013 at 5:32 pm
LutzM (11/17/2013)
Wait a minute, Jeff!You suggest to "simply" add a FROM clause to the update statement? (The alias isn't mandatory, is it?)
Interesting, what kind of "previous experience" you can refer to... (I'm not sure if I'm supposed to feel glad not to have faced such a situation. Yet!?! ;-))
Oddly enough, yes... that's pretty much what I'm suggesting. And update the alias like I did instead of the table. It was an undocumented trick in older versions of SQL Server long before the documentation in Books Online caught up with it and it was also the easiest way to do a self-joined update using two aliases for the same table in the FROM clause.
The reason I'm suggesting it is because we discovered, quite by accident, that it can make substantial improvements in performance by giving the optimizer better direction through code. Normally this problem has only appeared on improperly formed JOINed updates where the target table wasn't included in the FROM clause but was included in the WHERE clause. And when I say "appeared", I mean appeared in Spades. My first experience (more years ago) with the problem caused a normally 20 second update to pin 4 CPUs to the proverbial wall for two hours!
A long time ago, certain folks took exception to me blaming this type of problem on "Halloweening" because MS supposedly has code behind the UPDATE statement to prevent it, but it sure does have all the ear marks of "Halloweening" especially since this particular problem in generating thousands of unwanted rows.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 17, 2013 at 6:03 pm
Lutz - I am going to rewrite it set-based, sure.
As I see it, cursors are really meant for interaction between SQL Server
and applications which just cannot understand set-based operations.
Not within a stored procedure.
As I said, I didn't write it, I am having to dig into it as it suddenly blocked
the application.
To your other points - the sproc is not being called multiple times, I have
verified this.
I did execute it directly from Mgmt Studio and it did the same - it never returns for
minutes, meanwhile adding hundreds of thousands of records to the lpn table.
I already created a new procedure with the same code, just to be sure
that the sproc object isn't corrupt in some way. Didn't change a thing.
Good idea to put a WHERE condition.
There SHOULD not be a point trying to filter the records - based on what
the app is doing with it. Yes, if you run this twice, it will update these
records again. (The 'HR' is my rewrite, it is a parameter of the sproc
really). This is "allowable" in the application because t it won't ever
(presumably) call the sproc with the same parameter twice.
Jeff - yes I can put the FROM Addresses in the query, but it doesn't
change the behaviour. It also should not.
I'll solve the problem by rewriting. It still bugs me why on earth this
seemingly innocent code just overnight decided to mess up?
November 17, 2013 at 7:16 pm
Are you SURE nothing changed in the code?
The likely culprit looks to be here:
DECLARE cursorlpn CURSOR FOR
select AccountID from Addresses
where country = 'HR';
I'm making a giant assumption, but how many addresses have the same accountID? Not sure what kind of account this might be- but the accounts where I work have many addresses. Assuming yours did too, then the cursor would "loop" for each duplicated accountID in your addresses table.
To check for duplicates:
select accountID, count(*) from addresses where
country = 'HR'
group by accountid
having count(*)>1
Edit: fixed the missing GROUP BY clause.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
November 17, 2013 at 9:22 pm
Peter Schulz-485500 (11/17/2013)
Jeff - yes I can put the FROM Addresses in the query, but it doesn'tchange the behaviour. It also should not.
I'll solve the problem by rewriting. It still bugs me why on earth this
seemingly innocent code just overnight decided to mess up?
Thanks for the feedback, Peter. It was a shot in the dark from successes for slightly similar problems where such things go crazy on their own.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 18, 2013 at 10:54 am
Is there any way you can provide a setup that could be used to reproduce the odd behavior of the sproc on a standalone machine (meaning that we can test against)?
There has to be a reason for the repetitive call.
And I agree: it's always good to know WHY it occured instead of just making it disappear by "just" rewriting the code...
Viewing 15 posts - 1 through 15 (of 40 total)
You must be logged in to reply to this topic. Login to reply