May 11, 2010 at 9:53 am
Would it be possible for the following to generate a lower RowID for a higher ID? For example:
RowID ID
1 1000
2 800
create table #TempAllIDs(RowID int identity, ID numeric(16,0))
insert into #TempAllIDs(ID) select ID from Table1 (nolock) order by ID
I know we've seen duplicate identity values on another query without an order by clause due to parallelism and I'm wondering if it can cause the order by to not operate as expected. The server I'm looking into now has 8 processors and the max degree of parallelism set to 0. There is an existing clustered, primary key on Table1.ID and right before the temp table is created we're creating a nonclustered index on Table1.ID for some reason I haven't looked into yet.
For those that want the full story we ended up with unexpected values after an update statement. #TempMapTable1Status is populated by the old and new status tables. It happens on most servers this is run on that the ID for the old "Complete" status corresponds to the ID for the new "Error"status but we've only noticed this on two servers. It appears that the status for the completed entries didn't get updated. However, there were five other statuses that completed fine including the old "Error" status to the new "Error" status.
create nonclustered index tempNDXUpgradeID on Table1(ID)
create table #TempAllIDs(RowID int identity, ID numeric(16,0))
insert into #TempAllIDs(ID) select ID from Table1 (nolock) order by ID
create index #ndx1 on #TempAllIDs(RowID, ID)
print 'Created temporary indexes'
select @NumTotalRows = count(*) from #TempAllIDs (nolock)
select @StartRow = 1
select @EndRow = 50000 --@StartRow + 50000
while @StillLoop = 'Y'
begin
If exists (select ID from #TempAllIDs (nolock) where RowID = @StartRow)
select @StartID = ID from #TempAllIDs (nolock) where RowID = @StartRow
Else
Select @StartID = max(ID) + 1 from #TempAllIDs
If exists (select ID from #TempAllIDs (nolock) where RowID = @EndRow)
select @EndID = ID from #TempAllIDs (nolock) where RowID = @EndRow
Else
Select @EndID = max(ID) + 1 from #TempAllIDs
begin transaction
update Table1
set Table1StatusDE = (select NewTable1StatusDE from #TempMapTable1Status (nolock) where Table1StatusDE = CurrentTable1StatusDE)
where ID between @StartID and @EndID
and Table1StatusDE in (select CurrentTable1StatusDE from #TempMapTable1Status (nolock))
if @@error = 0
commit transaction
else
goto ErrorOccurred
print 'Updated the Table1 table'
select @StartRow = @StartRow + 50000
select @EndRow = @EndRow + 50000
if @StartRow > @NumTotalRows
select @StillLoop = 'N'
end
We already know how to guarantee order between RowID and ID but don't want to just do that and go on our way until this happens again in case that's not the cause. If there is something else causing it we need to know and correct it.
And since I've noticed a lot of people insisting on test data off the bat, the data I'm working with needs to stay confidential so it's not going to be likely to be worth the effort of cleaning it for posting unless it's absolutely necessary.
May 11, 2010 at 11:15 am
cfradenburg (5/11/2010)
Would it be possible for the following to generate a lower RowID for a higher ID? For example:RowID ID
1 1000
2 800
create table #TempAllIDs(RowID int identity, ID numeric(16,0))
insert into #TempAllIDs(ID) select ID from Table1 (nolock) order by ID
I know we've seen duplicate identity values on another query without an order by clause due to parallelism and I'm wondering if it can cause the order by to not operate as expected. <snip>
Did you know that using NOLOCK can return duplicate rows? That could be a source of the issue pertaining to duplicate entries.
May 11, 2010 at 11:21 am
Duplicate rows if something else is working in the table? Yep but there's no way that's the case where we saw the duplicate entries. It was a development environment dedicated to one project that very few people were working on and the guy that saw the dupes was seeing them consistently until he set max dop on the statement to 1.
In the case we're seeing that I think may be out of order identity entries was during an upgrade of our product and the upgrade techs lock everyone out of the system and the upgrade scripts are run sequentially.
May 11, 2010 at 12:30 pm
I wouldn't be too sure that the NOLOCK was not the cause of an issue. It is a fallacy that NOLOCK only gets bad data when there are multiple users and/or uncommitted data. Additionally, even READ COMMITTED can get bad data due to data movement.
Here are some links about the effect of using NOLOCK:
http://www.sqlmag.com/Articles/Index.cfm?ArticleID=92888&DisplayTab=Article
http://sqlblog.com/blogs/andrew_kelly/archive/2009/04/10/how-dirty-are-your-reads.aspx
May 11, 2010 at 12:47 pm
Thanks for the links. I haven't had a chance to finish them but I do see your concerns and can start to piece together how that might have caused this. Due to the nature of this script removing the nolock wouldn't have any possible bad side effects so we'll probably be making that change (at least I'll be recommending it.)
Any other thoughts on possible causes?
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply