May 6, 2014 at 8:24 am
venoym (5/6/2014)
Stefan Krzywicki (5/6/2014)
I thought I remembered a discussion here months ago about NOLOCK possibly returning bad data even when there's no changed to the table being queried, but now I can't find it. Am I misremembering?NOLOCK can cause bad data (duplicate and/or missing) even with no changes being made to the table. It is unlikely, but possible. The reasoning mentioned was that SQL can decide to move a page in the middle of your read for Storage Optimization reasons and you would miss that page in the read. Or that you would read the page twice. I don't have the reference, unfortunately, but I think it was Ms. Shaw that pointed that out. I believe that Mr. Moden and maybe Hugo were also in on that discussion.
Awesome, thanks. I was starting to think I'd remembered it all wrong. I read an article about it at the beginning of the year, but haven't been able to find it since. Now I just need to find the reference.
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
May 6, 2014 at 8:24 am
No, it was not me that pointed it out, because for a page split to occur, there must be an insert/update operation occurring. SQL doesn't randomly, when it's bored, move pages around in tables.
If there are no concurrent changes occurring, then page splits cannot occur (they occur when an insert/update doesn't have enough space on the page) and it's page splits which cause the duplicate rows/missing rows under read uncommitted.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 6, 2014 at 8:34 am
GilaMonster (5/6/2014)
No, it was not me that pointed it out, because for a page split to occur, there must be an insert/update operation occurring. SQL doesn't randomly, when it's bored, move pages around in tables.If there are no concurrent changes occurring, then page splits cannot occur (they occur when an insert/update doesn't have enough space on the page) and it's page splits which cause the duplicate rows/missing rows under read uncommitted.
So maybe I did remember it wrong? Maybe it is just that if a split has occurred, NOLOCK can return bad data, even if the split happened awhile ago? From the reading I've been doing if the table is static, there's no point in NOLOCK anyway because locks that it avoids won't be happening?
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
May 6, 2014 at 8:40 am
Isn't the thing with using NOLOCK on a table that isn't being updated that it effectively becomes a complete no-operation? If the table isn't being updated then no locks will be being taken anyway.
May 6, 2014 at 8:40 am
Stefan Krzywicki (5/6/2014)
Maybe it is just that if a split has occurred, NOLOCK can return bad data, even if the split happened awhile ago?
A scan under read uncommitted can return incorrect information if a split occurs *during* the scan because the scan is in allocation order and the new page can have resulted in rows being moved from in front of the current point of the scan to behind it or vis versa.
Can't occur in any other isolation level, because in any other isolation level allocation order scans are only permitted if a full table lock is held (and hence pages can't be splitting)
From the reading I've been doing if the table is static, there's no point in NOLOCK anyway because locks that it avoids won't be happening?
Pretty much. You avoid the cost of requesting locks, but that shouldn't be a noticeable cost.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 6, 2014 at 8:47 am
GilaMonster (5/6/2014)
Stefan Krzywicki (5/6/2014)
Maybe it is just that if a split has occurred, NOLOCK can return bad data, even if the split happened awhile ago?A scan under read uncommitted can return incorrect information if a split occurs *during* the scan because the scan is in allocation order and the new page can have resulted in rows being moved from in front of the current point of the scan to behind it or vis versa.
Can't occur in any other isolation level, because in any other isolation level allocation order scans are only permitted if a full table lock is held (and hence pages can't be splitting)
From the reading I've been doing if the table is static, there's no point in NOLOCK anyway because locks that it avoids won't be happening?
Pretty much. You avoid the cost of requesting locks, but that shouldn't be a noticeable cost.
I found the article I'd been looking for (never deleting email is helpful sometimes)
It talks about previously committed rows being missed even if they were committed "a long time before" the current NOLOCK query. Now, what "a long time" means, I don't know, but is would seem to be a non-trivial increment.
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
May 6, 2014 at 8:58 am
It's exactly what I explained above. Existing rows in the table (previously committed) being missed because the page those existing rows is on is splitting due to an insert/update occuring.
"Long time" being anything from a ms ago to when the table was created.
It's to distinguish from this anomaly from 'dirty reads' which is when uncommitted data is being read.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 6, 2014 at 8:59 am
Stefan Krzywicki (5/6/2014)
GilaMonster (5/6/2014)
Stefan Krzywicki (5/6/2014)
Maybe it is just that if a split has occurred, NOLOCK can return bad data, even if the split happened awhile ago?A scan under read uncommitted can return incorrect information if a split occurs *during* the scan because the scan is in allocation order and the new page can have resulted in rows being moved from in front of the current point of the scan to behind it or vis versa.
Can't occur in any other isolation level, because in any other isolation level allocation order scans are only permitted if a full table lock is held (and hence pages can't be splitting)
From the reading I've been doing if the table is static, there's no point in NOLOCK anyway because locks that it avoids won't be happening?
Pretty much. You avoid the cost of requesting locks, but that shouldn't be a noticeable cost.
I found the article I'd been looking for (never deleting email is helpful sometimes)
It talks about previously committed rows being missed even if they were committed "a long time before" the current NOLOCK query. Now, what "a long time" means, I don't know, but is would seem to be a non-trivial increment.
Here some articles about NOLOCK that I reference frequently.
http://sqlblogcasts.com/blogs/tonyrogerson/archive/2006/11/10/1280.aspx
http://blogs.msdn.com/b/davidlean/archive/2009/04/06/sql-server-nolock-hint-other-poor-ideas.aspx
http://www.jasonstrate.com/2012/06/the-side-effect-of-nolock/[/url]
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
May 6, 2014 at 9:03 am
GilaMonster (5/6/2014)
It's exactly what I explained above. Existing rows in the table (previously committed) being missed because the page those existing rows is on is splitting due to an insert/update occuring."Long time" being anything from a ms ago to when the table was created.
It's to distinguish from this anomaly from 'dirty reads' which is when uncommitted data is being read.
Thanks
I'm trying to make sure I understand it well enough to explain it because I'm trying to convince people to stop using it. They always have a ton of reasons why they think it is good or at least doesn't hurt anything. I want to be able to respond to all of these statements with reasons and with links to articles or blogs if possible.
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
May 6, 2014 at 9:04 am
Sean Lange (5/6/2014)
Stefan Krzywicki (5/6/2014)
GilaMonster (5/6/2014)
Stefan Krzywicki (5/6/2014)
Maybe it is just that if a split has occurred, NOLOCK can return bad data, even if the split happened awhile ago?A scan under read uncommitted can return incorrect information if a split occurs *during* the scan because the scan is in allocation order and the new page can have resulted in rows being moved from in front of the current point of the scan to behind it or vis versa.
Can't occur in any other isolation level, because in any other isolation level allocation order scans are only permitted if a full table lock is held (and hence pages can't be splitting)
From the reading I've been doing if the table is static, there's no point in NOLOCK anyway because locks that it avoids won't be happening?
Pretty much. You avoid the cost of requesting locks, but that shouldn't be a noticeable cost.
I found the article I'd been looking for (never deleting email is helpful sometimes)
It talks about previously committed rows being missed even if they were committed "a long time before" the current NOLOCK query. Now, what "a long time" means, I don't know, but is would seem to be a non-trivial increment.
Here some articles about NOLOCK that I reference frequently.
http://sqlblogcasts.com/blogs/tonyrogerson/archive/2006/11/10/1280.aspx
http://blogs.msdn.com/b/davidlean/archive/2009/04/06/sql-server-nolock-hint-other-poor-ideas.aspx
http://www.jasonstrate.com/2012/06/the-side-effect-of-nolock/[/url]
Thanks, I didn't have some of these and they'll come in handy.
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
May 6, 2014 at 11:49 am
GilaMonster (5/6/2014)
No, it was not me that pointed it out, because for a page split to occur, there must be an insert/update operation occurring. SQL doesn't randomly, when it's bored, move pages around in tables.If there are no concurrent changes occurring, then page splits cannot occur (they occur when an insert/update doesn't have enough space on the page) and it's page splits which cause the duplicate rows/missing rows under read uncommitted.
Thank you for correcting me 🙂 I do appreciate it, I had read something and wasn't 100% sure.
May 6, 2014 at 12:11 pm
Stefan Krzywicki (5/6/2014)
I'm trying to make sure I understand it well enough to explain it because I'm trying to convince people to stop using it. They always have a ton of reasons why they think it is good or at least doesn't hurt anything.
I have no problem at all with people that use the read uncommitted isolation level, so long as it is an informed choice. I do have a serious problem with people that add WITH (NOLOCK) to every table in a query. We would probably see this practice a lot less if the NOLOCK synonym for READUNCOMMITTED did not exist. It is an isolation level hint, not an instruction to take no locks during processing! Backward compatibility has a lot to answer for.
There are five perfectly valid isolation levels in SQL Server, each of which provides a different protections against concurrency effects. READ UNCOMMITTED is the lowest of these, so it should come as no surprise that anomalies are most prevalent there. In addition, SQL Server has some specific behaviours under READUNCOMMITTED that arguably go beyond the spirit of reading uncommitted data. I find the prospect of reading inconsistent parts of a LOB value (like an xml column) and error 601 particularly troubling in this regard.
To repeat what Gail said, a page split may move other rows on the same page, not just the row that caused the split. A concurrent allocation order scan might miss these rows or encounter them multiple times. The unlucky rows are innocent bystanders in this scenario - they might never have changed since they were first added.
That said, not all scans under READUNCOMMITTED are allocation order scans. They have to be expected to be > 64 pages, and the query processor must not have requested an ordered scan (plan property Ordered:True).
Reading rows multiple times and/or missing them is not unique to READUNCOMMITTED, though the page-splitting versus allocation order scan instance of the phenomenon is. We can multi-read or miss rows under locking READCOMMITTED, and miss rows under REPEATABLEREAD. All you have to do is read from an index structure that is experiencing concurrent key modifications.
I'm not sure what to make of the statement that NOLOCK is redundant if the data is not changing. It is true that row-level shared locks might be skipped by the storage engine if it knows for sure the page has not changed, but higher-level intent-share locks are still taken. The main way to ensure no locks are taken is to make the data explicitly read only.
Queries that need point-in-time consistency (typically reports) should use SNAPSHOT or SERIALIZABLE. If the point-in-time requirement can be limited to a single statement, RCSI is enough. None of the other isolation levels will do.
The default (locking READCOMMITTED) is considerably weaker than most people appreciate. READUNCOMMITTED is weaker yet, but not by the large amount you might be led to think by reading popular NOLOCK-is-evil writings.
WITH (NOLOCK) is indeed dumb (and possibly evil). SET TRANSACTION ISOLATION LEVEL READUNCOMMITTED is not, at least not necessarily.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
May 6, 2014 at 12:28 pm
Paul White (5/6/2014)
Stefan Krzywicki (5/6/2014)
I'm trying to make sure I understand it well enough to explain it because I'm trying to convince people to stop using it. They always have a ton of reasons why they think it is good or at least doesn't hurt anything.I have no problem at all with people that use the read uncommitted isolation level, so long as it is an informed choice. I do have a serious problem with people that add WITH (NOLOCK) to every table in a query. We would probably see this practice a lot less if the NOLOCK synonym for READUNCOMMITTED did not exist. It is an isolation level hint, not an instruction to take no locks during processing! Backward compatibility has a lot to answer for.
There are five perfectly valid isolation levels in SQL Server, each of which provides a different protections against concurrency effects. READ UNCOMMITTED is the lowest of these, so it should come as no surprise that anomalies are most prevalent there. In addition, SQL Server has some specific behaviours under READUNCOMMITTED that arguably go beyond the spirit of reading uncommitted data. I find the prospect of reading inconsistent parts of a LOB value (like an xml column) and error 601 particularly troubling in this regard.
To repeat what Gail said, a page split may move other rows on the same page, not just the row that caused the split. A concurrent allocation order scan might miss these rows or encounter them multiple times. The unlucky rows are innocent bystanders in this scenario - they might never have changed since they were first added.
That said, not all scans under READUNCOMMITTED are allocation order scans. They have to be expected to be > 64 pages, and the query processor must not have requested an ordered scan (plan property Ordered:True).
Reading rows multiple times and/or missing them is not unique to READUNCOMMITTED, though the page-splitting versus allocation order scan instance of the phenomenon is. We can multi-read or miss rows under locking READCOMMITTED, and miss rows under REPEATABLEREAD. All you have to do is read from an index structure that is experiencing concurrent key modifications.
I'm not sure what to make of the statement that NOLOCK is redundant if the data is not changing. It is true that row-level shared locks might be skipped by the storage engine if it knows for sure the page has not changed, but higher-level intent-share locks are still taken. The main way to ensure no locks are taken is to make the data explicitly read only.
Queries that need point-in-time consistency (typically reports) should use SNAPSHOT or SERIALIZABLE. If the point-in-time requirement can be limited to a single statement, RCSI is enough. None of the other isolation levels will do.
The default (locking READCOMMITTED) is considerably weaker than most people appreciate. READUNCOMMITTED is weaker yet, but not by the large amount you might be led to think by reading popular NOLOCK-is-evil writings.
WITH (NOLOCK) is indeed dumb (and possibly evil). SET TRANSACTION ISOLATION LEVEL READUNCOMMITTED is not, at least not necessarily.
Thank You.
What I'm trying to argue against is indeed NOLOCK hints on every table in the query. And reasoning of "It'll make it faster" with no analysis and no checks to see if any blocking is happening.
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
May 6, 2014 at 12:44 pm
SQLRNNR (5/6/2014)
hisakimatama (5/1/2014)
Whew, busy times at work lately. I've been out of The Thread for a good bit :-).Anyone else from The Thread planning to attend this week's SQL Saturday in Atlanta? I noticed that Grant and Jason are both presenting a session there; I'll definitely be attending both! It's the first time I'll be at a programming-based activity, so I'm giddy about loading up with plenty of knowledge 😀
Better late than never - but hope you enjoyed the event.
Hope you enjoyed my session as well. Not sure if we had a chance to meet. Plenty of people stopping me.
Oh, I quite enjoyed your session also :-D. Very informative, and Wayne's car hunt in the middle of it was thoroughly amusing. I would've stopped by afterwards for a chat, but my brain was pretty much mush at that point; I'd hardly slept the night before (very noisy hotel neighbors, argh!), and I lucked out on forming a coherent sentence when I talked to Grant. I figured I shouldn't risk making myself appear to be a complete buffoon, since I'm already partially one :-P.
- 😀
May 6, 2014 at 6:10 pm
Stefan Krzywicki (5/6/2014)
GilaMonster (5/6/2014)
No, it was not me that pointed it out, because for a page split to occur, there must be an insert/update operation occurring. SQL doesn't randomly, when it's bored, move pages around in tables.If there are no concurrent changes occurring, then page splits cannot occur (they occur when an insert/update doesn't have enough space on the page) and it's page splits which cause the duplicate rows/missing rows under read uncommitted.
So maybe I did remember it wrong? Maybe it is just that if a split has occurred, NOLOCK can return bad data, even if the split happened awhile ago? From the reading I've been doing if the table is static, there's no point in NOLOCK anyway because locks that it avoids won't be happening?
Wasn't there also some sort of issue across partitioned tables?
Perhaps I'm remembering it wrong.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
Viewing 15 posts - 43,846 through 43,860 (of 66,688 total)
You must be logged in to reply to this topic. Login to reply