December 28, 2011 at 9:58 am
GSquared (12/28/2011)No, I agree with you. In a database that's loaded purely by a scheduled ETL process (or more than one), there's no reason to worry about dirty reads, except during the load process. Lock the database (single-user) during that time period, and you can set all the queries in it to Read Uncommitted isolation level, and you won't hurt anything.
Generally speaking, you won't help anything either, because read locks can share resources, so all you're really doing by allowing NoLock/RU is making that explicit instead of implicit.
...
Any of these are just ways to avoid blocking issues in OLAP. But snapshot isolation shouldn't really be needed in OLAP, since snapshot isolation is about allowing optimistic read locking to avoid conflicts with concurrent, frequent updates. OLAP databases generally don't have a lot of update concurrency with their reads. That's the whole point of OLAP, actually.
As a theoretical point, I agree with you. As I think about my particular situation, though, I realize that I don't really have an OLAP data warehouse. I don't have an OLTP database, either. Nor is it a Star-Schema data-mart. Actually, I have what is typically known as a BBOS (Big Ball of String). :pinch:
I'd like to use Alexander's solution to the Gordian Knot problem, but I'm afraid that is not an option. :pinch:
My head is really beginning to ache.
December 28, 2011 at 9:59 am
Michael Valentine Jones (12/28/2011)
I don't see how RCS would be a problem in a data warehouse\datamart application.If there are no inserts/updates/deletes happening, then there are no row versions being generated in the version store in tempdb, so there would be no additional load there. Select queries would not create locks, except for a schema stability lock.
If the data warehouse is being updated while queries are being run, then single select statements would be able to run with consistent results without being blocked. For multiple statement queries, then SNAPSHOT isolation could be used to maintain consistency without being blocked. In either case, the consistency of the results would be far better than using NOLOCK, which is the current situation.
It's not that it would be a problem, it's that it's not likely to be a solution.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
December 28, 2011 at 10:05 am
David Moutray (12/28/2011)
Dev (12/28/2011)Let me point you to another David...
http://www.sqlservercentral.com/Forums/Topic1225577-391-1.aspx
David Portas makes a good point in his post. (I was following that thread until you guys started talking about drawers. The analogies became too thick for me to breathe properly. :-))
I think I might enjoy working with a normalized data warehouse. I think Star-Schema is also a valid model, in that it is optimized for reporting.
The data warehouse with which I currently work does not follow any coherent model. Therein lies my problem. :pinch:
David supports David. Let me find another Dev :pinch:
December 28, 2011 at 10:07 am
GSquared (12/28/2011)
...What they really mean is, "this isn't a database that we know what we're talking about, so shut up and leave us in our ignorant bliss!"
Well said.
December 28, 2011 at 10:09 am
David Moutray (12/28/2011)
GSquared (12/28/2011)No, I agree with you. In a database that's loaded purely by a scheduled ETL process (or more than one), there's no reason to worry about dirty reads, except during the load process. Lock the database (single-user) during that time period, and you can set all the queries in it to Read Uncommitted isolation level, and you won't hurt anything.
Generally speaking, you won't help anything either, because read locks can share resources, so all you're really doing by allowing NoLock/RU is making that explicit instead of implicit.
...
Any of these are just ways to avoid blocking issues in OLAP. But snapshot isolation shouldn't really be needed in OLAP, since snapshot isolation is about allowing optimistic read locking to avoid conflicts with concurrent, frequent updates. OLAP databases generally don't have a lot of update concurrency with their reads. That's the whole point of OLAP, actually.
As a theoretical point, I agree with you. As I think about my particular situation, though, I realize that I don't really have an OLAP data warehouse. I don't have an OLTP database, either. Nor is it a Star-Schema data-mart. Actually, I have what is typically known as a BBOS (Big Ball of String). :pinch:
I'd like to use Alexander's solution to the Gordian Knot problem, but I'm afraid that is not an option. :pinch:
My head is really beginning to ache.
Yeah, the pull-out-a-sword-and-start-cutting-things-in-half solution has its temptations, that's for sure. Just keep in mind, doing so on the devs could have negative repurcussions beyond the magnitude of "totally worth it". 🙂
I'm gathering it's a semi-transactional EAV store that you're dealing with. That and the NoLock addiction speak to a database designed and implemented by OO devs who not only don't know RDB basics, but consider relational databases as a major headache and one of the biggest impediments to getting their real job done. They don't know about RDBs, and they don't want to know.
The only real solution to that is behind the scenes refactoring. Even that can only be done if the database APIs are stored procedures, not inline code. In most EAV datastores, the API is inline code, so you can't even refactor that, and are left with no real solution beyond telling managers to hire a better class of devs.
If you can get management backup, you can start forcing refactoring that will allow you to repair the performance and integrity issues in the data layer. You'll have to do some homework, proving that alternative solutions (like actual OLAP or OLTP data structures, instead of EAV) will perform better, and so on. There's a lot of data available online about EAV or "one true lookup table" solutions failing horribly, and you can use those articles to back up some real metrics. But you'll need actual metrics first.
Then it'll be up to management/executive bodies to decide whether to spend the time/money/opportunity costs/et al, to fix those metrics. If they do, you're golden. If they don't, or if you can't build the metrics in the first place, you'll have career decisions to make about the whole thing. Data structures and performance won't be your problem if you're just a database admin, and you can migrate that way with your current employer. Or try to find greener pastures.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
December 28, 2011 at 1:10 pm
GSquared (12/28/2011)
Michael Valentine Jones (12/28/2011)
I don't see how RCS would be a problem in a data warehouse\datamart application.If there are no inserts/updates/deletes happening, then there are no row versions being generated in the version store in tempdb, so there would be no additional load there. Select queries would not create locks, except for a schema stability lock.
If the data warehouse is being updated while queries are being run, then single select statements would be able to run with consistent results without being blocked. For multiple statement queries, then SNAPSHOT isolation could be used to maintain consistency without being blocked. In either case, the consistency of the results would be far better than using NOLOCK, which is the current situation.
It's not that it would be a problem, it's that it's not likely to be a solution.
Well, the title of the thread is the following, so that is what I was addressing:
"Is There Any Reason I Should NOT enable READ_COMMITTED_SNAPSHOT?"
But as I pointed out, there are specific areas where it would be an advantage:
There would not be a need to use NOLOCK hints.
It would allow transactionally consistent queries to run without being blocked while load operations are running.
If the database does have EVA tables, RCS would be the first solution I would try, since EVA is notorious for massive blocking (among a thousand other serious disadvantages).
December 28, 2011 at 1:35 pm
Michael Valentine Jones (12/28/2011)
GSquared (12/28/2011)
Michael Valentine Jones (12/28/2011)
I don't see how RCS would be a problem in a data warehouse\datamart application.If there are no inserts/updates/deletes happening, then there are no row versions being generated in the version store in tempdb, so there would be no additional load there. Select queries would not create locks, except for a schema stability lock.
If the data warehouse is being updated while queries are being run, then single select statements would be able to run with consistent results without being blocked. For multiple statement queries, then SNAPSHOT isolation could be used to maintain consistency without being blocked. In either case, the consistency of the results would be far better than using NOLOCK, which is the current situation.
It's not that it would be a problem, it's that it's not likely to be a solution.
Well, the title of the thread is the following, so that is what I was addressing:
"Is There Any Reason I Should NOT enable READ_COMMITTED_SNAPSHOT?"
But as I pointed out, there are specific areas where it would be an advantage:
There would not be a need to use NOLOCK hints.
It would allow transactionally consistent queries to run without being blocked while load operations are running.
If the database does have EVA tables, RCS would be the first solution I would try, since EVA is notorious for massive blocking (among a thousand other serious disadvantages).
I consider a non-solution to be a problem itself.
It requires work to implement (opportunity cost at the very least). There's an expectation that it solves something and expectation-bias will cause judgement of it to be rosier than it should be, thus delaying actual solutions being implemented until that wears off. When leadership is lead to believe it will solve something, and then it doesn't, this causes later, actual solutions, to be harder to "sell". All those are liabilities to placebo solutions. There are others, but those by themselves are enough to make me warn away from such things.
If they're already using NoLock, then "solving" concurency on read access won't be likely to have any measurable effect on performance. It will potentially help with dirty reads, and with errors caused by page splits/moves. But snapshots can cause their own problems with a version of dirty reads, whereby data that is in the process of being deleted or update is still readable by other connections for the duration of the transaction. Since EAV tables often require very long transactions, all this does is move from one type of dirty read to another type. Again, a non-solution, or at best a placebo, with the costs I listed above.
Time spent on placebos is time that is permanently lost.
Even uglier can be the conversation with management that goes, "You told us this first thing would help, and we backed you up on getting it done over objections from our other developers and even your own manager. Now you say the whole project accomplished nothing useful and you want to try something else. Please explain why we should back you up, again, and pay you to do a project twice." The OP has already indicated that there's push-back from devs and at least one manager on this whole project, if I'm reading his posts correctly. Considering this consequence, I'd want to be very, very sure the project would be a stellar success and not a fizzle that needs to be redone in a whole different manner. Do it right, and management will have your back forever. Do it wrong, and you may as well resign right then and there.
Given that, I consider a non-solution a potential career-affecting move, with potentially serious negative consequences.
Are you sure you would still advocate in favor of it, considering beyond just the technical ramifications?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
December 28, 2011 at 2:29 pm
GSquared (12/28/2011)
Michael Valentine Jones (12/28/2011)
GSquared (12/28/2011)
Michael Valentine Jones (12/28/2011)
I don't see how RCS would be a problem in a data warehouse\datamart application.If there are no inserts/updates/deletes happening, then there are no row versions being generated in the version store in tempdb, so there would be no additional load there. Select queries would not create locks, except for a schema stability lock.
If the data warehouse is being updated while queries are being run, then single select statements would be able to run with consistent results without being blocked. For multiple statement queries, then SNAPSHOT isolation could be used to maintain consistency without being blocked. In either case, the consistency of the results would be far better than using NOLOCK, which is the current situation.
It's not that it would be a problem, it's that it's not likely to be a solution.
Well, the title of the thread is the following, so that is what I was addressing:
"Is There Any Reason I Should NOT enable READ_COMMITTED_SNAPSHOT?"
But as I pointed out, there are specific areas where it would be an advantage:
There would not be a need to use NOLOCK hints.
It would allow transactionally consistent queries to run without being blocked while load operations are running.
If the database does have EVA tables, RCS would be the first solution I would try, since EVA is notorious for massive blocking (among a thousand other serious disadvantages).
I consider a non-solution to be a problem itself.
It requires work to implement (opportunity cost at the very least). There's an expectation that it solves something and expectation-bias will cause judgement of it to be rosier than it should be, thus delaying actual solutions being implemented until that wears off. When leadership is lead to believe it will solve something, and then it doesn't, this causes later, actual solutions, to be harder to "sell". All those are liabilities to placebo solutions. There are others, but those by themselves are enough to make me warn away from such things.
If they're already using NoLock, then "solving" concurency on read access won't be likely to have any measurable effect on performance. It will potentially help with dirty reads, and with errors caused by page splits/moves. But snapshots can cause their own problems with a version of dirty reads, whereby data that is in the process of being deleted or update is still readable by other connections for the duration of the transaction. Since EAV tables often require very long transactions, all this does is move from one type of dirty read to another type. Again, a non-solution, or at best a placebo, with the costs I listed above.
Time spent on placebos is time that is permanently lost.
Even uglier can be the conversation with management that goes, "You told us this first thing would help, and we backed you up on getting it done over objections from our other developers and even your own manager. Now you say the whole project accomplished nothing useful and you want to try something else. Please explain why we should back you up, again, and pay you to do a project twice." The OP has already indicated that there's push-back from devs and at least one manager on this whole project, if I'm reading his posts correctly. Considering this consequence, I'd want to be very, very sure the project would be a stellar success and not a fizzle that needs to be redone in a whole different manner. Do it right, and management will have your back forever. Do it wrong, and you may as well resign right then and there.
Given that, I consider a non-solution a potential career-affecting move, with potentially serious negative consequences.
Are you sure you would still advocate in favor of it, considering beyond just the technical ramifications?
Addressing the one technical statement you just made, I disagree that a snapshot (row version isolation) allows "dirty reads". They present a transactionally consistent view of the committed data at the point in time in time that you start the query, as opposed to being blocked until the insert/update/delete operation completes to give a transactionally consistent view of the committed data at that point in time.
Setting snapshot isolation and read_committed_snapshot on is a very low cost project taking a couple of minutes to complete, so I don’t see much potential for management disappointment. I never said it will solve every problem in that database, so I don’t see the relevance of the rest of your post, and don’t really want to get involved in discussion in some imaginary internal political struggle that neither of us really knows anything about.
December 28, 2011 at 3:13 pm
RCSI wouldn't permit dirty reads, but it would permit non-repeatable reads and phantom reads. Since the required overhead for the Temp DB is significant, I want to have some confidence that RCSI would provide at least some benefit.
It actually sounds as though I would be better off simply forbidding WITH(NOLOCK) and fixing any resulting performance issues as they arise.
And, yes, WITH(NOLOCK) has created problems. There are ETL processes that run throughout the day. One process had been failing intermittently for years. WITH(NOLOCK) was the cause. (Finding that did make me look pretty good. :-))
December 28, 2011 at 4:40 pm
David Moutray (12/28/2011)
RCSI wouldn't permit dirty reads, but it would permit non-repeatable reads and phantom reads. Since the required overhead for the Temp DB is significant, I want to have some confidence that RCSI would provide at least some benefit.It actually sounds as though I would be better off simply forbidding WITH(NOLOCK) and fixing any resulting performance issues as they arise.
And, yes, WITH(NOLOCK) has created problems. There are ETL processes that run throughout the day. One process had been failing intermittently for years. WITH(NOLOCK) was the cause. (Finding that did make me look pretty good. :-))
RCS and most other isolation levels allow non-repeatable reads.
I disagree that is allows "phantom reads" if by that you mean that you will be able to see data that has not been committed.
I think the advantage of RCS in your situation is that your read queries will not be blocked, which is probably why developers were resorting to NOLOCK to begin with. Compounded with the notorious blocking problems with EAV designs, you don't have many solutions to try that will be as easy to implement as RCS.
The impact on tempdb may not be that serious an issue, but that is something you need look at for yourself. I have never seen issues with tempdb performance on any server where I have implemented RCS, but I am not running your application.
I have almost always seen better application performance with RCS, and have never seen worse performance. It also avoids most deadlocks if that is an issue for you.
December 29, 2011 at 6:26 am
David Moutray (12/28/2011)
RCSI wouldn't permit dirty reads, but it would permit non-repeatable reads and phantom reads. Since the required overhead for the Temp DB is significant, I want to have some confidence that RCSI would provide at least some benefit.It actually sounds as though I would be better off simply forbidding WITH(NOLOCK) and fixing any resulting performance issues as they arise.
And, yes, WITH(NOLOCK) has created problems. There are ETL processes that run throughout the day. One process had been failing intermittently for years. WITH(NOLOCK) was the cause. (Finding that did make me look pretty good. :-))
You might note that I defined exactly what I meant by "not real dirty reads, but a related problem nonetheless" in my post.
Good on finding that issue with NoLock.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
December 29, 2011 at 6:33 am
Michael Valentine Jones (12/28/2011)
GSquared (12/28/2011)
Michael Valentine Jones (12/28/2011)
GSquared (12/28/2011)
Michael Valentine Jones (12/28/2011)
I don't see how RCS would be a problem in a data warehouse\datamart application.If there are no inserts/updates/deletes happening, then there are no row versions being generated in the version store in tempdb, so there would be no additional load there. Select queries would not create locks, except for a schema stability lock.
If the data warehouse is being updated while queries are being run, then single select statements would be able to run with consistent results without being blocked. For multiple statement queries, then SNAPSHOT isolation could be used to maintain consistency without being blocked. In either case, the consistency of the results would be far better than using NOLOCK, which is the current situation.
It's not that it would be a problem, it's that it's not likely to be a solution.
Well, the title of the thread is the following, so that is what I was addressing:
"Is There Any Reason I Should NOT enable READ_COMMITTED_SNAPSHOT?"
But as I pointed out, there are specific areas where it would be an advantage:
There would not be a need to use NOLOCK hints.
It would allow transactionally consistent queries to run without being blocked while load operations are running.
If the database does have EVA tables, RCS would be the first solution I would try, since EVA is notorious for massive blocking (among a thousand other serious disadvantages).
I consider a non-solution to be a problem itself.
It requires work to implement (opportunity cost at the very least). There's an expectation that it solves something and expectation-bias will cause judgement of it to be rosier than it should be, thus delaying actual solutions being implemented until that wears off. When leadership is lead to believe it will solve something, and then it doesn't, this causes later, actual solutions, to be harder to "sell". All those are liabilities to placebo solutions. There are others, but those by themselves are enough to make me warn away from such things.
If they're already using NoLock, then "solving" concurency on read access won't be likely to have any measurable effect on performance. It will potentially help with dirty reads, and with errors caused by page splits/moves. But snapshots can cause their own problems with a version of dirty reads, whereby data that is in the process of being deleted or update is still readable by other connections for the duration of the transaction. Since EAV tables often require very long transactions, all this does is move from one type of dirty read to another type. Again, a non-solution, or at best a placebo, with the costs I listed above.
Time spent on placebos is time that is permanently lost.
Even uglier can be the conversation with management that goes, "You told us this first thing would help, and we backed you up on getting it done over objections from our other developers and even your own manager. Now you say the whole project accomplished nothing useful and you want to try something else. Please explain why we should back you up, again, and pay you to do a project twice." The OP has already indicated that there's push-back from devs and at least one manager on this whole project, if I'm reading his posts correctly. Considering this consequence, I'd want to be very, very sure the project would be a stellar success and not a fizzle that needs to be redone in a whole different manner. Do it right, and management will have your back forever. Do it wrong, and you may as well resign right then and there.
Given that, I consider a non-solution a potential career-affecting move, with potentially serious negative consequences.
Are you sure you would still advocate in favor of it, considering beyond just the technical ramifications?
Addressing the one technical statement you just made, I disagree that a snapshot (row version isolation) allows "dirty reads". They present a transactionally consistent view of the committed data at the point in time in time that you start the query, as opposed to being blocked until the insert/update/delete operation completes to give a transactionally consistent view of the committed data at that point in time.
Setting snapshot isolation and read_committed_snapshot on is a very low cost project taking a couple of minutes to complete, so I don’t see much potential for management disappointment. I never said it will solve every problem in that database, so I don’t see the relevance of the rest of your post, and don’t really want to get involved in discussion in some imaginary internal political struggle that neither of us really knows anything about.
Really?
You really think that modify the whole way that transactionality, locking/blocking/deadlocking is handled in the whole database, in a production environment that's already in use, is a few minute project with no real cost to it?
Just make sure you stay away from any actual DBA jobs if you really think that major a change, on the fly, without the necessary homework to make sure it won't have negative unintended consequences or side-effects, is a good plan.
Yeah, setting the actual command is a few seconds of work. Set the database to single-user (2 seconds of typing, if that), issue the alter database command (another 2 seconds), and return to multi-user. Execute that script, wait for any rollbacks to complete (if you chose that option) or any transactions to commit, and you're probably looking at about half a minute's work, tops. But that's like driving with your eyes shut, when it comes to the actual job of a DBA.
Plus, it ignores all the refactoring needed to take those NoLocks out of all existing code, all the testing (regression and load testing at least) to make sure all that refactored code still does what it's supposed to in the changed environment, and the necessary roll-forward through QA and test environments into production, for all of that.
So, honestly, and somewhat brutally perhaps, I have to say, if you really think that way, you need a different career.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
December 29, 2011 at 8:27 am
GSquared (12/29/2011)
Michael Valentine Jones (12/28/2011)
GSquared (12/28/2011)
Michael Valentine Jones (12/28/2011)
GSquared (12/28/2011)
Michael Valentine Jones (12/28/2011)
I don't see how RCS would be a problem in a data warehouse\datamart application.If there are no inserts/updates/deletes happening, then there are no row versions being generated in the version store in tempdb, so there would be no additional load there. Select queries would not create locks, except for a schema stability lock.
If the data warehouse is being updated while queries are being run, then single select statements would be able to run with consistent results without being blocked. For multiple statement queries, then SNAPSHOT isolation could be used to maintain consistency without being blocked. In either case, the consistency of the results would be far better than using NOLOCK, which is the current situation.
It's not that it would be a problem, it's that it's not likely to be a solution.
Well, the title of the thread is the following, so that is what I was addressing:
"Is There Any Reason I Should NOT enable READ_COMMITTED_SNAPSHOT?"
But as I pointed out, there are specific areas where it would be an advantage:
There would not be a need to use NOLOCK hints.
It would allow transactionally consistent queries to run without being blocked while load operations are running.
If the database does have EVA tables, RCS would be the first solution I would try, since EVA is notorious for massive blocking (among a thousand other serious disadvantages).
I consider a non-solution to be a problem itself.
It requires work to implement (opportunity cost at the very least). There's an expectation that it solves something and expectation-bias will cause judgement of it to be rosier than it should be, thus delaying actual solutions being implemented until that wears off. When leadership is lead to believe it will solve something, and then it doesn't, this causes later, actual solutions, to be harder to "sell". All those are liabilities to placebo solutions. There are others, but those by themselves are enough to make me warn away from such things.
If they're already using NoLock, then "solving" concurency on read access won't be likely to have any measurable effect on performance. It will potentially help with dirty reads, and with errors caused by page splits/moves. But snapshots can cause their own problems with a version of dirty reads, whereby data that is in the process of being deleted or update is still readable by other connections for the duration of the transaction. Since EAV tables often require very long transactions, all this does is move from one type of dirty read to another type. Again, a non-solution, or at best a placebo, with the costs I listed above.
Time spent on placebos is time that is permanently lost.
Even uglier can be the conversation with management that goes, "You told us this first thing would help, and we backed you up on getting it done over objections from our other developers and even your own manager. Now you say the whole project accomplished nothing useful and you want to try something else. Please explain why we should back you up, again, and pay you to do a project twice." The OP has already indicated that there's push-back from devs and at least one manager on this whole project, if I'm reading his posts correctly. Considering this consequence, I'd want to be very, very sure the project would be a stellar success and not a fizzle that needs to be redone in a whole different manner. Do it right, and management will have your back forever. Do it wrong, and you may as well resign right then and there.
Given that, I consider a non-solution a potential career-affecting move, with potentially serious negative consequences.
Are you sure you would still advocate in favor of it, considering beyond just the technical ramifications?
Addressing the one technical statement you just made, I disagree that a snapshot (row version isolation) allows "dirty reads". They present a transactionally consistent view of the committed data at the point in time in time that you start the query, as opposed to being blocked until the insert/update/delete operation completes to give a transactionally consistent view of the committed data at that point in time.
Setting snapshot isolation and read_committed_snapshot on is a very low cost project taking a couple of minutes to complete, so I don’t see much potential for management disappointment. I never said it will solve every problem in that database, so I don’t see the relevance of the rest of your post, and don’t really want to get involved in discussion in some imaginary internal political struggle that neither of us really knows anything about.
Really?
You really think that modify the whole way that transactionality, locking/blocking/deadlocking is handled in the whole database, in a production environment that's already in use, is a few minute project with no real cost to it?
Just make sure you stay away from any actual DBA jobs if you really think that major a change, on the fly, without the necessary homework to make sure it won't have negative unintended consequences or side-effects, is a good plan.
Yeah, setting the actual command is a few seconds of work. Set the database to single-user (2 seconds of typing, if that), issue the alter database command (another 2 seconds), and return to multi-user. Execute that script, wait for any rollbacks to complete (if you chose that option) or any transactions to commit, and you're probably looking at about half a minute's work, tops. But that's like driving with your eyes shut, when it comes to the actual job of a DBA.
Plus, it ignores all the refactoring needed to take those NoLocks out of all existing code, all the testing (regression and load testing at least) to make sure all that refactored code still does what it's supposed to in the changed environment, and the necessary roll-forward through QA and test environments into production, for all of that.
So, honestly, and somewhat brutally perhaps, I have to say, if you really think that way, you need a different career.
I don't understand the venom of in your last post, or the reason for such a personal level of attack ("...stay away from any actual DBA jobs...","...you need a different career..."). I am fairly certain that I never said anything as insulting as that to you.
Do you really think that kind of language is contiributing to what should be a technical discussion? If you can't defend your opinions without resorting to personal attacks, I don't see the point of continuing the discussion with you.
December 29, 2011 at 8:36 am
This is the point when I need to step in and ask that we all remain positive and pleasant. 🙂
All of you are helping me by offering your technical opinions based on your experience and professional knowledge. I would hate to discourage anyone from doing that. 🙂
Both Gus and Michael have given me valuable information. It sems to me, based on the discussion here, that RCSI is unlikely to help my situation much. It would probably be good for me to eliminate WITH(NOLOCK) wherever I can, and continue to monitor performance.
RCSI might be a good solution if I see locking problems after that point. Right now, however, it is difficult to say for certain.
I want to thank everyone who has taken part in this discussion. When I first asked the question, I did not realize how deep the topic could really get. You have all given me a lot to think about. 🙂
Live Long, and Prosper. :alien:
December 29, 2011 at 8:47 am
Michael Valentine Jones (12/29/2011)
GSquared (12/29/2011)
Michael Valentine Jones (12/28/2011)
GSquared (12/28/2011)
Michael Valentine Jones (12/28/2011)
GSquared (12/28/2011)
Michael Valentine Jones (12/28/2011)
I don't see how RCS would be a problem in a data warehouse\datamart application.If there are no inserts/updates/deletes happening, then there are no row versions being generated in the version store in tempdb, so there would be no additional load there. Select queries would not create locks, except for a schema stability lock.
If the data warehouse is being updated while queries are being run, then single select statements would be able to run with consistent results without being blocked. For multiple statement queries, then SNAPSHOT isolation could be used to maintain consistency without being blocked. In either case, the consistency of the results would be far better than using NOLOCK, which is the current situation.
It's not that it would be a problem, it's that it's not likely to be a solution.
Well, the title of the thread is the following, so that is what I was addressing:
"Is There Any Reason I Should NOT enable READ_COMMITTED_SNAPSHOT?"
But as I pointed out, there are specific areas where it would be an advantage:
There would not be a need to use NOLOCK hints.
It would allow transactionally consistent queries to run without being blocked while load operations are running.
If the database does have EVA tables, RCS would be the first solution I would try, since EVA is notorious for massive blocking (among a thousand other serious disadvantages).
I consider a non-solution to be a problem itself.
It requires work to implement (opportunity cost at the very least). There's an expectation that it solves something and expectation-bias will cause judgement of it to be rosier than it should be, thus delaying actual solutions being implemented until that wears off. When leadership is lead to believe it will solve something, and then it doesn't, this causes later, actual solutions, to be harder to "sell". All those are liabilities to placebo solutions. There are others, but those by themselves are enough to make me warn away from such things.
If they're already using NoLock, then "solving" concurency on read access won't be likely to have any measurable effect on performance. It will potentially help with dirty reads, and with errors caused by page splits/moves. But snapshots can cause their own problems with a version of dirty reads, whereby data that is in the process of being deleted or update is still readable by other connections for the duration of the transaction. Since EAV tables often require very long transactions, all this does is move from one type of dirty read to another type. Again, a non-solution, or at best a placebo, with the costs I listed above.
Time spent on placebos is time that is permanently lost.
Even uglier can be the conversation with management that goes, "You told us this first thing would help, and we backed you up on getting it done over objections from our other developers and even your own manager. Now you say the whole project accomplished nothing useful and you want to try something else. Please explain why we should back you up, again, and pay you to do a project twice." The OP has already indicated that there's push-back from devs and at least one manager on this whole project, if I'm reading his posts correctly. Considering this consequence, I'd want to be very, very sure the project would be a stellar success and not a fizzle that needs to be redone in a whole different manner. Do it right, and management will have your back forever. Do it wrong, and you may as well resign right then and there.
Given that, I consider a non-solution a potential career-affecting move, with potentially serious negative consequences.
Are you sure you would still advocate in favor of it, considering beyond just the technical ramifications?
Addressing the one technical statement you just made, I disagree that a snapshot (row version isolation) allows "dirty reads". They present a transactionally consistent view of the committed data at the point in time in time that you start the query, as opposed to being blocked until the insert/update/delete operation completes to give a transactionally consistent view of the committed data at that point in time.
Setting snapshot isolation and read_committed_snapshot on is a very low cost project taking a couple of minutes to complete, so I don’t see much potential for management disappointment. I never said it will solve every problem in that database, so I don’t see the relevance of the rest of your post, and don’t really want to get involved in discussion in some imaginary internal political struggle that neither of us really knows anything about.
Really?
You really think that modify the whole way that transactionality, locking/blocking/deadlocking is handled in the whole database, in a production environment that's already in use, is a few minute project with no real cost to it?
Just make sure you stay away from any actual DBA jobs if you really think that major a change, on the fly, without the necessary homework to make sure it won't have negative unintended consequences or side-effects, is a good plan.
Yeah, setting the actual command is a few seconds of work. Set the database to single-user (2 seconds of typing, if that), issue the alter database command (another 2 seconds), and return to multi-user. Execute that script, wait for any rollbacks to complete (if you chose that option) or any transactions to commit, and you're probably looking at about half a minute's work, tops. But that's like driving with your eyes shut, when it comes to the actual job of a DBA.
Plus, it ignores all the refactoring needed to take those NoLocks out of all existing code, all the testing (regression and load testing at least) to make sure all that refactored code still does what it's supposed to in the changed environment, and the necessary roll-forward through QA and test environments into production, for all of that.
So, honestly, and somewhat brutally perhaps, I have to say, if you really think that way, you need a different career.
I don't understand the venom of in your last post, or the reason for such a personal level of attack ("...stay away from any actual DBA jobs...","...you need a different career..."). I am fairly certain that I never said anything as insulting as that to you.
Do you really think that kind of language is contiributing to what should be a technical discussion? If you can't defend your opinions without resorting to personal attacks, I don't see the point of continuing the discussion with you.
I am in no way insulting you. I'm saying, if you really think major changes to a production database don't require any significant level of work, just because the change itself can be applied in 30 seconds of scripting, without doing any of the homework and testing necessary to make sure you won't break anything, then being a DBA is not the job for someone with that attitude.
I really would compare that to the DBA equivalent to driving with your eyes closed. Seriously. It's that bad a habit, professionally.
I expect you really don't think that way, and the advice to change careers thus isn't applicable.
I'm pointing out what I consider a SERIOUS flaw in your statement that changing to RCS for a production database is a nothing project. I suspect you looked at how easy it is to apply the command, without taking into account that there's a LOT more to such a project than just applying the Alter Database script.
The OP asked if there are reasons to not apply RCS. The amount of work that needs to be done to make sure applying it won't break anything is substantial, and the gains from applying it, in the described scenario, are probably zero or nearly zero. Thus, it's a lot of work for no or very little gain. That's a reason to not do it.
When there's no good reason to do something, and there's a lot of work in doing it, that is, by itself, a reason to not do it.
Add to that the risks of a major change in a production system. Those can't always be accounted for, and some can't easily be repaired if they are found out much later. So, we're talking potentially high risk, for no expected gain. Again, a solid reason to not do the thing.
Your answer discounted both of those factors. I pointed out that discounting them isn't wise, and that consciously choosing to discount them isn't appropriate for a DBA.
That's not an insult. That's advice. You can take it personally and be offended and ignore me. That's your choice. Can't deny you that option. I wouldn't consider it wise to treat my statement that way. It's advice to take the single most important factor of a DBA's job a little more seriously, or don't take on that job. Anyone, you or whomever, who doesn't consider the ramifications of data protection, work vs reward, risk vs reward, in every project that's worked on, isn't suited for this kind of work. You wouldn't want a surgeon to work on you who discounted the most important aspect of his job (keep the patient alive). Why would you accept a DBA who disregards the most important aspect of his (protect the data)?
I strongly suspect you wouldn't accept that, and that the "step away from the databases with your hands in plain view" statement thus doesn't apply to you.
But you're the one saying it does. Not me. You.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Viewing 15 posts - 16 through 30 (of 70 total)
You must be logged in to reply to this topic. Login to reply