January 4, 2012 at 1:41 pm
it is tempting to assume that data integrity should always be enforced by the database
Pretty sure that's not the argument. I wouldn't use the word "always". Merely "where possible". If column_A in table_B should only sensibly take values between 1 and 10, but within that should only be a 1 when certain combinations of data in other tables exist, should you throw out the constraint on column_A altogether because it's not sophisticated enough?
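Even the cheap part of that rule is worth keeping; a bare-bones sketch, with made-up schema and constraint names (the cross-table condition would still have to be enforced elsewhere):

ALTER TABLE dbo.table_B
ADD CONSTRAINT CK_tableB_columnA_range
    CHECK (column_A BETWEEN 1 AND 10); -- catches out-of-range garbage at the door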
[Changing tack]
Of course, if I'm a really clever developer, I'll employ super-flexible "multi-purpose" tables where a lot of columns are set to varchar because they need to accommodate a variety of data types. Then I have a really strong argument to throw out constraints... don't I? I mean... it's the constraints that are the problem, not my design... surely.
"One of the symptoms of an approaching nervous breakdown is the belief that one's work is terribly important." - Bertrand Russell
January 4, 2012 at 2:02 pm
sqlvogel (1/4/2012)
GSquared (1/4/2012)
For each pseudo-ID, there was only ever supposed to be one row marked active, and all the rest were to have a 0 in that column.

How would you enforce that in SQL Server? I'm pretty sure it can't be done reliably with CHECK constraints. Indexed views and filtered unique indexes won't stop you updating all rows to be Active=0. Would you use triggers?
All rows set to 0 = row deleted, in this kind of schema. Nothing wrong with that. Sorry for the ambiguity in the statement. Should have been:
For each pseudo-ID, there was only ever supposed to be one row marked active, and all the rest were to have a 0 in that column, or all rows marked 0 for "deleted" data.
That's kind of mentioned later in my post, but not clearly enough.
Filtered unique index was the solution I recommended to them.
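Something along these lines; the table and column names here are made up, not theirs:

-- At most one Active = 1 row per PseudoID; Active = 0 rows are unconstrained.
CREATE UNIQUE NONCLUSTERED INDEX UQ_MyTable_OneActivePerPseudoID
ON dbo.MyTable (PseudoID)
WHERE Active = 1;

Any number of rows per PseudoID can sit at Active = 0, but the second attempt to mark one active fails immediately.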
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
January 4, 2012 at 11:40 pm
Thanks, a blindingly obvious solution once you know about it. I didn't know you could filter an index; the last time I really looked at indexes, you couldn't, and I haven't encountered one with a WHERE clause since then.
Since their cleanup process was obviously flawed, did they implement the suggestion, and if so, did they run into problems with the process? And since the data isn't unique to begin with, how did you handle creating the index? Create a new table with the index and copy the data into it, then swap the tables? Or determine the unique rows, set everything else to 0, and try again whenever the create fails?
Looked up something in the documentation: "The Database Engine does not allow creating a unique index on columns that already include duplicate values, whether or not IGNORE_DUP_KEY is set to ON. If this is tried, the Database Engine displays an error message. Duplicate values must be removed before a unique index can be created on the column or columns."
A filtered index can't be clustered, can it?
January 5, 2012 at 1:34 am
@GSquared That's a really good example of where constraints would have saved a catastrophe. The basic design wasn't at fault, I reckon, but the implementation sure was! Thinking of your description...
they were totally certain it was working perfectly until I did a simple Select ... Where Active = 1 Group By pseudo-ID Having Count(*) > 1,
.. I put into every database I write an 'impossible' stored procedure that runs on a schedule (daily) and at startup, double-checking all these rules across the database and emailing me if anything fails a check. As well as making sure I sleep at night, the code is a very handy reminder of some of the more arcane domain rules in the database. Has it ever emailed me? Er... yes. The impossible always happens in IT.
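A minimal sketch of the idea, with made-up names; a real one checks many rules and would use Database Mail (msdb.dbo.sp_send_dbmail) rather than just raising an error:

CREATE PROCEDURE dbo.ImpossibleConditionCheck
AS
BEGIN
    SET NOCOUNT ON;
    -- One rule among many: no pseudo-ID should ever have two active rows.
    IF EXISTS (SELECT PseudoID
               FROM dbo.MyTable
               WHERE Active = 1
               GROUP BY PseudoID
               HAVING COUNT(*) > 1)
        RAISERROR ('Impossible condition: duplicate active rows per pseudo-ID.', 16, 1);
        -- A real version would call msdb.dbo.sp_send_dbmail here instead.
END;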
@ken There was a great article by Alex Kuznetsov here, Yet another SQL Strategy for Versioned Data, which explains this technique using filtered indexes; it can be applied to all sorts of temporal databases.
Best wishes,
Phil Factor
January 5, 2012 at 6:26 am
Ken Lee-263418 (1/4/2012)
Thanks, a blindingly obvious solution once you know about it. I didn't know you could filter an index; the last time I really looked at indexes, you couldn't, and I haven't encountered one with a WHERE clause since then.

Since their cleanup process was obviously flawed, did they implement the suggestion, and if so, did they run into problems with the process? And since the data isn't unique to begin with, how did you handle creating the index? Create a new table with the index and copy the data into it, then swap the tables? Or determine the unique rows, set everything else to 0, and try again whenever the create fails?

Looked up something in the documentation: "The Database Engine does not allow creating a unique index on columns that already include duplicate values, whether or not IGNORE_DUP_KEY is set to ON. If this is tried, the Database Engine displays an error message. Duplicate values must be removed before a unique index can be created on the column or columns."

A filtered index can't be clustered, can it?
Cleanup of the existing data would have been easy enough. For each ID with multiple actives, keep the one with the latest data (there were easy ways to determine that based on audit trails) and set the rest to 0. Then add the filtered unique index to prevent future issues.
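Roughly like this, with made-up names and assuming the audit trail surfaces as a LastModified column:

WITH Ranked AS
(
    SELECT Active,
           ROW_NUMBER() OVER (PARTITION BY PseudoID
                              ORDER BY LastModified DESC) AS RowNum
    FROM dbo.MyTable
    WHERE Active = 1
)
UPDATE Ranked
SET Active = 0
WHERE RowNum > 1; -- keep only the most recent active row per pseudo-ID

After that, the filtered unique index creates cleanly and blocks any new duplicates.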
That would cause whatever flawed code was creating the duplicates in the first place to start throwing errors. But that's a good thing in the long term, because they hadn't the faintest clue what code was causing the violations, so errors in production were the only way they'd ever find it.
I don't know if they implemented my recommendations or not, since I left that place before anything was done on that.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon