April 7, 2010 at 12:45 pm
6. Import the incorrect date as a clearly invalid date (1/1/1200 or 1/1/9999 or whatever) and add the entered date to a note column in the record.
This is basically the same as your #3 without the NULL, and better I think than your 4 or 5, because you are not presenting a false assurance of correctness.
No matter which way you go there should be a note attached to the record that there was an error importing the value (bad value).
--
JimFive
[Changed for Spelling]
April 7, 2010 at 12:56 pm
RalphWilson (4/7/2010)
Whil;e I agree that data should not be changed without careful consideration, I also have to deal with the very problem discussed in the editorial and yet I am required to insert the "date" into a Datetime datatype column in SQL Server. Thus the connundrum regarding, if I can't insert the incorrect date, what do I insert?We get data fed from an AS400 that is programmed in COBOL with a DB/2 database. Much of the data is originally fed from outside clients, as well. Since COBOL tends to treat data as either Alphanumeric (thus allowing non-numeric characters) or numeric (restricting to 0-9), there is no protection against invalid dates as long as they have numbers and no letters. However, since the "dates" are actually used as dates in our SQL Server world, we need to receive valid dates . . . but it is often nearly (if not totally) impossible to get our Big Brothers to filter and validate the dates or even to get a "business decision" as to what Feb 29, 2007 or Nov 31 of any year means.
So, our choices boil down to:
1) Reject the record entirely (which is not acceptable to the business);
2) Change all of our date columns to varchars (which I don't find acceptable);
3) Turn the invalid dates to NULL (which gives the business heartburn);
4) Adjust the day part of the date so that it reflects the last day of the indicated month;
5) Adjust month part of the date so that it reflects the next month and the day part of the date so that it represents the excess over the last day of the previous month.
None of those answers are "good"; however, without a decision from the business sector, someone has to do something.
So, what do y'all recommend? 😀 :crying:
Easy. If the business sector of the company can't make a decision, you can't proceed. They have to tell you how to handle invalid data.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
April 7, 2010 at 1:51 pm
James Goodwin (4/7/2010)
" I think mistakes should be made painful for users so that they are more careful next time. "
In this respect it is not necessarily pain and/or punishment we want to inflict on them, but rather guidance and education, right?
Sortof, it's not exactly about pain, but it isn't about guidance either. If I, as the DBA, do the work to research the data and figure out what it should be then two things happen:
1. I get behind on my real job.
2. The cause of the problem never gets addressed.
However, if I hand it off to the supervisor responsible for it then it takes up that person's time and they have an incentive (and the authority) to address the cause.
--
JimFive
You have put what I meant much better. I want the users to aware of the issues with the application so that they will be more careful in the future.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
April 7, 2010 at 1:58 pm
WayneS (4/7/2010)
RalphWilson (4/7/2010)
Whil;e I agree that data should not be changed without careful consideration, I also have to deal with the very problem discussed in the editorial and yet I am required to insert the "date" into a Datetime datatype column in SQL Server. Thus the connundrum regarding, if I can't insert the incorrect date, what do I insert?We get data fed from an AS400 that is programmed in COBOL with a DB/2 database. Much of the data is originally fed from outside clients, as well. Since COBOL tends to treat data as either Alphanumeric (thus allowing non-numeric characters) or numeric (restricting to 0-9), there is no protection against invalid dates as long as they have numbers and no letters. However, since the "dates" are actually used as dates in our SQL Server world, we need to receive valid dates . . . but it is often nearly (if not totally) impossible to get our Big Brothers to filter and validate the dates or even to get a "business decision" as to what Feb 29, 2007 or Nov 31 of any year means.
So, our choices boil down to:
1) Reject the record entirely (which is not acceptable to the business);
2) Change all of our date columns to varchars (which I don't find acceptable);
3) Turn the invalid dates to NULL (which gives the business heartburn);
4) Adjust the day part of the date so that it reflects the last day of the indicated month;
5) Adjust month part of the date so that it reflects the next month and the day part of the date so that it represents the excess over the last day of the previous month.
None of those answers are "good"; however, without a decision from the business sector, someone has to do something.
So, what do y'all recommend? 😀 :crying:
Easy. If the business sector of the company can't make a decision, you can't proceed. They have to tell you how to handle invalid data.
I agree with Wayne on this one. Until the business tells you what to do you are stuck. Present them with the options, and the top one should be fix the source data, and then implement.
I feel your pain because I've been there with DB2 on the 400 and invalid dates.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
April 7, 2010 at 2:47 pm
James Goodwin (4/7/2010)
" I think mistakes should be made painful for users so that they are more careful next time. "
In this respect it is not necessarily pain and/or punishment we want to inflict on them, but rather guidance and education, right?
Sortof, it's not exactly about pain, but it isn't about guidance either. If I, as the DBA, do the work to research the data and figure out what it should be then two things happen:
1. I get behind on my real job.
2. The cause of the problem never gets addressed.
However, if I hand it off to the supervisor responsible for it then it takes up that person's time and they have an incentive (and the authority) to address the cause.
--
JimFive
I guess the guidance part is telling them, from the technical standpoint, what the options are for fixing a problem, not doing the research for them. The education part is exactly what you say, for them to learn the consequences of their sloppy work.
😎 Kate The Great :w00t:
If you don't have time to do it right the first time, where will you find time to do it again?
April 7, 2010 at 3:51 pm
Jack Corbett (4/7/2010)
I agree with Wayne on this one.
But not on others? Jack, you should always agree with me. 😀 😉
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
April 7, 2010 at 6:10 pm
CirquedeSQLeil (4/7/2010)
ben.mcintyre (4/7/2010)
While I agree with the sentiment of this article, I think the most important thing is: if you DO change any data, KEEP AN AUDIT TRAIL ! You MUST be able to undo ALL your changes later.In addition to the audit trail you could go another step. Before making any changes, make sure you have a backup of the data.
Totally agree, I actually took that as a given (whoops).
My point was that a backup, although it contains the original data, doesn't show you what you did or necessarily how to get back to where you were, particularly when you find a problem with the 'fix' two months down the track ... or the person who did the fix has left ... now what was that fix again ... ?
A detailed audit trail does provide exactly that information, and can save a huge amount of time recreating the often complex original scenario.
April 8, 2010 at 3:49 am
hi ben,
its true to keep a track of all the changes done..
And steve has done a good job in raising this type of basic info.
thanks steve.
- Swaroop R -
April 8, 2010 at 8:59 am
nelsonj-902869 (4/7/2010)
Take the time (and bite the bullet) to write a preprocessing module to edit the batch coming into your database and edit the darn field. Any record not passing normal, standard date edits, gets written to a file that you transmit BACK to the AS400 system. Print out an edit report and send the edit report to your boss and the admin in charge of the AS400. Start screaming that something needs to be done.........otherwise, YOU are going to end up monitoring this issue for-e-v-e-r.:cool:
The only flaw in that plan is that the AS400 is the official reference and we who are on the SQL Server ("PC") end of the equation are only allowed to receive data . . . not send any back. Besides which, the data isn't "owned" by even the AS400. It is "owned" by the outside client and, even if we get our Big Borthers to change their data, it'll just get over written the next time the client's data is received. :-/
Ralph D. Wilson II
Development DBA
"Give me 6 hours to chop down a tree and I will spend the first 4 sharpening the ax."
A. Lincoln
April 8, 2010 at 10:30 am
RalphWilson (4/8/2010)
nelsonj-902869 (4/7/2010)
Take the time (and bite the bullet) to write a preprocessing module to edit the batch coming into your database and edit the darn field. Any record not passing normal, standard date edits, gets written to a file that you transmit BACK to the AS400 system. Print out an edit report and send the edit report to your boss and the admin in charge of the AS400. Start screaming that something needs to be done.........otherwise, YOU are going to end up monitoring this issue for-e-v-e-r.:cool:
The only flaw in that plan is that the AS400 is the official reference and we who are on the SQL Server ("PC") end of the equation are only allowed to receive data . . . not send any back. Besides which, the data isn't "owned" by even the AS400. It is "owned" by the outside client and, even if we get our Big Borthers to change their data, it'll just get over written the next time the client's data is received. :-/
I feel your pain. The problem is you're between a rock and a hard place. If you massage the data, then you risk giving the client bad info; if you don't, you risk giving them incomplete info. It still comes down to either talking to the external customers to figure out what to do with the bad data, or publish your data scrubbing rules, and make the customers acknowledge what happens when the data gets scrubbed, or c. come up with a way to prevent the data from being accepted unle3ss the criteria is met. I'm sure there are yet other alternatives.
It still comes down to communicating back very clearly what's been done. This becomes more important precisely because it's not your data, or your company's data, and frankly - the more "vital" the data is, the more important it is to be clear as to what "happened" to it in order to arrive to whatever the final result happens to be in your case.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
April 8, 2010 at 10:48 am
Matt Miller (#4) (4/8/2010)
RalphWilson (4/8/2010)
nelsonj-902869 (4/7/2010)
Take the time (and bite the bullet) to write a preprocessing module to edit the batch coming into your database and edit the darn field. Any record not passing normal, standard date edits, gets written to a file that you transmit BACK to the AS400 system. Print out an edit report and send the edit report to your boss and the admin in charge of the AS400. Start screaming that something needs to be done.........otherwise, YOU are going to end up monitoring this issue for-e-v-e-r.:cool:
The only flaw in that plan is that the AS400 is the official reference and we who are on the SQL Server ("PC") end of the equation are only allowed to receive data . . . not send any back. Besides which, the data isn't "owned" by even the AS400. It is "owned" by the outside client and, even if we get our Big Bothers to change their data, it'll just get over written the next time the client's data is received. :-/
I feel your pain. The problem is you're between a rock and a hard place. If you massage the data, then you risk giving the client bad info; if you don't, you risk giving them incomplete info. It still comes down to either talking to the external customers to figure out what to do with the bad data, or publish your data scrubbing rules, and make the customers acknowledge what happens when the data gets scrubbed, or c. come up with a way to prevent the data from being accepted unless the criteria is met. I'm sure there are yet other alternatives.
It still comes down to communicating back very clearly what's been done. This becomes more important precisely because it's not your data, or your company's data, and frankly - the more "vital" the data is, the more important it is to be clear as to what "happened" to it in order to arrive to whatever the final result happens to be in your case.
Although I didn't' make the decision, I concur with the decision that someone else (who is no longer with the company 😉 made with regard to this data. A UDF was created that accepts the dates as strings and returns the date as a datetime. If the date-string is invalid in certain ways (e.g. embedded alphas or otherwise trashed entirely), a NULL is returned unless a flag is also passed that indicates that a NULL is unacceptable in which case a 0 is returned. If the date is invalid due to exceeding the number of days in the indicated month for the indicated year, then the returned date is the last day of the indicated month unless a flag is also passed indicating that it should be off-set into the next month.
Those rules have been provided to The Powers That Be for comment and advise (of which we have had none 😉 and the defaults are the assumption that no flag was passed. 😉
So far we have only had a couple of questions raised and our explanation has been apparently found to be acceptable.
Ralph D. Wilson II
Development DBA
"Give me 6 hours to chop down a tree and I will spend the first 4 sharpening the ax."
A. Lincoln
April 9, 2010 at 11:52 am
Jack Corbett (4/7/2010)
...One of my coworkers thinks I should make the changes myself instead of "adding work" for the users. I think mistakes should be made painful for users so that they are more careful next time. Yes, the application should verify things better, but it doesn't and I have no control over that.
And if you make the changes yourself, to protect the users from themselves, they'll blame you if anything goes wrong, even if it's not related to the change you made. Remember, the road to hell is paved with good intentions.
🙂
Gaby________________________________________________________________"In theory, theory and practice are the same. In practice, they are not." - Albert Einstein
April 9, 2010 at 11:55 am
Ken Klaft-381933 (4/7/2010)
...You can fix an application but you can't fix stupid. All you can do is prevent it.
Great quote, let them hurt when they make a mistake. That being said, to prevent any headaches for you in the future, hopefully if a DBA is involved in any way in the development (and they really should at some point), is to kindly let them know if anything is potentially wrong, and hopefully there's a developer who will realize what the possible performance ramifications are.
Gaby________________________________________________________________"In theory, theory and practice are the same. In practice, they are not." - Albert Einstein
April 9, 2010 at 11:57 am
Greg Edwards-268690 (4/7/2010)
When you change the data type, you can lose quite a bit. Dates have special functions and edits, and this is probably even more important when integrating data from multiple data sources....
Greg E
I have some colleagues who work with Oracle/SAP. They did a small, incremental, upgrade that SAP assured them would not change any table definitions or data. What happened, a column was modified with a default NULL, it was not caught in QA, and it resulted in the warehouse not shipping out inventory to our stores.
Gaby________________________________________________________________"In theory, theory and practice are the same. In practice, they are not." - Albert Einstein
April 9, 2010 at 2:54 pm
GabyYYZ (4/9/2010)
Greg Edwards-268690 (4/7/2010)
When you change the data type, you can lose quite a bit. Dates have special functions and edits, and this is probably even more important when integrating data from multiple data sources....
Greg E
I have some colleagues who work with Oracle/SAP. They did a small, incremental, upgrade that SAP assured them would not change any table definitions or data. What happened, a column was modified with a default NULL, it was not caught in QA, and it resulted in the warehouse not shipping out inventory to our stores.
Ouch!
NULL kind of means nothing, so it shouldn't have caused a problem. :w00t:
Greg E
Viewing 15 posts - 46 through 60 (of 69 total)
You must be logged in to reply to this topic. Login to reply