June 16, 2009 at 12:43 am
Is there a way to tell that whether or not the extracted data from database is correct?
How to make sure the data extracted from database is accurate and correct?
June 16, 2009 at 1:05 am
Hi,
AFAIK there are no "correct" data within a database. Data can (and should/must) be valid and consistant. But that does not mean they have to be "correct" from users perspective, since there is no such general definition.
If the data you receive from a database are considered "incorrect", multiple reason can apply, e.g.
1) wrong (manual) data input (AKA garbage in = garbage out)
2) poor database design not forcing/checking for data being valid/consistant
3) timing problems for data manipulation (usually together with (2) )
4) corrupted database
5) wrong expectation from end user side ("Data have to show we're making profit! Why doesn't it?")
Just to name some of them...
If you'd describe a little more detailed what problem you're facing the answers might also be alittle more specific.
For details on how to post sample data please see the link in my signature.
June 16, 2009 at 1:09 am
Extracted to what? A CSV file?
Do you mean some sort of verification process? What is your concern - is it that records get missed, or data somehow gets corrupted?
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
June 16, 2009 at 4:16 pm
I'm having trouble following you as well. You'll need to tell us a bit more about what you are trying to do. How is the data being extracted and to where? What makes the extracted data correct, incorrect?
June 16, 2009 at 6:46 pm
It is not a problem or corrupted data. Just want to know whether or not there is a way to tell my extracted data is correct. Any opinions? For example, I write the query based on customer's requirement and follow all the rules to extract the data from database. So, if someone ask me this question, what is the best way to address this type of question?
June 16, 2009 at 8:25 pm
Define what you mean by 'correct'.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
June 16, 2009 at 8:32 pm
correct -customer expected data, eg 1000 records in a report.
Not the content of the data.
June 17, 2009 at 9:32 am
If it is row counts that you are concernted about, you could create an audit/history table of sorts that you can use to insert 1 row for each time your SSIS package executes. Use SSIS to capture the row counts for rows output to your destination file and then include the row counts as a column in your table.
If it is more than row counts that make the data 'correct', you'll need to define what those requirements are.
June 17, 2009 at 1:24 pm
What happens if the resulting number of rows after applying selection criteria will be less than 1000?
Would that make the result incorrect?
Sounds like a very strange requirement/criteria to me...
To "prove" that the SSIS package returns all rows that match the search criteria you could run the select statement used in your SSIS package directly against the database using SSMS (if that's possible) and compare the results. Is that what you're looking for?
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply