June 6, 2007 at 5:10 am
Hi,
What is meant by data cleansing? What are some approaches to handling it?
Can you tell me the step-by-step process for data cleansing, or give me some tips about it?
Regards
Karthik
June 6, 2007 at 5:45 am
There is no universal answer to this, but basically data cleansing means getting the individual pieces of data to look the way you want, consistently. That said, it typically doesn't mean that the data is correct, just that it looks the same as the other data in the same column.
Cleansing of data could mean anything from formatting phone numbers consistently to splitting a single address field into its multiple components, with just about everything else imaginable in between. It can even include dictionary lookups, custom table lookups, etc. At some places, it's perfectly acceptable to hire people to do this manually, while other situations will need an automated solution that might run daily, or even more often.
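To make the phone-number case concrete, here is a minimal sketch in Python. It assumes 10-digit US-style numbers and a target format of NNN-NNN-NNNN; the function name normalize_phone and that target format are my own choices for illustration, not anything standard. Anything it can't parse comes back as None so it can be reviewed by hand instead of guessed at.

```python
import re

def normalize_phone(raw):
    """Format a US-style phone number as 'NNN-NNN-NNNN'.

    Hypothetical helper for illustration. Returns None when the input
    does not contain exactly 10 digits (after dropping a leading 1).
    """
    # Keep only the digits; treat None like an empty string.
    digits = re.sub(r"\D", "", raw or "")
    # Assumption: an 11-digit number starting with 1 carries a country code.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        return None  # leave for manual review rather than guess
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"

if __name__ == "__main__":
    for value in ["(555) 123-4567", "1 555.123.4567", "12345"]:
        print(repr(value), "->", normalize_phone(value))
```

Note that this only makes the values consistent. It says nothing about whether the number actually belongs to the person on the row.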
One way to approach the process is to find out what the business wants cleaned. Other times, you'll be able to see the problem with the data yourself, and might proactively take care of it, assuming that is acceptable in your environment.
In other words, you seem to be looking for a solution to a problem that may or may not exist, and if it does, may or may not be well-defined. Determine if there are issues, and then determine what they are, and you'll be able to tackle them at that point.
June 6, 2007 at 12:55 pm
Presumably you are doing this in the context of a data warehouse. If so, you should be aware that this is almost always the most complicated and time-intensive part of the process. As already noted, this covers everything from eliminating duplicate information to standardizing the meaning of data to correcting incorrectly entered information. If it is not done properly, you risk providing the decision makers with false information. Ensure you allocate time to do this properly.
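As a rough illustration of eliminating duplicate information, the Python sketch below treats two rows as the same record when their name and email match after trimming whitespace and ignoring case. The row layout (dicts with "name" and "email" keys) and the matching rule are assumptions made up for the example; a real warehouse load would need rules agreed with the business.

```python
def normalize_key(name, email):
    """Build a comparison key: collapse whitespace and ignore case."""
    clean_name = " ".join(name.split()).casefold()
    clean_email = email.strip().casefold()
    return (clean_name, clean_email)

def deduplicate(rows):
    """Return rows with duplicates removed, keeping the first occurrence.

    Assumption: each row is a dict with 'name' and 'email' keys.
    """
    seen = set()
    unique = []
    for row in rows:
        key = normalize_key(row["name"], row["email"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

if __name__ == "__main__":
    sample = [
        {"name": "Ann  Lee", "email": "Ann.Lee@Example.com"},
        {"name": "ann lee", "email": " ann.lee@example.com "},
        {"name": "Bob Ray", "email": "bob@example.com"},
    ]
    for row in deduplicate(sample):
        print(row)
```

Keeping the first occurrence is an arbitrary choice here. Which copy survives, or how copies get merged, is exactly the kind of decision that takes the time mentioned above.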
June 6, 2007 at 4:56 pm
I'm thinking it's a test question...
--Jeff Moden
Change is inevitable... Change for the better is not.
June 7, 2007 at 5:27 am
I usually use a toothbrush and shoe polish. If you work your way from heavy polish to the finer grades, you will have very shiny data when you are done.
June 7, 2007 at 8:05 am
I prefer Comet - it is chlorinated.
Regards
Rudy Komacsar
Senior Database Administrator
"Ave Caesar! - Morituri te salutamus."
June 7, 2007 at 8:06 am
A relevant Google result:
http://en.wikipedia.org/wiki/Data_cleansing
Regards
Rudy Komacsar
Senior Database Administrator
"Ave Caesar! - Morituri te salutamus."