April 3, 2006 at 4:51 am
I work for a company that certifies individuals in the use of Software such as Databases, Microsoft office, etc.
The programmering team of the company, is developing an in-house new OLTP system (on-line automated examination system). However I have reasons to doubt for the competency of the programming Team that is building the OLTP system. So far, i have encounter a mediocre database design of the OLTP system, and several data issues.
I am responsible for building the MIS for this system (which MIS in future will integrate data from other organiazational systems such as ERP, CRM, etc).
I want to prevent any of their problems (that they may encounter) to emerge in my MIS system as well. In other words i want my MIS system to be as independent of their mistakes as possible. So now that i am starting to build their MIS system, and we concluded that because of the problems i encounter with inaccurate data and numerous other problems, i came up with two options.
1) ask them to export to me the necessary info in the form of views and then load them directly to my dimention and fact tables, or,
2) ask them to export the data in ascii files, and then load them in the data staging area.
Correct me if i am wrong but in the case of the first option, i ommit the datastaging area, plus I will have no control if they load inaccurate data.. Is this correct? would i possibly encounter problems in the later integration of the ERP andd CRM data to the MIS?
I believe that the second case is more cautious.
What is your suggestion? please justify your opinion, as your justificacion, will lead my way, perhaps even more than your final answer.
Thank you in advance,
April 3, 2006 at 5:16 am
If the data that is coming out of the examination system contains rubbish as a result of bad design, that should not be your problem. You will end up building all sorts of extra validation, filtering and error-checking routines to try and mop up the problems, but you won't get them all.
The net result - your project takes longer than it should. Data quality may be OK, but it won't be perfect. Things will run slower and less efficiently than they should. The designers of the exam system will get away with poor database design, to your cost.
So I would recommend making a big fuss now and getting the exam system fixed up. But I guess that recommendation is out of scope of your post
If you're not in a position to do that, I agree that you should have a 'staging' area as you suggest in your option 2. It affords you the opportunity to do some data-fixing work before it gets into your MIS.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
April 3, 2006 at 5:36 am
Phil,
thank you very much for your time...
My IT manager has actually made the fuss you are talking about. He has also communicated the problems to the CIO of the company, and many problems regarding the database design and the data consistence of the OLTP system have come to light, and the programmers have commited in fixing them. The problem is that as you probably guessed we have to build the MIS without waiting for the programming team to fix their database. (yes I believe they will escape eventually, as you mention). So that is why i have concluded to the 2 options I mentioned before. My only thought is that if i get ascii files in a datastaging area i will "double" the total loading time of cubes, on the other hand i am afraid that if i dont, i may encounter prblems in the future.
I will follow your advice and since (theoretically at least) things will get fixed up, i will continue with the "views" option as you mention.
Kind Regards,
Dionisis
April 3, 2006 at 6:04 am
I guess the good thing about using views is that they are always up-to-date. You can just grab the data out of their database whenever you want (subject to load on the server) - and you can still perform validation on the way to your MIS. Forgive any lack of knowledge here - I'm not a BI dude.
Eg
VIEW -> staging area (DTS / stored procs) -> validation & clean-up (use DTS / stored procedures / .NET / whatever) -> MIS
What I'm saying is that I don't think that you lose anything by using a view, as long as data do not go directly from the view to your MIS.
--edit-- one thing you do lose, perhaps, is a level of auditability of the data. By keeping a history of CSV files, you may be better equipped to answer questions like "When did that data item get in there?". Not that you can't keep a history of data brought into your staging area - but that is data that you have extracted, not data that has been provided to you - a subtle difference.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
April 3, 2006 at 8:06 am
Dionisis,
Good Luck!!!!! I've been involved with way too many of these situations. Best advice I can give you is to make your own database schema and make an interface to bridge the gap. Do not let them define the interface. And of course, always log changes.. you can never have enough cya.
Kevin
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply