October 9, 2007 at 10:26 am
Hello,
Here is the gist of the scenario we are dealing with:
We have an online query that comes to our database and is handled through BizTalk. The answer data needs to come from a SQL database, which resides on the SQL instance local to the BizTalk server. However, that SQL database is a data mart that needs to get its daily data from the 'data warehouse'. The data warehouse in turn has data updates applied to it daily, via a migration/transformation process from a C-ISAM system on Unix. The whole migration process takes about 4 hours, which is about the downtime we have for these online requests.

The consultant who is writing the program to handle the requests is recommending that we create a new database (dynamically, with a datetime stamp in the name) and delete the old one on completion. The references to these databases, and which one is active, will be recorded in a separate database. I would like to know if anyone has done this kind of thing before and has any pros and cons for this sort of process.
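To make the proposal concrete, the nightly switch would look something like this (every name here is made up for illustration, not our actual design):

-- Rough sketch of the proposed process; all names are hypothetical.
declare @NewDb sysname
declare @Sql nvarchar(max)

-- 1. Create today's mart with a datetime stamp in the name.
set @NewDb = 'DataMart_' + convert(char(8), getdate(), 112)
set @Sql = 'create database ' + quotename(@NewDb)
exec (@Sql)

-- 2. ...load the daily subset from the warehouse into the new database...

-- 3. Record it as the 'active' database in the configuration database,
--    then drop yesterday's copy.
update ConfigDb.dbo.ActiveMart set DatabaseName = @NewDb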
Thanks
KR
October 9, 2007 at 11:57 am
Greetings,
----> Based solely on what you wrote <----
Fire that consultant!
I have worked on a few (ok... many) projects "bringing" data from mainframe/Unix/flat files/etc. to a SQL Server/DB2/Oracle database used as a data warehouse, and in each of those projects we used a staging database, then moved the data to the data marts.
In some cases, we were updating the data marts. In other cases, we were dropping and recreating them.
Here, the key word is S-T-A-G-I-N-G database. We never had to face a biblical multiplication of the loaves and fishes... oops... of the databases. 😀
One question I have regarding the scenario you outlined: if the database name changes each day, will the front-end application (whatever it is: reports, fat client, even Excel) have to be reconfigured each time to point to the right database name?
Thank you
MBA
MCSE, MCDBA, MCSD, MCITP, IBM DB2 Expert, I-Net+, CIW
Proud member of the NRA
- Anti-gun laws prevent law-abiding citizens from buying guns and defending themselves against bad guys who do not care about the law and get their guns illegally.
- Democracy is 2 wolves and one sheep talking about their next dinner. Freedom is 2 wolves and one armed sheep with a .357 magnum talking about their next dinner.
October 9, 2007 at 12:06 pm
We do have a staging database. It is only after all the transformation and cleanup that the data is moved to the data warehouse. We are creating this new data mart since it will need just a subset of the data for a specific period of time, and we have limits on the response time for the queries.
To your other question, about how the app would know which database to point to, the proposed design is something like this:
A configuration database will keep track of the 'active' database, and the app will send its request based on that.
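For example, the lookup would be as simple as this (table and column names are placeholders, not our actual design):

-- Hypothetical configuration lookup; names are placeholders.
select DatabaseName
from ConfigDb.dbo.ActiveMart
-- The app then directs its query to whichever database this returns.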
Thanks
KR
October 9, 2007 at 12:15 pm
Greetings,
Hmmm... if it is a star schema, I don't see a performance reason why you cannot keep all your subsets in one place and query by period, especially if the period definitions are indexed.
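For instance, a period-sliced query against a star schema is usually just a filter on the date dimension (all object names here are invented for illustration):

-- Illustrative star-schema query; all object names are invented.
select d.FiscalPeriod, sum(f.ClaimAmount) as TotalClaims
from dbo.FactClaims f
join dbo.DimDate d on d.DateKey = f.DateKey
where d.FiscalPeriod = '2007-09'
group by d.FiscalPeriod
-- With indexes on DimDate.FiscalPeriod and FactClaims.DateKey,
-- this stays fast without splitting the data into separate databases.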
Maybe I do not fully understand your problem, but I am convinced it is not something "special" that needs such a radical approach.
I remember one client where we had all the insurance claims for the last 15 years for the different lines of business, sliced by different periods, regions, meteorology, amounts, credit scores, etc. It was HUGE, and our queries were almost all sub-second, with only a few taking 2-3 seconds.
Thank you
MBA
MCSE, MCDBA, MCSD, MCITP, IBM DB2 Expert, I-Net+, CIW
Proud member of the NRA
- Anti-gun laws prevent law-abiding citizens from buying guns and defending themselves against bad guys who do not care about the law and get their guns illegally.
- Democracy is 2 wolves and one sheep talking about their next dinner. Freedom is 2 wolves and one armed sheep with a .357 magnum talking about their next dinner.
October 9, 2007 at 12:26 pm
The main issue is that the 'data warehouse' has not really been designed well. It is in its infancy and still undergoing data discovery. We are working on it one portion at a time (no point saying it should have gone through thorough design first; it is what it is due to decisions made, and we have to deal with the data as it is right now). Also, this database will reside on a different server than the data warehouse.
Thanks
KR
October 10, 2007 at 4:05 am
Hi
What would be the approximate size of the database that you plan to create daily?
"Keep Trying"
October 10, 2007 at 7:46 am
I am still evaluating, but it is estimated to be around 1000 MB.
KR
October 10, 2007 at 8:41 am
Why would that take that long? 1000 MB is not very much anymore.
October 10, 2007 at 9:37 am
1000 MB is the final subset. The time I am talking about is for the tool to migrate the C-ISAM system to a staging SQL database, do data cleanup, transformations, and normalization, rebuild tables for reporting purposes, create indexes, etc. for the data warehouse. The final move of the subset of the data into the daily database should hardly take any time at all.
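For what it's worth, that last step is little more than something like this (object names invented for illustration):

-- Illustrative only: move the reporting subset into the daily mart.
insert into DailyMart.dbo.FactSales (SaleID, SaleDate, Amount)
select SaleID, SaleDate, Amount
from Warehouse.dbo.FactSales
where SaleDate >= dateadd(month, -3, getdate())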
I am beginning to feel that this is a complex scenario and that someone from the outside will have a hard time grasping all the intricacies. I could not possibly explain all the details in a thread. I guess I will need to evaluate this one myself.
Thanks for all the input
KR
October 10, 2007 at 11:56 am
If you really feel the need to build a new database every day, it would be simpler to just drop the old database when the load is done and rename the new one to the same name each day:
[font="Courier New"]use master
alter database MyDatabase set offline with rollback immediate
drop database MyDatabase
alter database MyDatabase_Load_New modify name = MyDatabase[/font]
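Note that WITH ROLLBACK IMMEDIATE rolls back any open transactions and disconnects everyone from the old database, so run this inside your maintenance window. It's also safest to make sure nothing is still connected to MyDatabase_Load_New before the rename.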
October 10, 2007 at 12:03 pm
Definitely simpler; I will keep that in mind.
Thanks much
KR
October 10, 2007 at 1:10 pm
I've seen previous instances of consultants dropping the entire database and reconstructing it, because solving the problem of processing incremental updates can take a long time, and depending on the cost of the consultant it can be more cost-effective to do this. The primary downside is that the larger the source database(s) grow, the longer the full reload will take. It is ultimately better to handle the incremental load properly, but the other way isn't a bad start either.