November 3, 2016 at 5:27 pm
Hi all,
I have a general question. I have a bunch of files/processes, about 30 of them, that we need to load into the DB. In previous years we basically didn't have any constraints, so my last step before loading was just to make sure in SQL that all the basic PK-FK relationships held (only orders whose product_ID exists in product_master, for customers who exist in Cust_Master, etc.).
This year we got a new, regulated DB with all constraints defined, so I'm thinking about how to do it. My plan:
1. Drop all constraints.
2. Truncate the tables (we need this step because there could be updates/reruns).
3. Load the data (* I think this is better because we don't care about the load order for those 30+ tables and can reuse the existing load package).
4. Restore the constraints (** and address any problems that surface):
   a. add PKs
   b. add FKs
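Roughly like this (a sketch only; the table and constraint names are made up for illustration):

-- 1. Drop constraints (FKs first, since a PK can't be dropped while a FK references it)
ALTER TABLE dbo.Orders DROP CONSTRAINT FK_Orders_Product_Master;
ALTER TABLE dbo.Orders DROP CONSTRAINT PK_Orders;
-- 2. Truncate so the load is rerunnable
TRUNCATE TABLE dbo.Orders;
-- 3. Load the data in any order with the existing package
-- 4. Restore constraints: PKs first, then FKs
ALTER TABLE dbo.Orders ADD CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (Order_ID);
ALTER TABLE dbo.Orders ADD CONSTRAINT FK_Orders_Product_Master
    FOREIGN KEY (Product_ID) REFERENCES dbo.Product_Master (Product_ID);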
I'd appreciate your comments. How do you do this?
Best
Mario
November 3, 2016 at 6:37 pm
You don't want to remove your primary key or clustered index from your permanent table.
Whenever I load data from flat files, I load them into a staging table first. If it's a huge load, that can be a temp table or a "scratch database" in simple recovery mode. Because the data is in a staging table, it can be loaded and validated before it reaches the permanent table. You can update the staging table to define your foreign key values, apply any cleansing rules you have, and selectively write the invalid data to "failure" tables if you need to. In short, you can operate on the data however you need to in order to ensure it's good.
Then, once you're satisfied, you can load it to your permanent tables.
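A minimal sketch of that flow, borrowing Mario's table names (the staging and failure tables are hypothetical):

-- Quarantine rows whose Product_ID has no match in Product_Master
INSERT INTO stage.Orders_Failures (Order_ID, Product_ID, Cust_ID, Failure_Reason)
SELECT s.Order_ID, s.Product_ID, s.Cust_ID, 'Unknown Product_ID'
FROM stage.Orders AS s
WHERE NOT EXISTS (SELECT 1 FROM dbo.Product_Master AS pm
                  WHERE pm.Product_ID = s.Product_ID);

DELETE s
FROM stage.Orders AS s
WHERE NOT EXISTS (SELECT 1 FROM dbo.Product_Master AS pm
                  WHERE pm.Product_ID = s.Product_ID);

-- Whatever survives validation goes to the permanent table
INSERT INTO dbo.Orders (Order_ID, Product_ID, Cust_ID)
SELECT Order_ID, Product_ID, Cust_ID
FROM stage.Orders;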
You could remove your foreign key constraints to help speed up the inserts, but that introduces the risk of inserting invalid values and then not being able to recreate the constraints. By using the staging table approach, you can validate before you load, insert missing rows into lookup tables, and so on. If you think removing the constraints will be a big enough improvement, test both approaches. Whatever you do, you don't want to end up with invalid data in your permanent table.
November 3, 2016 at 7:04 pm
Thanks, Ed.
I do have a temp/working DB without any PKs/FKs where I stage my files.
So I'll keep the PKs in the target DB. For the FKs I think I'll go with DROP/CREATE, since it's too much work to change the load order otherwise.
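One thing I'll probably do first is script the FK definitions out of the catalog so I can recreate them exactly afterwards. A rough sketch (handles single-column FKs only; multi-column ones would need the columns aggregated per constraint):

SELECT
    'ALTER TABLE ' + QUOTENAME(OBJECT_SCHEMA_NAME(fk.parent_object_id)) + '.'
        + QUOTENAME(OBJECT_NAME(fk.parent_object_id))
        + ' DROP CONSTRAINT ' + QUOTENAME(fk.name) AS DropStmt,
    'ALTER TABLE ' + QUOTENAME(OBJECT_SCHEMA_NAME(fk.parent_object_id)) + '.'
        + QUOTENAME(OBJECT_NAME(fk.parent_object_id))
        + ' ADD CONSTRAINT ' + QUOTENAME(fk.name)
        + ' FOREIGN KEY (' + QUOTENAME(COL_NAME(fkc.parent_object_id, fkc.parent_column_id)) + ')'
        + ' REFERENCES ' + QUOTENAME(OBJECT_SCHEMA_NAME(fk.referenced_object_id)) + '.'
        + QUOTENAME(OBJECT_NAME(fk.referenced_object_id))
        + ' (' + QUOTENAME(COL_NAME(fkc.referenced_object_id, fkc.referenced_column_id)) + ')' AS CreateStmt
FROM sys.foreign_keys AS fk
JOIN sys.foreign_key_columns AS fkc
    ON fkc.constraint_object_id = fk.object_id;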
M
November 3, 2016 at 8:31 pm
If these tables are of any size, definitely consider loading them with at least the Clustered Index already in place. To do it quickly and without blowing the log file up to a ridiculous size, consider "Minimal Logging" and, possibly, the use of Trace Flag 610.
During a recent demonstration (a precon with Ed as co-presenter), I built a 7-million-row, 7.3GB table with the CI already in place, and the log file never grew beyond 200MB. That was WITHOUT Trace Flag 610. Please see the following article for more on how to load massive amounts of data:
https://technet.microsoft.com/en-us/library/dd425070(v=sql.100).aspx
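For reference, the core of the recipe looks like this (a sketch with made-up names; the prerequisites in the article apply, e.g. SIMPLE or BULK_LOGGED recovery, a TABLOCK hint, and an empty target table unless Trace Flag 610 is enabled):

-- Minimally logged load into a table with its clustered index already in place
INSERT INTO dbo.Orders WITH (TABLOCK)
       (Order_ID, Product_ID, Cust_ID)
SELECT Order_ID, Product_ID, Cust_ID
FROM stage.Orders
ORDER BY Order_ID; -- presorting in clustered-key order keeps the insert minimally logged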
--Jeff Moden
Change is inevitable... Change for the better is not.
November 4, 2016 at 10:38 am
Thanks, Jeff, and all.
I'll go through your materials.
M