June 14, 2022 at 2:15 pm
This is coming from a novice in SQL with only a few months' experience. I have a table that I would like to normalize by splitting it into four tables, and I have a plan for how I want to do this. I'll outline how I think I should go about it; please correct me if I am wrong or heading in the wrong direction.
Thanks in advance
June 14, 2022 at 2:43 pm
That's an ELT (Extract, Load, & Transform) strategy using a staging table, as opposed to an ETL strategy (in which you would do the transformation before importing, loading directly into the destination tables). Both are valid & common. The ELT strategy takes advantage of the power/features of SQL Server & TSQL. ETL strategies tend to use tools like SSIS or similar 3rd-party tools to transform the data.
One recommendation: Swap steps 6 & 7 -- do not drop your original table until you have successfully created foreign keys (relationships) & other constraints to ensure your data and data model are valid.
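As a rough sketch of that ordering (all table and constraint names here are hypothetical, just to illustrate):

```sql
-- Step 6 (revised): create the foreign keys first. If the data violates
-- the relationship, this ALTER fails and you still have the original table.
ALTER TABLE dbo.OrderDetail
    ADD CONSTRAINT FK_OrderDetail_Category
    FOREIGN KEY (CategoryID) REFERENCES dbo.Category (CategoryID);

-- Step 7 (revised): only after every constraint has been created
-- successfully, drop the original/staging table.
DROP TABLE dbo.StagingOrders;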
June 14, 2022 at 2:58 pm
The tricky part of normalization is populating the tables, imo. A lot of times it's an attempt to establish foreign key referential integrity after the fact. You could divide your tables into two categories: first, lookup tables (without any FKs), and second, tables with FKs. To populate the lookup tables, assign a unique primary key on INSERT of the distinct values. To INSERT into foreign key columns, JOIN on the lookup value(s) to SELECT the primary key of the lookup table.
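A minimal sketch of those two steps, assuming a staging table with a free-text CategoryName column (all names hypothetical):

```sql
-- Lookup table: IDENTITY assigns the surrogate primary key on INSERT.
CREATE TABLE dbo.Category (
    CategoryID   int IDENTITY(1,1) PRIMARY KEY,
    CategoryName varchar(100) NOT NULL UNIQUE
);

-- Step 1: load the distinct values into the lookup table.
INSERT INTO dbo.Category (CategoryName)
SELECT DISTINCT CategoryName
FROM dbo.StagingOrders
WHERE CategoryName IS NOT NULL;

-- Step 2: populate the FK column by joining back on the lookup value
-- to translate the text into its surrogate key.
INSERT INTO dbo.OrderDetail (OrderID, CategoryID, Amount)
SELECT s.OrderID, c.CategoryID, s.Amount
FROM dbo.StagingOrders AS s
JOIN dbo.Category AS c
    ON c.CategoryName = s.CategoryName;
```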
No one shall be able to drive us from the paradise that Cantor created for us
June 14, 2022 at 4:32 pm
@ratbak Just curious: if I did follow an ETL approach in SQL, would I import the table, transform it using SQL, and then create a new table(s) to insert the values into? Good advice; I'll be sure to drop the table at the very end. Thanks.
Steve Collins I'm embarrassed to say I've not used lookup tables in SQL yet, but I'll be sure to learn the technique and use it. It's definitely a method I should know. Thanks.
June 14, 2022 at 10:27 pm
I would do the normalization on paper first. That will give you a chance to see if you are going through the normalization process correctly without wasting effort in building your tables on the fly. Doing this will also allow you to create the DRI while creating the tables and help you see the steps for importing the data from the source table to the destination tables.
June 15, 2022 at 7:18 am
My fault, I should have been more specific. The steps outlined above assume I already know what the model will look like (so after I've done the normalization on paper). Out of curiosity, what does DRI stand for?
Also, if I did follow an ETL approach in SQL, would I import the table, transform it using SQL, and then create a new table(s) to insert the values into?
June 15, 2022 at 12:29 pm
As suggested above by Lynn Pettis, the normalisation should be done on paper first. Then you build your (initially empty) tables based on what you have decided.
After that, build a process which moves the data into the new structures. The process which transforms your data should be kept separate from any scripts you use to build your database objects (tables, keys, foreign key relationships, constraints ... )
DRI stands for declarative referential integrity. It refers to the use of primary and foreign keys to maintain data integrity.
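A bare-bones illustration of DRI (table names are made up): the FOREIGN KEY declaration below makes the engine itself reject any Order row whose CustomerID has no matching Customer row.

```sql
CREATE TABLE dbo.Customer (
    CustomerID   int IDENTITY(1,1) PRIMARY KEY,
    CustomerName varchar(100) NOT NULL
);

CREATE TABLE dbo.[Order] (
    OrderID    int IDENTITY(1,1) PRIMARY KEY,
    CustomerID int NOT NULL,
    -- The declarative part: integrity is enforced by the constraint,
    -- not by application code or triggers.
    CONSTRAINT FK_Order_Customer FOREIGN KEY (CustomerID)
        REFERENCES dbo.Customer (CustomerID)
);
```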
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
June 15, 2022 at 7:14 pm
I appreciate your help. You mentioned:
"The process which transforms your data should be kept separate from any scripts you use to build your database objects (tables, keys, foreign key relationships, constraints ... )"
Does that mean the transformation usually takes place outside of SQL? If it were to take place in SQL, does that mean the table is imported, transformed, and then inserted into the normalized tables?
Thank you
June 15, 2022 at 9:18 pm
All depends on how you set up the import. If you stage the data in a staging table, then you are going to use an ELT process and you will do your transformations using SQL.
Either way, you want to be sure to populate the parent table(s) first, then the child tables moving down each level in order. That is why you want to do the normalization on paper so you can determine the order of the imports.
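Sketched out, a parent-first load might look like this (a hypothetical two-level model; the same pattern continues down each level):

```sql
-- Level 1: parent table with no FKs -- load it first.
INSERT INTO dbo.Customer (CustomerName)
SELECT DISTINCT CustomerName
FROM dbo.StagingOrders;

-- Level 2: child table with an FK to Customer -- the parent rows
-- already exist, so the FK check succeeds on every row.
INSERT INTO dbo.[Order] (CustomerID, OrderDate)
SELECT c.CustomerID, s.OrderDate
FROM dbo.StagingOrders AS s
JOIN dbo.Customer AS c
    ON c.CustomerName = s.CustomerName;
```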