June 14, 2022 at 2:15 pm
This is coming from a novice in SQL with only a few months' experience. I have a table that I would like to normalize by splitting it into four tables, and I have a plan for how I want to do this. I'll outline how I think I should go about it; please correct me if I am wrong or heading in the wrong direction.
Thanks in advance
June 14, 2022 at 2:43 pm
That's an ELT (Extract, Load, & Transform) strategy using a staging table, as opposed to an ETL strategy (in which you would do the transformation before importing, loading directly into the destination tables). Both are valid & common. The ELT strategy takes advantage of the power/features of SQL Server & TSQL. ETL strategies tend to use tools like SSIS or similar 3rd-party tools to transform the data.
One recommendation: Swap steps 6 & 7 -- do not drop your original table until you have successfully created foreign keys (relationships) & other constraints to ensure your data and data model are valid.
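As a rough sketch of that ordering (all table and constraint names here are hypothetical, just to illustrate):

```sql
-- Step 6 (revised): create the foreign keys first. If the data violates
-- the relationship, this ALTER fails and you still have the original table.
ALTER TABLE dbo.OrderDetail
    ADD CONSTRAINT FK_OrderDetail_Category
    FOREIGN KEY (CategoryID) REFERENCES dbo.Category (CategoryID);

-- Step 7 (revised): only after every constraint has been created
-- successfully, drop the original/staging table.
DROP TABLE dbo.StagingOrders;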
June 14, 2022 at 2:58 pm
The tricky part of normalization is populating the tables, imo. A lot of times it's an attempt to establish foreign key referential integrity after the fact. You could divide your tables into two categories: first, lookup tables (without any FKs), and second, tables with FKs. To populate the lookup tables, assign a unique primary key on INSERT of the distinct values. To INSERT into foreign key columns, JOIN on the lookup value(s) to SELECT the primary key of the lookup table.
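A minimal sketch of those two steps, assuming a staging table with a free-text CategoryName column (all names hypothetical):

```sql
-- Lookup table: IDENTITY assigns the surrogate primary key on INSERT.
CREATE TABLE dbo.Category (
    CategoryID   int IDENTITY(1,1) PRIMARY KEY,
    CategoryName varchar(100) NOT NULL UNIQUE
);

-- Step 1: load the distinct values into the lookup table.
INSERT INTO dbo.Category (CategoryName)
SELECT DISTINCT CategoryName
FROM dbo.StagingOrders
WHERE CategoryName IS NOT NULL;

-- Step 2: populate the FK column by joining back on the lookup value
-- to translate the text into its surrogate key.
INSERT INTO dbo.OrderDetail (OrderID, CategoryID, Amount)
SELECT s.OrderID, c.CategoryID, s.Amount
FROM dbo.StagingOrders AS s
JOIN dbo.Category AS c
    ON c.CategoryName = s.CategoryName;
```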
No one shall be able to drive us from the paradise that Cantor created for us
June 14, 2022 at 4:32 pm
@ratbak Just curious: if I did follow an ETL approach in SQL, would I import the table, transform it using SQL, and then create a new table(s) to insert the values into? Good advice; I'll be sure to drop the table at the very end. Thanks.
Steve Collins I'm embarrassed to say I've not used lookup tables in SQL yet, but I'll be sure to learn the technique and use it. It's definitely a method I should know. Thanks.
June 14, 2022 at 10:27 pm
I would do the normalization on paper first. That will give you a chance to see if you are going through the normalization process correctly without wasting effort in building your tables on the fly. Doing this will also allow you to create the DRI while creating the tables and help you see the steps for importing the data from the source table to the destination tables.
June 15, 2022 at 7:18 am
My fault, I should have been more specific. The steps outlined above assume I already know what the model will look like (so after I've done the normalization on paper). Out of curiosity, what does DRI stand for?
Also, if I did follow an ETL approach in SQL, would I import the table, transform it using SQL, and then create a new table(s) to insert the values into?
June 15, 2022 at 12:29 pm
As suggested above by Lynn Pettis, the normalisation should be done on paper first. Then you build your (initially empty) tables based on what you have decided.
After that, build a process which moves the data into the new structures. The process which transforms your data should be kept separate from any scripts you use to build your database objects (tables, keys, foreign key relationships, constraints ... )
DRI stands for declarative referential integrity. It refers to the use of primary and foreign keys to maintain data integrity.
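A bare-bones illustration of DRI (table names are made up): the FOREIGN KEY declaration below makes the engine itself reject any Order row whose CustomerID has no matching Customer row.

```sql
CREATE TABLE dbo.Customer (
    CustomerID   int IDENTITY(1,1) PRIMARY KEY,
    CustomerName varchar(100) NOT NULL
);

CREATE TABLE dbo.[Order] (
    OrderID    int IDENTITY(1,1) PRIMARY KEY,
    CustomerID int NOT NULL,
    -- The declarative part: integrity is enforced by the constraint,
    -- not by application code or triggers.
    CONSTRAINT FK_Order_Customer FOREIGN KEY (CustomerID)
        REFERENCES dbo.Customer (CustomerID)
);
```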
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
June 15, 2022 at 7:14 pm
I appreciate your help. You mentioned:
"The process which transforms your data should be kept separate from any scripts you use to build your database objects (tables, keys, foreign key relationships, constraints ... )"
Does that mean the transformation usually takes place outside of SQL? If it were to take place in SQL, does that mean the table is imported, transformed, and then inserted into the normalized tables?
Thank you
June 15, 2022 at 9:18 pm
All depends on how you set up the import. If you stage the data in a staging table, then you are going to use an ELT process and you will do your transformations using SQL.
Either way, you want to be sure to populate the parent table(s) first, then the child tables moving down each level in order. That is why you want to do the normalization on paper so you can determine the order of the imports.
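Sketched out, a parent-first load might look like this (a hypothetical two-level model; the same pattern continues down each level):

```sql
-- Level 1: parent table with no FKs -- load it first.
INSERT INTO dbo.Customer (CustomerName)
SELECT DISTINCT CustomerName
FROM dbo.StagingOrders;

-- Level 2: child table with an FK to Customer -- the parent rows
-- already exist, so the FK check succeeds on every row.
INSERT INTO dbo.[Order] (CustomerID, OrderDate)
SELECT c.CustomerID, s.OrderDate
FROM dbo.StagingOrders AS s
JOIN dbo.Customer AS c
    ON c.CustomerName = s.CustomerName;
```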