What it means to atomize values in 1st normal form

Question

What it means to atomize values in 1st normal form

michael.leach2015

SSCommitted

Points: 1597
More actions
February 4, 2019 at 7:25 pm

#382667

Suppose we have the following table. We have everything for the address packed into one cell. Also, we have two credit cards packed into one cell.
To solve the problem with the address being packed into one cell, we can break things out by creating a new column for each element:
For the two values in the credit card field if we added two credit card columns we would have two repeating groups of columns which violates the 1NF (or does that violate the 2NF)? So we create another table with credit card info:
My questions are these:
1. When we break out the address info, is this an example of atomizing the data?
2. When we break out the 2 credit card numbers, is this an example of atomizing the data?
In either case we are breaking things out to a more granular level so by that account, I would say yes to both questions. On the other hand, one situation results in a new table being created, and the other does not. So by that account I would say only one of those situations counts as atomizing the data.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

Sue_H SSC Guru Points: 90860 More actions · Answer 1

michael.leach2015 - Monday, February 4, 2019 7:25 PM
Suppose we have the following table. We have everything for the address packed into one cell. Also, we have two credit cards packed into one cell.
To solve the problem with the address being packed into one cell, we can break things out by creating a new column for each element:
For the two values in the credit card field if we added two credit card columns we would have two repeating groups of columns which violates the 1NF (or does that violate the 2NF)? So we create another table with credit card info:
My questions are these:
1. When we break out the address info, is this an example of atomizing the data?
2. When we break out the 2 credit card numbers, is this an example of atomizing the data?
In either case we are breaking things out to a more granular level so by that account, I would say yes to both questions. On the other hand, one situation results in a new table being created, and the other does not. So by that account I would say only one of those situations counts as atomizing the data.

It depends. It might be safer to say you're decomposing the data.
Atomic values is an area that is debated due to some differences in how it's defined, interpreted. This is an interesting read on some of the theoretical to practical views:
What Is the Actual Definition of First Normal Form (1NF)?

Sue

michael.leach2015 SSCommitted Points: 1597 More actions · Answer 2

Sue_H - Monday, February 4, 2019 9:29 PM
michael.leach2015 - Monday, February 4, 2019 7:25 PM
Suppose we have the following table. We have everything for the address packed into one cell. Also, we have two credit cards packed into one cell.
To solve the problem with the address being packed into one cell, we can break things out by creating a new column for each element:
For the two values in the credit card field if we added two credit card columns we would have two repeating groups of columns which violates the 1NF (or does that violate the 2NF)? So we create another table with credit card info:
My questions are these:
1. When we break out the address info, is this an example of atomizing the data?
2. When we break out the 2 credit card numbers, is this an example of atomizing the data?
In either case we are breaking things out to a more granular level so by that account, I would say yes to both questions. On the other hand, one situation results in a new table being created, and the other does not. So by that account I would say only one of those situations counts as atomizing the data.
It depends. It might be safer to say you're decomposing the data.
Atomic values is an area that is debated due to some differences in how it's defined, interpreted. This is an interesting read on some of the theoretical to practical views:
What Is the Actual Definition of First Normal Form (1NF)?
Sue

Thank you for the link. After viewing various videos and reading various websites, it seems that there is conflicting or ambiguous information on the definitions of each normal form. What is the governing body that establishes the rules of each normal form, i.e. what is the definitive source for defining normal forms?

Sue_H SSC Guru Points: 90860 More actions · Answer 3

michael.leach2015 - Monday, February 4, 2019 10:50 PM
Sue_H - Monday, February 4, 2019 9:29 PM
michael.leach2015 - Monday, February 4, 2019 7:25 PM
Suppose we have the following table. We have everything for the address packed into one cell. Also, we have two credit cards packed into one cell.
To solve the problem with the address being packed into one cell, we can break things out by creating a new column for each element:
For the two values in the credit card field if we added two credit card columns we would have two repeating groups of columns which violates the 1NF (or does that violate the 2NF)? So we create another table with credit card info:
My questions are these:
1. When we break out the address info, is this an example of atomizing the data?
2. When we break out the 2 credit card numbers, is this an example of atomizing the data?
In either case we are breaking things out to a more granular level so by that account, I would say yes to both questions. On the other hand, one situation results in a new table being created, and the other does not. So by that account I would say only one of those situations counts as atomizing the data.
It depends. It might be safer to say you're decomposing the data.
Atomic values is an area that is debated due to some differences in how it's defined, interpreted. This is an interesting read on some of the theoretical to practical views:
What Is the Actual Definition of First Normal Form (1NF)?
Sue
Thank you for the link. After viewing various videos and reading various websites, it seems that there is conflicting or ambiguous information on the definitions of each normal form. What is the governing body that establishes the rules of each normal form, i.e. what is the definitive source for defining normal forms?

The basics of the different norm forms are usually the same. In your case, Not having repeating groups, having a primary key, single values are generally agreed upon parts of the definition. Your example having two credit card numbers is probably a "less debatable" one to break down - you have multiple credit card numbers. In terms of addresses, that can vary depending on to what degree you break things down. It's only when you get towards the more theoretical areas where things start being debated.
Atomizing data would be one where things can be debated. That's why I would call it decomposing, just to avoid the debates. Not that the debates are all bad - we can all learn things from debates. And sometimes the best way to understand our own point of view is to understand the opposing points of view. At other times, all the debating can keep things from moving forward or from going anywhere. Or situations where whoever talks the most or the loudest is the one who is heard - which doesn't make that view correct.

Sue

Alan Burstein SSC Guru Points: 61141 More actions · Answer 4

michael.leach2015 - Monday, February 4, 2019 10:50 PM
Sue_H - Monday, February 4, 2019 9:29 PM
michael.leach2015 - Monday, February 4, 2019 7:25 PM
Suppose we have the following table. We have everything for the address packed into one cell. Also, we have two credit cards packed into one cell.
To solve the problem with the address being packed into one cell, we can break things out by creating a new column for each element:
For the two values in the credit card field if we added two credit card columns we would have two repeating groups of columns which violates the 1NF (or does that violate the 2NF)? So we create another table with credit card info:
My questions are these:
1. When we break out the address info, is this an example of atomizing the data?
2. When we break out the 2 credit card numbers, is this an example of atomizing the data?
In either case we are breaking things out to a more granular level so by that account, I would say yes to both questions. On the other hand, one situation results in a new table being created, and the other does not. So by that account I would say only one of those situations counts as atomizing the data.
It depends. It might be safer to say you're decomposing the data.
Atomic values is an area that is debated due to some differences in how it's defined, interpreted. This is an interesting read on some of the theoretical to practical views:
What Is the Actual Definition of First Normal Form (1NF)?
Sue
Thank you for the link. After viewing various videos and reading various websites, it seems that there is conflicting or ambiguous information on the definitions of each normal form. What is the governing body that establishes the rules of each normal form, i.e. what is the definitive source for defining normal forms?

What Sue_H said is pretty spot on IMHO -

"The basics of the different norm forms are usually the same. In your case, Not having repeating groups, having a primary key, single values are generally agreed upon parts of the definition. "

I don't normally recommend Wikipedia pages but I think the Database Normalization page and Wiki about the Relational Model are pretty spot-on. This Geeks for Geeks page is good too.

Note, too, a common violation of Normal Form is when people create IDs that look like this: AB-123555F. The AB represents something like a category, 1235555 is a unique number and F represents the 6th version of that Id. In this case AB-123555F is not atomic.

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

jcelko212 32090 SSCrazy Eights Points: 9303 More actions · Answer 5

michael.leach2015 - Monday, February 4, 2019 7:25 PM

>> Suppose we have the following table. We have everything for the address packed into one cell [sic]. Also, we have two credit cards packed into one cell [sic].<<

Please follow basic netiquette and post DDL instead of pictures. We get very very tired of transcribing from colored printouts into the DDL you should've done for us. Does your boss make you program from drawings on the back of cocktail napkins?

>> 1. When we break out the address info, is this an example of atomizing the data? <<
I never heard the term "atomize" in RDBMS. Did you mean normalize?

>> 2. When we break out the 2 credit card numbers, is this an example of atomizing the data? <<

Let's get some basic definitions. To be "atomic", a data element must have no meaningful decomposition. For example, splitting out a year, month and day from a date destroys its atomicity. This is why those three sub-elements are called "fields" in the SQL standards, as opposed to columns or attributes. In the relational database model, a data element has to be scalar. That means it's measured on a scale and has a single dimension (if you want to really get into measurement theory, it is mathematical and boring, but I mention it in my books).

Scalar and atomic are different. The pair (longitude, latitude) is atomic; either one of the two elements by itself has destroyed information about the location. But both of them are scalar, measured on a ratio scale.

In the case of your credit card numbers, they are distinct entities. They are not values of one entity. They should not be in the same row but attached to the customer or the account via a REFERENCES clause to a second table.

Please post DDL and follow ANSI/ISO standards when asking for help.

RonKyle SSC-Dedicated Points: 31557 More actions · Answer 6

Atomicity can be very dependent on the purpose of the database. In your address example, normally the dwelling number and street address are safely left together. But I had one database where the dwelling numbers where separate. This allowed for easy querying of number of houses on a given street. That was a specialized database, however, but the point remains. Normalizing doesn't take place in a vacuum. You have got to have data to review and you have to know the purposes for which the users need the data.