These rules were formulated by E.F.Codd and published in 1985 1). They describe what a
relational database system must support in order to call itself relational.
So, without further introduction, let's dive into the gospel of relational databases!
1. Information Rule
Data is presented only in one way. As values in columns in rows. Simple, consistent and versatile.
A table (aka an entity or relation) is a logical grouping of related data in columns and rows.
Each row (aka record or tuple) represents a single fact and each row contains information about just one
fact. Each column (aka field or attribute) describes a single property of an object. Each value (datum) is
defined by the intersection of column and row.
2. Guaranteed Access Rule
Each value can be accessed by specifying table name, primary key and column name. This way each and
every value is uniquely identifiable and accessible.
Basically this is a restatement of the fundamental requirement for primary keys.
3. Systematic treatment of Null values
A fully relational database system must offer a systematic way to handle missing information.
Null is always treated as unknown. Null does mean no value or the absence of a value. Because no value
was entered, it follows that it is unknown. The information is missing. Null is not the same as an empty
string or 0. Each value, Null included, compared with Null, is Null.
4. Dynamic relational online catalogue
In addition to user defined data a relational database contains also data about itself. So there are
two kinds of tables.
- user-defined
- system-defined
Metadata is data which describe the structure of the database, its objects and how they are related.
This catalogue is integrant part of the database and can be queried by authorized users just like any other
table. Another name for this online catalogue is system catalogue or data dictionary.
5. Comprehensive Data Sublanguage Rule
Codd's intention was to have at least one language to communicate with the database. This language should
be capable to handle data definition, data manipulation, authorization, integrity constraints and transactional
processing. It can be used both interactively and embedded within applications.
Although SQL is not the only data query language, it is by far the most common one. SQL is a linear,
non-procedural or declarative language. It allows the user to state what he wants from the database,
without explicitly stating where to find the data or how to retrieve the data.
6. View Updating Rule
When presenting data to the user, a relational database should not be limited
to tables. Views are 'virtual tables' or abstractions of the source tables. They react like tables with
the one exception that they are dynamically created when the query is executed. Defining a view does
not duplicate data. They are current at runtime.
All theoretically updateable views should be updateable by the system. If data is changed in a view, it
should also be changed in the underlying table. Updateable views are not always possible. For example there is a
problem when a view addresses only that part of a table that includes no candidate key. This could mean that
updates could cause entity integrity violations. Some sources on the internet state that 'Codd himself did not
fully understand this'. I haven't found any rationale for this.
7. High-level Insert, Update and Delete
A relational database system must support basic relational algebraic operations (Selection, Projection and
Joins) as well as set operations like Union, Intersection, Division and Difference. Rows are treated like
sets for data manipulation. Set operations and relational algebra are used to create new relations by
operations on other tables.
8. Physical Data Independence
The physical layer of the architecture is mapped onto the logical. Users and applications do not depend
upon the physical structure of a database. Implementation of the physical layer is job of the storage engine
of a RDBMS. The relational engine communicates with the relational without any interaction by the user.
An application that queries data from a relational database does not need to know, how this data is physically
stored. It only sends the data request and leaves the rest to the RDBMS. Applications should not be
logically impaired when the physical storage or access methods change.
9. Logical Data Independence
Users and applications are to a certain degree independent of the logical structure of a database. The
logical structure can be modified without redeveloping the database and/or the application. The relations
between tables can change without affecting the functionality of applications or ad-hoc queries.
10. Integrity Independence
To be viewed as a relational database the RDBMS must implement data integrity as an internal part of the
database. This is not the job of the application. Data integrity enforces consistence and correctness of the
data in the database. Simply put, it keeps the garbage out of the database. Changes to integrity constraints
should not have an affect on applications. This simplifies applications, but is not always possible.
11. Distribution Independence
The data manipulation (sub)language of the RDBMS should be able to work with distributed databases.
Views should be able to join data from tables on different servers (distributed queries) as well as from
different RDBMS (heterogeneous queries). The user should not have to be aware of whether a database is
distributed or not.
12. Nonsubversion Rule
If the RDBMS supports a low-level (single record at a time) language, this low-level language should not
be used to bypass and/or subvert data integrity that are expressed in the high-level (multiple records at a
time) relational language.
0. Foundation Rule
Interestingly Codd defined a Rule 0 for relational database systems.
"For any system that is advertised as, or claimed to be, a relational database management system,
that system must be able to manage databases entirely through its relational capabilities, no
matter what additional capabilities the system may support." (Codd, 1990)
That means, no matter what additional features a relational database might support, in order to be
truly
called relational it must comply with the 12 rules.
Codd added this rule in 1990. Also he expanded this 12 rules to 18 to include rules on catalogs, data
types (domains), authorization and other. 2)
Codd himself had to admit the fact that based on the above rules there is no fully relational database
system available. This has not changed since 1990. To be more specific rules 6, 9, 10, 11 and 12 seem to be
difficult to satisfy.
REFERENCES:
1)Codd, E.F. "Is Your DBMS Really Relational?" and "Does Your DBMS Run By the
Rules?" ComputerWorld, October 14 1985 and October 21 1985.
2)Codd, E.F. The Relational Model for Database Management, Version 2;
Addison-Wesley; 1990.