Contrary to popular opinion, the forms are not a
progressive methodology, but they do represent a progressive level of
compliance. Technically, you can't be in 2NF until 1NF has been met.
Don't plan to design an entity and move it through a first normal form
to a second normal form, and so on. Each normal form is simply a
different type of data integrity fault to be avoided.
First Normal Form (1NF)
The first normal form means the data is in an entity format, such that the following three conditions are met:
- Every unit of data is represented within scalar attributes. A scalar value is a value “capable of being represented by a point on a scale,” according to Merriam-Webster.
Every attribute must contain one unit of
data, and each unit of data must fill one attribute. Designs that embed
multiple pieces of information within an attribute violate the first
normal form. Likewise, if multiple attributes must be combined in some
way to determine a single unit of data, the attribute design is
incomplete.
- All data must be represented in unique attributes. Each
attribute must have a unique name and a unique purpose. An entity
should have no repeating attributes. If the attributes repeat, or the
entity is very wide, the entity is probably too broadly designed.
A design that repeats attributes, such as an order entity that includes item1, item2, and item3 attributes to hold multiple line items, violates the first normal form.
- All data must be represented within unique tuples. If the entity design requires or permits duplicate tuples, that design violates the first normal form.
If the design requires multiple tuples to
represent a single item, or multiple items are represented by a single
tuple, the table violates first normal form.
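The unique-tuple condition is usually enforced with a primary key. The following is a minimal sketch using Python's built-in sqlite3 module; the Customer table, its columns, and the sample row are assumptions made up for illustration.

```python
import sqlite3


def insert_is_rejected(conn, row):
    """Try to insert a (CustomerID, Name) row; return True if the database refuses it."""
    try:
        conn.execute("INSERT INTO Customer (CustomerID, Name) VALUES (?, ?)", row)
    except sqlite3.IntegrityError:
        # The primary key already holds this CustomerID, so the tuple is a duplicate.
        return True
    return False


# Hypothetical entity: the primary key guarantees that every tuple is unique.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT NOT NULL)"
)

first_rejected = insert_is_rejected(demo_conn, (1, "Ann"))      # new tuple, accepted
duplicate_rejected = insert_is_rejected(demo_conn, (1, "Ann"))  # same tuple again
```

Without the primary key, the second insert would succeed and the entity would hold two indistinguishable tuples.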
For an example of the first normal form in
action, assume that you have a listing of customers and each customer
can have multiple phone numbers. Table 1 shows customer data in a model that violates the first normal form. The repeating phone number attribute is not unique.
Table 1 Violating the First Normal Form
To redesign the data model so that it complies
with the first normal form, resolve the repeating group of phone number
attributes into a single unique attribute, as shown in Table 2, and then move each of the multiple values into its own tuple. The Customer entity contains a unique tuple for each customer, and the PhoneNumber entity's CustomerID refers to the primary key in the Customer entity.
Table 2 Conforming to the First Normal Form
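The same redesign can be sketched in code with Python's built-in sqlite3 module. The Customer and PhoneNumber entity names and the CustomerID key come from the text; the Name and Phone columns and all sample rows are assumptions for illustration and are not taken from Table 1 or Table 2.

```python
import sqlite3


def build_customer_db():
    """Create Customer and PhoneNumber in an in-memory database (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
    conn.executescript(
        """
        CREATE TABLE Customer (
            CustomerID INTEGER PRIMARY KEY,
            Name       TEXT NOT NULL
        );
        -- One tuple per phone number instead of Phone1, Phone2, ... attributes.
        CREATE TABLE PhoneNumber (
            CustomerID INTEGER NOT NULL REFERENCES Customer(CustomerID),
            Phone      TEXT NOT NULL,
            PRIMARY KEY (CustomerID, Phone)
        );
        """
    )
    conn.executemany(
        "INSERT INTO Customer (CustomerID, Name) VALUES (?, ?)",
        [(1, "Ann"), (2, "Bob")],
    )
    conn.executemany(
        "INSERT INTO PhoneNumber (CustomerID, Phone) VALUES (?, ?)",
        [(1, "555-0100"), (1, "555-0101"), (2, "555-0102")],
    )
    conn.commit()
    return conn


def phones_for(conn, customer_id):
    """Return every phone number stored for one customer, sorted."""
    rows = conn.execute(
        "SELECT Phone FROM PhoneNumber WHERE CustomerID = ? ORDER BY Phone",
        (customer_id,),
    )
    return [row[0] for row in rows]
```

With this design a customer can have any number of phone numbers without a change to the schema, and `phones_for(conn, 1)` returns all of them with a single indexed lookup.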
Another example of a data structure that
desperately needs to adhere to the first normal form is a corporate
product code that embeds the department, model, color, size, and so
forth within the code. I've even seen product codes that were so
complex they included digits to signify the syntax for the following
digits.
In a theoretical sense, this type of design is
wrong because the attribute isn't a scalar value. In practical terms,
it has the following problems:
- Using a digit or two for each data element means that the database will soon run out of possible data values.
- Databases don't index based on the internal values of a string, so
searches require scanning the entire table and parsing each value.
- Business rules are difficult to code and enforce.
Entities with nonscalar attributes need
to be completely redesigned so that each individual data element has
its own attribute. Smart keys may be useful for humans, but a smart key
is best generated by combining data from the tables.
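As a rough sketch of that redesign, the following Python splits a legacy product code into scalar attributes and rebuilds a human-readable key from them on demand. The fixed-width layout (two characters for department, three for model, two for color, one for size), the sample code, and every name here are hypothetical; a real product code would need its own parsing rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    """Each data element lives in its own scalar attribute."""
    department: str
    model: str
    color: str
    size: str

    def smart_key(self) -> str:
        """Generate the human-friendly key from the attributes instead of storing it."""
        return f"{self.department}-{self.model}-{self.color}-{self.size}"


def parse_legacy_code(code: str) -> Product:
    """Split a code in the assumed 2+3+2+1 fixed-width layout, e.g. 'HW123RDL'."""
    if len(code) != 8:
        raise ValueError(f"expected 8 characters, got {len(code)}: {code!r}")
    return Product(
        department=code[0:2],
        model=code[2:5],
        color=code[5:7],
        size=code[7:8],
    )
```

Once the elements are separate attributes, each one can be indexed, constrained, and widened independently, and the smart key becomes a derived value rather than a stored one.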