In this example, consider the Boxes table, shown in Listing 1, which our application needs to populate.
Our application has already loaded some data into our table, as represented by the script shown in Listing 2.
However, suppose that we then develop a new version
of our application, which enforces the following
rule when inserting rows into the Boxes table:
The height of a box must be less than or equal to the width, and the width must be less than or equal to the length.
At some later point, we are asked to develop a query
that returns all the boxes with at least one dimension that is greater
than 4 inches. With our new business rule in place we know (or at least
we think we know) that the longest dimension of any box is the length,
so all we have to do in our query is check for boxes with a length of
more than 4 inches. Listing 3 shows a query that meets these requirements, provided that every row obeys the rule.
Unfortunately, we have failed to ensure that our
existing data meets our business rule. This query will not return the
existing row, even though its largest dimension is 5 inches.
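The failure is easy to reproduce. The following is only a minimal sketch, not a copy of Listings 1 to 3: it uses Python's built-in sqlite3 module as a stand-in database, and the column names (Label, LengthInInches, WidthInInches, HeightInInches) and the sample row are assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema and data: column names and values are assumptions,
# not the contents of Listings 1 and 2.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Boxes (
        Label          TEXT PRIMARY KEY,
        LengthInInches REAL NOT NULL,
        WidthInInches  REAL NOT NULL,
        HeightInInches REAL NOT NULL
    )
    """
)

# A row loaded before the rule existed: its height (5) exceeds its length (3).
conn.execute("INSERT INTO Boxes VALUES ('Legacy box', 3, 4, 5)")

# A query that trusts the rule "height <= width <= length".
length_only = conn.execute(
    "SELECT Label FROM Boxes WHERE LengthInInches > 4"
).fetchall()

print(length_only)  # prints [] because the legacy row is silently skipped
```

The query raises no error and returns no warning; it simply gives an incomplete answer, which is what makes this kind of assumption hard to spot.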
As usual, we can either eliminate our assumption,
which will involve writing a more complex query that does not rely on
it, or we can clean up our data, assume that it will stay clean, and
leave our query alone. Unfortunately, the assumption that the data will
"stay clean" is a dangerous one when data integrity rules are enforced
only in the application. Our application may have bugs and, in some cases, may
fail to enforce the rule. Some clients may continue to run the old
version of the application, which does not enforce the new business rule
at all. Some data may be loaded by means other than the application,
such as through SSMS, thereby bypassing the rule enforcement
altogether. All too many developers completely overlook these
possibilities, assuming that enforcing business rules only in the
application is safe. In reality, data integrity logic housed in the
application layer is frequently bypassed.
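For comparison, the first option mentioned above, a query that does not rely on the assumption, might look like the sketch below. It again uses Python's sqlite3 module as a stand-in database, and the column names and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Boxes (
        Label          TEXT PRIMARY KEY,
        LengthInInches REAL NOT NULL,
        WidthInInches  REAL NOT NULL,
        HeightInInches REAL NOT NULL
    )
    """
)

# Hypothetical sample data: one row that breaks the rule, two that obey it.
conn.executemany(
    "INSERT INTO Boxes VALUES (?, ?, ?, ?)",
    [
        ("Legacy box", 3, 4, 5),  # height is the largest dimension
        ("Small box", 3, 2, 1),
        ("Long box", 6, 2, 1),
    ],
)

# Check every dimension instead of trusting that length is the largest.
rows = conn.execute(
    """
    SELECT Label
    FROM Boxes
    WHERE LengthInInches > 4
       OR WidthInInches > 4
       OR HeightInInches > 4
    ORDER BY Label
    """
).fetchall()

big_boxes = [row[0] for row in rows]
print(big_boxes)  # prints ['Legacy box', 'Long box']
```

This version is correct whether or not the data obeys the rule, at the cost of a longer predicate that every similar query would have to repeat.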
As a result, it is quite possible that we will have data in the Boxes
table that does not meet our business rule, and that we will
have to repeat any "data clean-up" process many times. Some shops run
such data clean-ups weekly or even daily. In short, although we can use
our applications to enforce our data integrity rules, and although this
may seem to be the fastest way to get things done in the short term, it
is an approach that is inefficient in the long run.
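A recurring clean-up usually starts with an audit query that finds the rows breaking the rule. The sketch below shows one possible shape for it; the helper name find_rule_violations, the column names, and the sample rows are all hypothetical, and sqlite3 again stands in for the real database.

```python
import sqlite3


def find_rule_violations(conn):
    """Return labels of boxes that break 'height <= width <= length'."""
    rows = conn.execute(
        """
        SELECT Label
        FROM Boxes
        WHERE HeightInInches > WidthInInches
           OR WidthInInches > LengthInInches
        ORDER BY Label
        """
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Boxes (
        Label          TEXT PRIMARY KEY,
        LengthInInches REAL NOT NULL,
        WidthInInches  REAL NOT NULL,
        HeightInInches REAL NOT NULL
    )
    """
)

# Hypothetical sample data: two rows that break the rule, one that obeys it.
conn.executemany(
    "INSERT INTO Boxes VALUES (?, ?, ?, ?)",
    [
        ("Legacy box", 3, 4, 5),  # height > width
        ("Tidy box", 5, 4, 3),    # obeys the rule
        ("Wide box", 2, 3, 1),    # width > length
    ],
)

violations = find_rule_violations(conn)
print(violations)  # prints ['Legacy box', 'Wide box']
```

Having to schedule and re-run a check like this is exactly the ongoing cost described above.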
Most of the arguments covered here may also apply to
enforcing data integrity logic in stored procedures, unless you are able
to implement a design in which all access goes through your stored
procedure layer and direct table access is forbidden.
Over the following sections, we'll discuss
how to use constraints and triggers, which are usually the preferred
ways to protect data integrity.
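As a small preview of the idea, the sketch below declares our rule as a CHECK constraint so that the database itself rejects bad rows, whichever client sends them. It is only an illustration under stated assumptions: it uses SQLite through Python's sqlite3 module rather than SQL Server, and the constraint name, column names, and try_insert helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Boxes (
        Label          TEXT PRIMARY KEY,
        LengthInInches REAL NOT NULL,
        WidthInInches  REAL NOT NULL,
        HeightInInches REAL NOT NULL,
        -- The business rule lives in the schema, not in the application.
        CONSTRAINT CHK_Boxes_Dimensions CHECK (
            HeightInInches <= WidthInInches
            AND WidthInInches <= LengthInInches
        )
    )
    """
)


def try_insert(label, length, width, height):
    """Attempt an insert; return True if the database accepted the row."""
    try:
        conn.execute(
            "INSERT INTO Boxes VALUES (?, ?, ?, ?)",
            (label, length, width, height),
        )
        return True
    except sqlite3.IntegrityError:
        # Raised by sqlite3 when the CHECK constraint fails.
        return False


accepted = try_insert("Tidy box", 5, 4, 3)    # obeys the rule
rejected = try_insert("Legacy box", 3, 4, 5)  # height > width

print(accepted, rejected)  # prints True False
```

With the rule declared this way, a query that checks only the length can rely on it, because no row that violates the rule can be stored in the first place.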