Constraints are robust but, as we've discussed, they
are often not suitable for implementing more complex data integrity
rules. When such requirements arise, many developers turn to triggers.
Triggers allow a lot of flexibility; we can tuck pretty much any code
into the body of a trigger. Also, in most cases (though not all, as we
will see) triggers automatically fire when we modify data.
However, triggers are limited in the functionality they can
deliver, and they are hard to code correctly, which makes them
prone to weaknesses. As such, they are the cause of many
common data integrity issues. Some of the typical data integrity
problems related to triggers are as follows:
some triggers falsely assume that only one row at a time is inserted/updated/deleted;
some triggers falsely assume that the primary key columns will never be modified;
under some circumstances, triggers do not fire;
triggers may undo changes made by other triggers;
some triggers do not work under snapshot isolation levels.
Some of these problems can be fixed by improving the
triggers. However, not all of these problems mean that the trigger was
poorly coded – some are inherent limitations of triggers in general. For
example, in some cases the database engine does not fire a trigger, and
there is nothing we can change in the trigger to fix that problem.
We'll discuss each of these problems in detail over the coming sections.
Problems with multi-row modifications
In the following example, our goal is to record in a "change log" table any updates made to an item's Barcode. Listing 1 creates the change log table, ItemBarcodeChangeLog. Note that we deliberately omit a FOREIGN KEY constraint, because the change log has to be kept even after an item has been removed.
The FOR UPDATE trigger shown in Listing 2 is designed to populate the ItemBarcodeChangeLog table whenever a barcode is updated. When an UPDATE statement runs against the Items table, the trigger reads the Barcode value as it existed before the update, from the deleted virtual table, and stores it in a variable. It then reads the post-update Barcode value from the inserted virtual table and compares the two values. If the values are different, it logs the change in ItemBarcodeChangeLog. I have added a lot of debugging output, to make it easier to understand how the trigger works.
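For readers who want something concrete to run, the following is a minimal sketch of the kind of trigger just described. It is not a reproduction of Listings 1 and 2: the ItemLabel key column, the column types, and the trigger name are assumptions made for illustration, the debugging output is left out, and the script assumes SQL Server 2016 or later and a scratch database.

```sql
-- Illustrative sketch only; run in a scratch database.
-- Assumptions: ItemLabel, the column types and the trigger name are hypothetical.
DROP TABLE IF EXISTS dbo.ItemBarcodeChangeLog;
DROP TABLE IF EXISTS dbo.Items;
GO
CREATE TABLE dbo.Items
    (
      ItemLabel VARCHAR(10) NOT NULL PRIMARY KEY ,
      Barcode   VARCHAR(20) NOT NULL
    );
-- Deliberately no FOREIGN KEY to dbo.Items: log rows must outlive the item.
CREATE TABLE dbo.ItemBarcodeChangeLog
    (
      ItemLabel            VARCHAR(10) NOT NULL ,
      ModificationDateTime DATETIME    NOT NULL ,
      OldBarcode           VARCHAR(20) NOT NULL ,
      NewBarcode           VARCHAR(20) NOT NULL
    );
GO
CREATE TRIGGER dbo.Items_LogBarcodeChange ON dbo.Items
    FOR UPDATE
AS
BEGIN;
    DECLARE @ItemLabel  VARCHAR(10) ,
            @OldBarcode VARCHAR(20) ,
            @NewBarcode VARCHAR(20);

    -- Hidden assumption: deleted and inserted each hold exactly one row.
    SELECT @ItemLabel = ItemLabel ,
           @OldBarcode = Barcode
    FROM   deleted;

    SELECT @NewBarcode = Barcode
    FROM   inserted;

    -- Log the change only if the barcode actually changed.
    IF @OldBarcode <> @NewBarcode
        INSERT INTO dbo.ItemBarcodeChangeLog
                ( ItemLabel, ModificationDateTime, OldBarcode, NewBarcode )
        VALUES  ( @ItemLabel, CURRENT_TIMESTAMP, @OldBarcode, @NewBarcode );
END;
GO
```

The two variable-assigning SELECT statements are where the single-row assumption lives.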
Listing 3 demonstrates how this trigger works when we perform a single-row update.
Our trigger works for single-row updates, but how does it handle multi-row updates? Listing 4 empties the change log table, adds one more row to the Items table, updates two rows in the Items table, and then interrogates the log table, dbo.ItemBarcodeChangeLog, to see what has been saved.
Our trigger does not handle the multi-row update properly; it silently inserts only one row
into the log table. Note that I say "inserts only one row," rather than
"logs only one change." The difference is important: if we modify two
or more rows, there is no guarantee that our trigger will record the OldBarcode and NewBarcode values associated with a single modified row. When we update more than one row, both the inserted and deleted virtual tables have more than one row, as shown by the debugging output in Listing 4.
The SELECT that populates the OldBarcode variable in our trigger will arbitrarily pick one of the two values, 123457 or 234567, listed in the "debugging output: data before update" section; nothing guarantees which one. The SELECT that populates NewBarcode works in the same way; it can choose either 1234579 or 2345679. In this case, it happens that the OldBarcode and NewBarcode
do come from one and the same modified row, and so the net effect is
that the trigger appears to log only one of the updates, albeit
correctly. In fact, this was just chance; it could equally well have
taken the OldBarcode from one row and the NewBarcode from the other, the net effect being a single, erroneous log record.
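The variable-assignment behavior can be seen in isolation, without any trigger. The sketch below uses a table variable as a stand-in for the deleted virtual table; the ItemLabel values and column types are made up for illustration, and the barcodes are the "before" values quoted above.

```sql
-- Stand-in for the deleted virtual table after a two-row UPDATE.
-- ItemLabel values and column types are hypothetical.
DECLARE @deleted TABLE
    (
      ItemLabel VARCHAR(10) NOT NULL ,
      Barcode   VARCHAR(20) NOT NULL
    );

INSERT INTO @deleted ( ItemLabel, Barcode )
VALUES  ( 'Item1', '123457' ),
        ( 'Item2', '234567' );

DECLARE @OldBarcode VARCHAR(20);

-- The assignment succeeds without error or warning, but the variable
-- keeps the value from only one of the rows, and which row is not guaranteed.
SELECT @OldBarcode = Barcode
FROM   @deleted;

SELECT @OldBarcode AS OldBarcode ,
       ( SELECT COUNT(*) FROM @deleted ) AS RowsInDeleted;
```

The final query returns one barcode alongside a row count of 2, which is exactly the silent loss of information that the trigger suffers from.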
In short, the logic used in this trigger does not
work for multi-row updates; it contains a "hidden" assumption that only
one row at a time will be updated. We cannot easily enforce that
assumption in this situation, so the only way to remove it is to
rewrite the trigger from scratch, as shown in Listing 5. This time,
rather than store the old and new values in variables, we use the inserted and deleted
virtual tables directly, and populate the change log table via a
set-based query that joins those virtual tables and correctly handles
multi-row updates.
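A set-based trigger of this kind might look roughly like the following sketch. It is not a reproduction of Listing 5: the ItemLabel key column, the column types, the trigger name, and the sample rows are assumptions, and the script assumes SQL Server 2016 SP1 or later and a scratch database. Note that joining inserted to deleted on the key relies on the key itself not being changed by the UPDATE.

```sql
-- Illustrative sketch only; run in a scratch database.
-- Assumptions: ItemLabel, the column types and the trigger name are hypothetical.
DROP TABLE IF EXISTS dbo.ItemBarcodeChangeLog;
DROP TABLE IF EXISTS dbo.Items;
GO
CREATE TABLE dbo.Items
    (
      ItemLabel VARCHAR(10) NOT NULL PRIMARY KEY ,
      Barcode   VARCHAR(20) NOT NULL
    );
CREATE TABLE dbo.ItemBarcodeChangeLog
    (
      ItemLabel            VARCHAR(10) NOT NULL ,
      ModificationDateTime DATETIME    NOT NULL ,
      OldBarcode           VARCHAR(20) NOT NULL ,
      NewBarcode           VARCHAR(20) NOT NULL
    );
GO
CREATE OR ALTER TRIGGER dbo.Items_LogBarcodeChange ON dbo.Items
    FOR UPDATE
AS
BEGIN;
    -- One INSERT ... SELECT logs every modified row whose barcode changed.
    INSERT INTO dbo.ItemBarcodeChangeLog
            ( ItemLabel, ModificationDateTime, OldBarcode, NewBarcode )
    SELECT  d.ItemLabel ,
            CURRENT_TIMESTAMP ,
            d.Barcode ,
            i.Barcode
    FROM    inserted AS i
            INNER JOIN deleted AS d
                ON i.ItemLabel = d.ItemLabel   -- assumes the key is not updated
    WHERE   d.Barcode <> i.Barcode;
END;
GO
-- Quick check with a two-row update.
INSERT INTO dbo.Items ( ItemLabel, Barcode )
VALUES  ( 'Item1', '123457' ),
        ( 'Item2', '234567' );

UPDATE dbo.Items
SET    Barcode = Barcode + '9';

SELECT ItemLabel, OldBarcode, NewBarcode
FROM   dbo.ItemBarcodeChangeLog
ORDER BY ItemLabel;
```

With this version, the two-row UPDATE should produce two log rows, each pairing the old and new barcode of the same item.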
Rerunning Listing 4 verifies that our altered trigger now handles multi-row updates.
The first lesson here is that, when developing
triggers, the defensive programmer should always use proper set-based
logic, rather than iterative logic.