Verifying a master-detail relationship (Article)

Description :: Using on-commit triggers to do what other constraints can't

It can be a vexing problem: your database contains a simple master/detail relationship, and you want to make absolutely sure that a transaction is never committed with a master who has no detail records.

The master and detail records get inserted separately, not by a single monolithic stored procedure that can check everything. In a different world, perhaps such stored procedures would be common; most database products don't let you define sufficiently complex stored procedures (allowing arrays of records as parameters, for example,) and even if they did, just imagine a relational database in which you're not allowed to simply insert into the right tables because of the difficulty of constraint checking? To be forced to call a huge stored procedure that takes care of inserting into master and detail tables, performing any required updates and deletions along the way, then checking that everything is okay? It's not just annoying, it's relationally ugly!

The detail table contains a foreign key to the master table, and it's not-null, so you must insert the master record first, which means a trigger on the master record won't be able to see any detail records, and will always fail. Maybe this constraint can be relaxed a bit: you could remove the foreign key constraint, pre-select the master record's primary key value, insert the detail records with the expected value of the master record's primary key, then insert the master -- the trigger in the master would now notice the child record. But you've now lost a foreign key constraint, allowing all sorts of alternate problems, and you can't replace it with a check constraint because it would still detect the problem and fail. (Replacing a foreign key constraint with a check constraint can be useful in a few circumstances, such as confusing the server into not noticing record locks on master records when inserting into detail tables, but it doesn't help us here.)

This also means that any application (or user) interacting with these tables must keep this entire process in mind, and perform it in exactly the right order; accidentally inserting the master record first would result in failure. This would also make bulk imports more complicated: if you're assigning new primary key values to imported master records, you must do that first (without inserting into the master table), then insert the detail records, then insert the master records.

And what about deletions from the detail table? For the method above to work, any change to the detail table would need to perform an update on the master record so its insert-or-update trigger would fire and re-check that child records are present. That would result in a record lock on the master record anytime a child record changes, which isn't ideal, but as we'll see with my solution below, that's not something we can necessarily solve.

There may also be situations of mutual-overlap and circular-dependency (but without getting into the realm of circular-dependency among foreign key constraints, as those get far nastier) where no clear order of events emerges that would allow the trigger method to work. I'll leave inventing those as an exercise for the reader.

Another solution to explore is the use of check constraints that perform a sub-query; I hear they're not recommended, as they only fire when that row itself is touched. In theory though, you could place a check constraint on the master table that requires that a detail record exist (add constraint ... check (exists(select ... from detail_table where ...))) and triggers on detail tables to make sure that constraint fires every time a change is made to detail tables, but this would still check the constraint immediately, causing problems with the initial insert. It's essentially just another way of stating the same constraint you'd put in an on-update-or-insert trigger on the master table, and suffers from the same problems.

Rather than getting rid of foreign key constraints, creating monolithic stored procedures, or inventing complicated insertion strategies, what if we could use the database to perform a simple constraint check for us the way we do with everything else?

The code below assumes you have access to features from Firebird 2.1. It uses global temporary tables (transaction-scoped) and on-commit triggers, along with regular features -- stored procedures, triggers, and exceptions.

If any changes are made to a master record or a detail record, the master is added to a list of masters to check before committing this transaction. This prevents us from rechecking all masters every time we commit, and being forced to fix data that we didn't touch -- this could be important if you're creating a new rule but haven't fixed old data. The procedures to check an individual master record or the whole set of masters we've touched are available to call individually; this means that your code inserting into these tables can periodically check its progress to make sure it won't run into unexpected problems when committing, yet forgetting to do so won't compromise database integrity. You need to keep in mind that with this solution, you may not run into any exceptions at all until you go to commit your transaction; if your code assumes that SQL exceptions can only be thrown during insert/update/delete/call operations, you'll get a nasty surprise.

 create global temporary table masters_touched
 (
  id integer not null
 );

 CREATE EXCEPTION bad_master 'exception message';

 SET TERM ^ ;
 CREATE PROCEDURE check_master 
  (master_id integer not null) 
 AS 
  declare variable details_count integer = 0;
 BEGIN
  select count(*) 
  from detail_table 
  where master_id = :master_id
   into :details_count;
  if (details_count < 1) then 
   exception bad_master 'Master ' || :master_id || ' needs at least one detail record';
 END^
 SET TERM ; ^

 SET TERM ^ ;
 CREATE TRIGGER trigger_check_master FOR master_table
  AFTER INSERT OR UPDATE
 AS 
 BEGIN
  insert into masters_touched (id) values (new.ID);
 END^
 SET TERM ; ^

 SET TERM ^ ;
 CREATE TRIGGER trigger_check_detail FOR detail_table
  AFTER INSERT OR UPDATE OR DELETE
 AS 
 BEGIN
  if (inserting or updating) then 
   insert into masters_touched (id) values (new.master_id);
  else 
   insert into masters_touched (id) values (old.master_id);
 END^
 SET TERM ; ^

 SET TERM ^ ;
 CREATE PROCEDURE check_masters
 AS 
 declare variable master_id integer;
 BEGIN
  for select id from masters_touched into :master_id do
   execute procedure check_master(master_id);
 END^
 SET TERM ; ^

 SET TERM ^ ;
 CREATE TRIGGER check_masters_on_commit on transaction commit
 AS 
 BEGIN 
  execute procedure check_masters;
 END^
 SET TERM ; ^

One additional problem: if two transactions simultaneously make changes that, individually, are fine, but together are not, our solution won't notice. If a master record had two detail records, and two concurrent transations each deleted just one of those detail records, each on-commit trigger will see one record remaining, and the test will pass, allowing the commit to succeed. In the end, however, you'll be left with zero detail records. As seemingly always with these situations, the solution is record locking; rules related to an item should probably lock that item to make sure they're seeing the whole picture; regardless of why the master record is being re-checked, lock it. You can do that in the various triggers (updating a detail record would lock the parent) or in the checking procedure, it doesn't much matter -- the earlier you do it, the earlier the offending (losing) transaction will be notified. If you wait to do it in the on-commit trigger, you pretty much guarantee that the first transaction to attempt to commit wins the competition. That's up to you.

 ... (at top of check_master procedure)
 update master_table set id = id where id = :master_id;
 if (row_count < 1) then exit;
 ...

Checking row_count is intended to protect you from the case where you attempt to delete the master and its detail rows, all at once; its primary key value will be in the temporary table, the stored procedure will be called, and you don't want to fail the commit simply because no, there aren't any more detail records for it, but that's because the master is gone too!

And if you feel like cleaning things up a bit, to avoid re-checking records you've already checked, consider adding a distinct to that select in procedure check_masters, and/or only inserting into the temp table if the value isn't already there; you could also remove entries from the temporary table after the records have been checked, so an application that checks records as it goes won't require the system to recheck all of them on commit; they would only be rechecked if they had been touched since the last verification. That could save time. Don't remove them if you throw an exception, however, as the application could ignore the exception it received from an ad-hoc check, then continue and commit anyway.

Any time you mess with Firebird on-commit triggers (on-connect as well, I would think,) remember that you have isql -nodbtriggers available to you, just in case you manage to create or alter a trigger such that it always throws an exception; re-altering the trigger will require you to be able to commit, and your existing (bad) trigger will throw an exception before you can get a chance to replace it with one that won't. The no-db-trigger mode will let you modify the trigger, commit, then go back to your favorite schema-editing tool.

Continued at top