Handling overridable warnings in the database layer
Are you sure?
First, let us review why this topic even matters. In certain circles, the view is held that all data validation and integrity constraints should be handled by the database server itself. Why trust every application that ever touches it to know all the rules? Why trust yourself, when running one-off updates, not to mess anything up? All sorts of checking can be done: check constraints and domains can ensure that individual fields and entire rows are valid; traditional and on-commit triggers can add further validation across tables; occasionally a trigger may even have to take locks on rows so that two transactions, working together, can't circumvent the rules. Custom exception messages can be sent back to the user when an error is encountered (no matter what application they're using at the moment), and you can be sure the transaction won't commit until the errors are corrected. It's really quite wonderful; I recommend you not let yourself be talked out of it.
To be fair, there are some downsides to this, and they'll apply to other stuff below as well. While having all the constraints in the database is great for making sure it stays clean, it's not so great for letting your users know they're about to make a mistake: you only find out data is invalid when you actually attempt to save it. If you want to give users feedback as they're typing, you'll have to keep attempting to save the data (and roll back every attempt, even the successful ones) just to see whether it would fail if the user were to finish and save. That's not so great. It's obvious why a lot of applications instead choose to put all the validation client-side: they only have to write it once, and that one implementation gives them a handy way of informing the user, as they go, of the validity of the data. That's great for users, but bad for the database -- nobody's protecting the data from direct access, and when the client application gets rewritten, who's to say the rules will be kept intact? The other problem with database constraints is that they tend to be immediate; the first error is the only one anyone sees. You have to keep trying, correcting each error, until you eventually succeed.
 So what's all this about warnings, then?
 As a cautious (paranoid) database administrator, you've been careful to implement all of your constraints in the database. You took the time to create domains (in Firebird) so that similar fields could share constraints; you bravely created CHECK constraints everywhere you could; your primary, unique, and foreign key definitions are all properly formalized; you even went the extra step to define some triggers here and there to monitor for odd data changes that CHECK constraints wouldn't have noticed; some triggers ensure that locks are held to prevent concurrent transactions from performing similar tasks; in fact, you were so cautious that you even read up on deferred constraints or on-commit triggers and implemented some funky multi-table constraints! You're a big-folk among small-folk.
And yet, somehow, you feel you haven't done enough. You realize now that not everything is about hard constraints -- there are also soft constraints, warnings and reminders that users should ponder before they commit. The whole point of putting the constraints in the database was to ensure that changes to the middle and client tiers wouldn't risk populating the database with bad data; it centralized control and provided a single interface in which to define rules. You hated the idea of middleware or client systems (which you don't control) validating data and passing it off to your database as if it were no more than a persistence layer, a mere indexed file! But now ... now you must deal with warnings. How do you prompt your users to confirm their actions? You have no tools for this; there are no abstractions for it! Should you relinquish control? Should the client layer be responsible for identifying troublesome requests? Clients are easily circumvented! Should the middle tier be responsible, then? You've cleaned out all the other business logic from it; all those rules are now triggers or constraints, and the middle tier is nothing but a caching and proxying service ... is it not an ugly hack to add warnings at that layer?
First, you must decide what degree of safety you want on warnings. Should the middle or client tiers be able to override a warning without the user ever seeing it? Should they be able to override it without seeing it themselves? Should they find out about warnings right away, or only when they're ready? Should they be able to ask for a warning check, or will the database always check on its own schedule? You can't be too picky; these are, after all, only warnings. Anyone who writes a layer with the express intent of ignoring a warning does so at their own peril; if you're in an open environment, though, it may still be important to make sure they don't do so accidentally (or too easily). For that, you'll want to ensure at least that the middle or client tiers must see the warning and expressly ignore it.
 Let's start with a base example, preventing out-of-range body measurements:
create exception e_warning 'warning'; -- Firebird exceptions must be declared; the text can be overridden at raise time

create trigger warning1 for bodymeasures before insert or update as
begin
 if (new.weight_lbs > 9000.0) then
  exception e_warning 'Over nine thousand pounds? Really?';
 if (new.height_feet < 1.0) then
  exception e_warning 'Under a foot? Really?';
end
Figure 1. Triggers raising exceptions
 We've had this trigger in place for a while, to prevent clearly-wrong values. Even a baby's going to be over a foot in length at birth, right? The only problem we've had is that it only returns one error at a time, and it'd be nice to tell the user about everything that's wrong. So although it could be fine to leave this as a hard constraint, we could still benefit from queuing up the errors for display.
create global temporary table warnings (
 warning_name varchar(1000)
) on commit delete rows; -- the default for GTTs; rows live only as long as the transaction

create trigger warning1 for bodymeasures before insert or update as
begin
 if (new.weight_lbs > 9000.0) then
  insert into warnings (warning_name) values ('Over nine thousand pounds? Really?');
 if (new.height_feet < 1.0) then
  insert into warnings (warning_name) values ('Under a foot? Really?');
end

create trigger hard_warnings on transaction commit as
begin
 if (exists(select * from warnings)) then
  exception e_warning 'Those were not mere warnings!';
end

grant insert, select on warnings to trigger warning1;
grant select on warnings to trigger hard_warnings;
Figure 2. Triggers setting warnings
If your database product doesn't support on-commit triggers, you can instead carefully call a stored procedure with the same effect before any attempt to .commit(), though you lose a little bit of the safety net.
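Here's a minimal sketch of that fallback, assuming the objects from figure 2 (the procedure name check_hard_warnings is mine); the application executes it as its last statement before committing:

create procedure check_hard_warnings as
begin
 -- same check the on-commit trigger would have made, but invoked explicitly
 if (exists(select * from warnings)) then
  exception e_warning 'Those were not mere warnings!';
end

grant select on warnings to procedure check_hard_warnings;
grant execute on procedure check_hard_warnings to endusers;

Nothing stops a careless application from forgetting to call it, hence the lost safety net. Now we need to be able to override these things.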
create global temporary table warnings (
 warning_name varchar(1000),
 warning_code varchar(50)
) on commit delete rows;

create global temporary table overrides (
 override_code varchar(50)
) on commit delete rows;

create procedure check_warning (warning_name varchar(1000), warning_code varchar(50)) as
begin
 insert into warnings (warning_name, warning_code) values (:warning_name, :warning_code);
end

create trigger warning1 for bodymeasures before insert or update as
begin
 if (new.weight_lbs > 9000.0) then
  execute procedure check_warning ('Over nine thousand pounds? Really?', 'weight9000');
 if (new.height_feet < 1.0) then
  execute procedure check_warning ('Under a foot? Really?', 'height1');
end

create trigger hard_warnings on transaction commit as
begin
 if (exists(select * from warnings 
            left join overrides on overrides.override_code = warnings.warning_code
            where overrides.override_code is null)) then
  exception e_warning 'Those were not mere warnings!';
end

grant select on warnings to endusers;
grant all on overrides to endusers;
grant insert, select on warnings to procedure check_warning;
grant execute on procedure check_warning to trigger warning1;
grant select on warnings to trigger hard_warnings;
grant select on overrides to trigger hard_warnings;
Figure 3. Overridable warnings
 How would the application use this? When saving changes from the user, it would proceed as usual: perform inserts and updates and deletes as requested. It would then check the warnings table to see if it had tripped any wires. It could then blindly attempt to commit, and if the warning exception were thrown, return to the user the list of warnings it encountered.
Ideally, the user would then check boxes, press buttons, or otherwise acknowledge the various warnings, and try again; this time, the inserts, updates, and deletes performed by the application layer would include inserts into the overrides table, one per warning the user acknowledged. As long as the user acknowledged all the warnings, and the data hasn't changed enough in the meantime to create new problems, the transaction can now commit. (I'm assuming a stateless interaction, as through a website, but a client application could choose to keep the transaction open after it failed to commit the first time, ask the user to acknowledge the warnings, insert the overrides, and then attempt to commit again, without redoing any of the data inserts/updates/deletes, and without losing any locks it may have been holding.)
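As a rough sketch of that flow, assuming bodymeasures has no other required columns:

-- first attempt: save as usual, then peek at what tripped
insert into bodymeasures (weight_lbs, height_feet) values (9500.0, 5.5);
select warning_name, warning_code from warnings; -- 'Over nine thousand pounds? Really?', 'weight9000'
commit; -- fails: there are un-overridden warnings

-- second attempt, after the user acknowledged the warning
insert into bodymeasures (weight_lbs, height_feet) values (9500.0, 5.5);
insert into overrides (override_code) values ('weight9000');
commit; -- succeeds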
I was too lazy in figure 3 to put proper constraints on the warnings and overrides tables; if a trigger were to set a NULL warning_code, that warning would be impossible to override. That might be what you want, if you're just trying to queue up all the errors for display without letting anyone override them, but you'll probably want to be a little more disciplined about it, perhaps properly splitting things up between warning and error tables.
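A minimal sketch of that split, keeping overridable warnings strictly apart from display-only errors:

create global temporary table warnings (
 warning_name varchar(1000) not null,
 warning_code varchar(50) not null -- every warning must carry an override code
) on commit delete rows;

create global temporary table errors (
 error_name varchar(1000) not null -- errors queue up for display but can never be overridden
) on commit delete rows;

The hard_warnings trigger would then fail the commit on any row in errors, in addition to any un-overridden warning.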
 Another upgrade would be the ability to disable the deferred warnings, particularly when running batch jobs that can't afford to wait before failing, or when trying to diagnose exactly where in your application code there's bad data being fed into your database. It's easy enough to find if the exception is thrown immediately, but if it waits until you try to commit, you'll be cursing. A context variable could work nicely:
alter procedure check_warning (warning_name varchar(1000), warning_code varchar(50)) as
begin
 -- if no_defer_warn is set on this transaction, raise immediately instead of deferring
 if (rdb$get_context('USER_TRANSACTION', 'no_defer_warn') is null) then
  insert into warnings (warning_name, warning_code) values (:warning_name, :warning_code);
 else
  exception e_warning warning_name;
end
Figure 4. Adding the ability to disable deferred constraints or warnings
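To flip that switch for the current transaction (rdb$set_context returns a value, so it's called via a select from rdb$database):

-- disable deferral: warnings now raise immediately at the offending statement
select rdb$set_context('USER_TRANSACTION', 'no_defer_warn', '1') from rdb$database;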
 A quick clean-up is in order, too, so you can code for the specific warning-related error in your application code:
create exception e_lacking_override 'There were warnings which you did not override';

alter trigger hard_warnings as -- a database trigger keeps its event; only the body changes
begin
 if (exists(select * from warnings 
            left join overrides on overrides.override_code = warnings.warning_code
            where overrides.override_code is null)) then
  exception e_lacking_override;
end
Figure 5. Using proper exceptions
Another extension would be auditing: every time an override is used, store that somewhere for later examination. It might be good to track both the overrides and the warnings they were used on, particularly to look for applications that like to set overrides that aren't even needed (indicating someone got lazy with their code and tried to run the blockade). I used global temporary tables in my example because they cleanly separate data with a lifetime of the transaction, but nothing prevents you from having a process in the on-commit trigger that matches overrides and warnings up to each other, then copies them into audit tables.
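A sketch of that audit step, assuming a permanent override_audit table (the name and columns are mine) and folding the copy into the on-commit trigger:

create table override_audit (
 audited_at timestamp default current_timestamp,
 audited_by varchar(31) default current_user,
 override_code varchar(50),
 warning_name varchar(1000) -- null means the override matched no warning at all: a blockade-runner
);

alter trigger hard_warnings as
begin
 -- record every override used this transaction, matched to the warning (if any) it suppressed
 insert into override_audit (override_code, warning_name)
  select o.override_code, w.warning_name
  from overrides o
  left join warnings w on w.warning_code = o.override_code;

 if (exists(select * from warnings 
            left join overrides on overrides.override_code = warnings.warning_code
            where overrides.override_code is null)) then
  exception e_lacking_override;
end

grant insert on override_audit to trigger hard_warnings;

Note that if the commit then fails, the audit rows roll back with everything else; if you need to log failed attempts too, an autonomous transaction (Firebird 2.5 and later) could do it.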
 You could add fields to track the number of times a warning was raised, and force the user to check the box for "do this for the next 1,147 warnings". You could generate different warning codes by combining data from the rows themselves into the codes, something like 'over9000.' || new.id, thereby preventing applications from blindly overriding. They'd at least have to select out the warnings and push them back in. This works if you're using deterministic ID fields (e.g. your application layer prefetches row numbers from a sequence or generator, and will re-use the same ID when re-attempting the insert/update later) or if your application layer keeps the transaction alive while asking about overrides; you could base the code on non-increment fields, too, as long as they're stable (nothing based on CURRENT_TIMESTAMP, session ID, transaction ID, or any other nondeterministic expression) and you take into account any NULLs that may be present and allowed.
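A sketch of the row-specific variant of warning1, reusing check_warning from figure 3 (the second code prefix is mine):

create or alter trigger warning1 for bodymeasures before insert or update as
begin
 -- embed the row's identity in the code, so an override covers only this one row
 if (new.weight_lbs > 9000.0) then
  execute procedure check_warning ('Over nine thousand pounds? Really?', 'over9000.' || new.id);
 if (new.height_feet < 1.0) then
  execute procedure check_warning ('Under a foot? Really?', 'under1.' || new.id);
end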
 From there, it gets wackier if you want to make sure the end-user actually saw and responded to the warnings, because the database administrator doesn't trust the application programmer: now you need private/public key encryption, the database has to store the users' public keys, users have to encrypt the warning codes with their private keys before responding, and the application layer itself does little more than shuttle data back and forth, unable to short-circuit the process. But that's probably complete overkill. Not to speak of what the users will do to you when you announce your plan!
This solution still won't give you deferrable constraints in a database that doesn't support them (foreign key constraints, in particular). And because it requires a round-trip to the database to discover problems, it won't give you in-page warnings as the user types. But that wasn't the point, was it? For some of these warnings you might need heavy processing power anyway: checking that a patient's weight is within 2 standard deviations of the mean (to catch silly data-entry mistakes), checking that a password doesn't contain any dictionary words, checking for soft duplicates where a unique constraint would be overkill, etc.
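For instance, a sketch of the two-standard-deviations check (computed by hand, since Firebird of this vintage has no stddev aggregate; the full-table select on every insert is naive, but it illustrates the cost):

create or alter trigger warning2 for bodymeasures before insert or update as
declare variable mean double precision;
declare variable stddev double precision;
begin
 -- stddev = sqrt(E[X^2] - E[X]^2)
 select avg(weight_lbs),
        sqrt(avg(weight_lbs * weight_lbs) - avg(weight_lbs) * avg(weight_lbs))
  from bodymeasures
  into :mean, :stddev;
 if (abs(new.weight_lbs - mean) > 2 * stddev) then
  execute procedure check_warning ('That weight is unusual. Data-entry mistake?', 'weight2sd.' || new.id);
end

grant select on bodymeasures to trigger warning2;
grant execute on procedure check_warning to trigger warning2;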
Owned by Unordained - Created on 08/04/2008 - Last edited on 04/25/2011