Proper Use of Check Constraints and Enums
Something that comes up from time to time is the proper use of Oracle column check constraints or MySQL enum type columns. Typically the question is when it’s appropriate to use those features to enforce data integrity as opposed to having foreign keys to a lookup table.
For example, here is a table that uses MySQL enumeration values to enforce that the the log message level is either “debug”, “warning”, or “error”:
CREATE TABLE log_messages
(id INTEGER NOT NULL
,message_level ENUM(‘debug’, ‘warning’, ‘error’)
,message VARCHAR(255)
,create_ts TIMESTAMP default sysdate() NOT NULL
,update_ts TIMESTAMP)
The first consideration is whether the enumeration of message types is locked down to those three for the lifetime of the application. In the above case, that seems perfectly within the bounds of solid design practices. There might be one or two conceivable states added, but that’s what keeps developers and DBAs employed. On the other hand, if the list of types will grow gradually there’s a problem, as in the below table:
CREATE TABLE log_messages
(id INTEGER NOT NULL
,message_level ENUM(‘debug’, ‘warning’, ‘error’)
,job_name enum(‘send_emails’, ‘clean_orders’, ‘calc_revenue’)
,message VARCHAR(255)
,create_ts TIMESTAMP default sysdate() NOT NULL
,update_ts TIMESTAMP)
It should be clear that the list of jobs will gradually grow to include ones that you couldn’t conceive of when you designed the table. It’s certainly quicker to use an enumeration here, but in the long run it’s going to cause headaches. The preferred redesign involves 2 tables:
CREATE TABLE log_messages
(id INTEGER NOT NULL
,message_level ENUM(‘debug’, ‘warning’, ‘error’)
,job_id INTEGER NOT NULL
,message VARCHAR(255)
,create_ts TIMESTAMP default sysdate() NOT NULL
,update_ts TIMESTAMP)
CREATE TABLE jobs
(id INTEGER NOT NULL
,name VARCHAR(20)
,create_ts TIMESTAMP default sysdate() NOT NULL
,update_ts TIMESTAMP)
Where log_messages.job_id = jobs.id. Of course, this points out one more problem with the overuse of check constraints or enumerations. What happens if you want to reuse that column value in another table as a foreign key? For instance, you might have another table that stores the total run time for each job. You would have to add the same constraint/enum on that other table, and make sure they stay in sync as the system evolves. This duplication should raise the hackles of any experienced developer.
So the rule of thumb is that it should be used sparingly, in cases where the enumeration values are unlikely to change frequently, or in places where you are sure you’ll never need to reuse the values as foreign keys in other tables. One of my favorite places to use this is for boolean yes/no columns such as:
CREATE TABLE admins
(id INTEGER
,name VARCHAR(255)
,is_active ENUM(‘y’, ‘n’) default ‘y’ NOT NULL
,create_ts TIMESTAMP default sysdate() NOT NULL
,update_ts TIMESTAMP)
Obviously there would be little reason to extend the ‘y’ and ‘n’ values. And there would not be a meaningful reason to externalize those values so they could be reused in other tables.
Something to Consider
Keep in mind that the MySQL enum type has some other features that might come in handy, such as the ability to index each enumeration value using a number. That may also influence your decision to use it in places where you otherwise shouldn’t depending on the application
design.