PostgreSQL 8.3 Features: Enum Datatype

I’ve never been a big fan of enum types in databases. In my experience, they are often used in places where other datatypes are more appropriate (true/false text vs. boolean), or used in ways that a better served with either a more relational design (a list of options that should be in a lookup table, so that additions are dml operations, not ddl operations), or a more object-relational design, such as using a domain, so that the constraints are declared in a single place and used consistently throught the database. So when the enum type feature was discussed for 8.3, I didn’t get too excited, but I also didn’t object because I knew there was at least 1 use case that it would solve that isn’t handled well by the other methods available. In the old version of the pagila (the sample dvd store database), the film table had a text column name rating, used to hold the ratings for all of the movies in our store. Since movie ratings are a well defined list, we used a text constrint to enforece that any entries into the table were legit.
pagila=# d film.rating Table "public.film" Column | Type | Modifiers --------+------+------------------ rating | text | default 'G'::text Check constraints: "film_rating_check" CHECK (rating = 'G'::text OR rating = 'PG'::text OR rating = 'PG-13'::text OR rating = 'R'::text OR rating = 'NC-17'::text)
This worked fine for preserving data integrity, and if that was all that was needed I’d have no problem using this in a schema. Of course, if this definition were going to be re-used a lot, we’d want to make it a first class object, specifically a domain defined by the same constraint. That too would be a perfectly valid way to handle this information; one I’d certainly have no quarrel with. But of course this is a blog post about 8.3 and enums, so I’m going to do it with enums and show you why this is even niftier [[image /xzilla/templates/default/img/emoticons/smile.png alt=”:-)” style=”display: inline; vertical-align: bottom;” class=”emoticon” /]]. Step 1 in using any type of enum is creating a custom type with an enum definition. Slightly different than how some other databases implement enums, where enums are often defined “on-the-fly” during table creation, in PostgreSQL you define them as first-class types. The syntax is pretty straightforward:
pagila=# create type mpaa_rating as enum('G','PG','PG-13','R','NC-17'); CREATE TYPE
Once you have this setup, you can use your enum type just like any other type (text, integer, boolean, etc…). In this case, we’ll use it to define our rating column in our film table:
pagila=# d film.rating Table "public.film" Column | Type | Modifiers --------+-------------+------------------------- rating | mpaa_rating | default 'G'::mpaa_rating
And just for good measure, we toss in a default value, specifying with a cast that it is a value of our mpaa_rating type. Of course, the beauty of this isn’t really evident until you start playing with the data. Remember the use case I hinted at above? What I was reffering to was enum’s ability to use the order of the type definition as a custom ordering for our enum type. For example, this allows us to do the following:
pagila=# select rating, count(*) from film group by rating order by rating; rating | count --------+------- G | 178 PG | 194 PG-13 | 223 R | 195 NC-17 | 210 (5 rows)
Compare that to if we were using text fields:
pagila=# select rating, count(*) from film group by rating order by rating::text; rating | count --------+------- G | 178 NC-17 | 210 PG | 194 PG-13 | 223 R | 195 (5 rows)
For movie ratings, the custom order we have defined has a direct correlation to the data in the column itself, and the enum type allows us to capture this meta information inside your database. To be clear (since I know other database behave less consistently in this area), this works not just for ordering, but in all levels of comparisons and operators. For example, if I want to know how many movies the store has that my kids can watch, I can craft an appropriate query like so:
pagila=# select count(*) from film where rating < 'PG-13'; count ------- 372 (1 row)
Nice eh? So remember, if you need to keep track of true and false, use a boolean! But if you have a use case where you need to define a fairly static list of items, and custom ordering would help you capture the true meaning of your data, 8.3’s enum types are certainly worth checking out.