When you work at a [http://www.omniti.com/ technology consulting firm], you're eventually going to be involved in fixing someone's broken database. Disaster recovery is one of those things that gets easier with experience, but of course if you're doing things correctly, those chances don't come up that often. ("Luckily" the state of commodity hardware still gives us some opportunities…)
One way to gain experience in this area is to abuse the hell out of your development systems. That's not to say you should break things on purpose, but occasionally mucking around with the underside of things can be an interesting experience. Of course, sometimes abusing your database can cause you trouble…
We have a development server where we've been playing with [https://labs.omniti.com/trac/reconnoiter/ reconnoiter], churning up lots and lots of data. Well, after ignoring the box a little bit last week, we managed to fill up the machine and run it completely out of disk space, crashing our database and making it impossible to restart… "whoops".
What we were really aiming for was to get the database running, delete some data, and then go back to ignoring the machine :-) (it is a dev box, after all). What made this particular instance more interesting is that this was in a ZFS zone with an underlying snapshot in place, which meant that the few attempts we made at removing logfiles and such ended up being pretty fruitless. The other half of that was that we didn't really have much on the machine that was safe to delete; pretty much everything was tied up in the database.
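(As an aside, if you find yourself in the same boat, the space that deleted files *should* have freed is usually being pinned by the snapshot; something like this will show you what's holding it, and destroying the snapshot, if you can live without it, actually gives the space back. The dataset and snapshot names below are made up.)

bash-3.00$ zfs list -t snapshot
bash-3.00$ zfs destroy data/pgdata@old-snapshot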
Not wanting to struggle through finding appropriate files to delete, I came up with the brilliant idea that the way to free up space would be to just delete the template1 database on disk. The idea here is really simple… you remove the files on disk, which frees up enough space to start the database; you can then log in, drop template1, recreate it using template0, and you're back in business. In most cases this is a one shot deal, because when a database cluster is first initialized, the template0 and template1 databases always get the same OIDs; once you drop and recreate a template database, the OID changes, and unless you've mapped it out ahead of time you don't really know which db maps to which OID. For example, most $PGDATA directories would look something like this:
bash-3.00$ du -hs /pgdata/main/base/*
4.8M /pgdata/main/base/1
4.7M /pgdata/main/base/11510
4.9M /pgdata/main/base/11511
4.0G /pgdata/main/base/58326
3K /pgdata/main/base/pgsql_tmp
The numbers at the end of the directories map to the OIDs of specific databases. The first two are the template databases, 11511 will map to the postgres database in any cluster (provided it's 8.3 and you haven't dropped/recreated it), and higher OIDs map to other databases you've created. So, feeling good about myself, I went ahead and did rm -rf on /pgdata/main/base/11510, started up the database, freed up a bunch of space, and then proceeded to drop template1, recreate it from template0, and went on about my business. Or so I thought…
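For reference, that drop/recreate dance looks roughly like this (run from the postgres database, since you can't drop the database you're connected to, and a template database has to be un-flagged before PostgreSQL will let you drop it):

bash-3.00$ psql postgres
postgres=# update pg_database set datistemplate = false where datname = 'template1';
postgres=# drop database template1;
postgres=# create database template1 with template template0;
postgres=# update pg_database set datistemplate = true where datname = 'template1';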
The funny thing about the above is that, as it turns out, OID 1 actually maps to template1 in 8.3, and OID 11510 maps to template0. Granted, this doesn't match the idea that template0 is created first and template1 is made from that, but that's really how it is. This means that when I did the rm -rf above, I really wiped out template0, not template1. And when I then dropped template1, I was actually dropping the intact template database (doh!). Surprisingly, when I did my create database template1 with template template0, the create database command never complained that it didn't actually have any files to copy… I guess this isn't a scenario people find themselves in much. :-) Note for the future: if your $PGDATA ever looks like this, you're in trouble:
bash-3.00$ du -hs /data/main/base/*
5.7M /data/main/base/11511
24G /data/main/base/16385
4.9M /data/main/base/17008
4.8M /data/main/base/18219
4.7M /data/main/base/33439
1K /data/main/base/pgsql_tmp
bash-3.00$ psql -c "select oid, datname from pg_database"
oid | datname
-------+-------------
11510 | template0
11511 | postgres
16385 | reconnoiter
17008 | pagila
49797 | template1
18219 | bbpg
33439 | mysqlcompat
(7 rows)
When you have fewer database directories than you have entries in pg_database, that's a problem. When the numbers in your system catalogs don't match up with those on disk, that, too, is a problem.
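If you want to save yourself that guesswork in the future, it's cheap to grab the mapping while everything is healthy and stash it somewhere outside of $PGDATA (the output path here is just a placeholder):

bash-3.00$ psql -Atc "select oid, datname from pg_database order by oid" > /somewhere/safe/oid-map.txt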
The right answer here is to run initdb, dump out the databases you care about, and import them into your fresh, clean, working database cluster. But of course, if I chose the right answer all the time, where would I get all these entertaining blog posts? Besides, what are dev machines for, if not for abusing?
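For the record, that saner path would look something along these lines (paths, ports, and database names are just placeholders):

bash-3.00$ initdb -D /data/newcluster
bash-3.00$ pg_ctl -D /data/newcluster -o "-p 5433" start
bash-3.00$ pg_dump reconnoiter > /data/dumps/reconnoiter.dump
bash-3.00$ createdb -p 5433 reconnoiter
bash-3.00$ psql -p 5433 reconnoiter < /data/dumps/reconnoiter.dump

…repeating the dump/restore for each database you actually care about, then pointing everything at the new cluster and throwing the old one away.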
So, what I did instead was run initdb in a spare directory, and then just did the following:
bash-3.00$ cp -r /data/spare/base/11510/ /data/main/base/
bash-3.00$ cp -r /data/spare/base/11510/ /data/main/base/49797/
bash-3.00$ du -hs /data/main/base/*
4.7M /data/main/base/11510
5.7M /data/main/base/11511
25G /data/main/base/16385
4.9M /data/main/base/17008
4.8M /data/main/base/18219
4.7M /data/main/base/33439
4.7M /data/main/base/49797
1K /data/main/base/pgsql_tmp
Once I had the files in place, I was able to connect to the databases, run vacuums, and create new databases using template1, so it would seem everything is back to normal. (For now at least ;-) )
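(In case it's useful, the sanity checking amounted to roughly this; the throwaway database name is made up.)

bash-3.00$ vacuumdb -a
bash-3.00$ createdb -T template1 sanity_check
bash-3.00$ dropdb sanity_check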