* Need a working PostgreSQL setup with archive_mode = on. We’re using 8.3.1 on Solaris 10 / u4, all on top of [http://en.wikipedia.org/wiki/ZFS ZFS].
* Issue a pg_start_backup() in postgres
* Do a zfs snapshot of each filesystem (all tablespaces, and be sure to include all of $PGDATA, including all xlogs)
* Issue a pg_stop_backup() in postgres
* Ship the snapshots over to a backup machine (ideally somewhere remote, possibly on tape)
* Wait 3 days for the last step to finish :-)
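The middle three steps (start backup, snapshot, stop backup) could be scripted roughly like this. It is only a minimal sketch, not our actual tooling: the function name `snapshot_backup`, the injectable `runner` argument, and the `tank/...` filesystem names are made up for illustration, and it assumes `psql` can connect using the usual environment defaults (PGUSER, PGDATABASE and friends) as a user allowed to call pg_start_backup().

```python
import re
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], None]


def run(cmd: Sequence[str]) -> None:
    """Run one command and raise CalledProcessError if it fails."""
    subprocess.run(list(cmd), check=True)


def snapshot_backup(filesystems: List[str], label: str,
                    runner: Runner = run) -> List[str]:
    """Snapshot each ZFS filesystem between pg_start_backup() and
    pg_stop_backup(), returning the names of the snapshots created.

    `filesystems` must cover all of $PGDATA (including the xlogs) and
    every tablespace; this sketch cannot check that for you.
    """
    if not filesystems:
        raise ValueError("no filesystems given")
    # Keep the label to characters that are safe both inside the SQL
    # string literal and as a ZFS snapshot name (an assumed restriction).
    if not re.fullmatch(r"[A-Za-z0-9_.-]+", label):
        raise ValueError("label may only contain letters, digits, '_', '.', '-'")

    snapshots: List[str] = []
    runner(["psql", "-c", f"SELECT pg_start_backup('{label}')"])
    try:
        for fs in filesystems:
            snap = f"{fs}@{label}"
            runner(["zfs", "snapshot", snap])
            snapshots.append(snap)
    finally:
        # Always take the server out of backup mode, even if a snapshot failed.
        runner(["psql", "-c", "SELECT pg_stop_backup()"])
    return snapshots
```

Called as, say, `snapshot_backup(["tank/pgdata", "tank/xlog"], "nightly")`, it hands back the snapshot names for the shipping step. The `try`/`finally` is there so a failed snapshot doesn't leave postgres sitting in backup mode. Shipping is left out on purpose; `zfs send` piped into `zfs receive` on the remote box is one way to do it.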
I had mentioned our TB+ disaster recovery scheme to a few people at the PG-East conference last week, with hopes that we would be doing a full-on recovery test in early April. Lucky for us, we’ve been able to do a rough run-through, so I wanted to report some results. First, a quick recap of why most of the common [http://www.postgresql.org/docs/current/interactive/backup.html backup solutions] suck for our needs:

* pg_dump is pretty much a joke with 1TB+ of data, especially on our system, which has constant data churn and enough mutating schema to make getting a consistent snapshot unworkable.
* PITR would be nice for failover, but it isn’t a real disaster recovery system. The key problems are corrupted xlogs making their way to the slave, or data corruption getting propagated into your “backup”. If you don’t have a static snapshot, you can hose yourself in some un-fun ways.
* slony (or bucardo, or other replication systems) also suffers from data corruption getting propagated onto your slaves, with no method to get back to a legitimate copy of your work. Again, this is fine when trying to solve failover, but not always the right answer for backups.

So, what we need is to make a copy of the database and stick it someplace safe and secure, so that in case something goes horribly wrong (for **really** scary versions of horribly), we can get back to data that we know is good. The basic scheme goes like this: