Understanding Postgres Durability Options

Most people tend to think of Postgres as a very conservative piece of software, one designed to “Not Lose Your Data”. This reputation is probably warranted, but the other side of that coin is that Postgres also suffers when it comes to performance because it chooses to be safe with your data out of the box. While a lot of systems tend to side towards being “fast by default”, and leaving durability needs as an exercise to the user, the Postgres community takes the opposite approach. I think I heard it put once as “We care more about your data than our benchmarks”. That said, Postgres does have several options that can be used for performance gains in the face of durability tradeoffs. While only you can know the right mix for your particular data needs, it’s worth reviewing and understanding the options available. “by default” - OK, this isn’t a real setting, but you should understand that, by default, Postgres will work to ensure full ACID guarantees, and more specifically that any data that is part of a COMMIT is immediately synched to disk. This is of course the slowest option you can chose, but given it’s also a popular code path the postgres devs have worked hard to optimize this scenario. “synchronous commits” - By default synchronous_commit is turned on, meaning all commits are fsyncd to disk as they happen. The first trade-off of durability for performance should start here. Turning off synchronous commits introduces a window between when the client is notified of commit success, and when the data is truly pushed to disk. In affect, it let’s the database cheat a little. The key to this parameter is that, while you might introduce data loss, you would never introduce data corruption. Since it tends to produce significantly faster operations for write based workloads, many people find that is a durability tradeoff they are willing to make. As an added bonus, if you think that most of your code could take advantage of this but you have some certain part of your system that you can’t afford the tradeoff, this setting can be set per transaction, so you can ensure durability in the specific cases where you need it. That level of fined grained control is pretty awesome. “delayed commits” - Similar sounding in theory to synchronous_commit, the settings for “commit_siblings” and “commit_delay” try to provide “grouped commits”, meaning multiple transactions are committed with a single fsync() call. While this certainly has the possibility of increasing performance in a heavily loaded system, when the system is not loaded these will actually slow down commits, and that overall lack of granularity compared to synchronous_commit usually means you should favor turning off synchronous_commit and bypass these settings when trading off durability for performance. “non-synching” - Fsync was the original parameter for durability vs performance tradeoffs, and it can still be useful in some environments today. When turned off, postgres throws out all logic of synchronizing write activity with command input. This does mean that running in this mode, in the event of hardware or server failure, you can end up with corrupt, not just missing, but corrupt data. In many cases this might not happen, or might happen in an area that does matter (say a corrupt index, that you can just REINDEX), but it could also happen within a system catalog, which can be disastrous. This leads many a Postgres DBA to tell you to never turn this off, but I’d say ignore that advice and evaluate things based on the tradeoffs of durability vs performance that are right for you. Consider this; if you have a standby set up (WAL based, Slony, Bucardo, etc…), and you are designing for a low MTTR, chances are in most cases hardware failure on the primary will lead to a near immediate switch to the standby anyway, so a corrupt database that you have already moved beyond will be irrelevant to your operations. This assumes that you can afford to lose some data, but if you are using asynchronous replication, you’ve already come to that conclusion. Of course, you are giving up single node durability, which might not be worth the tradeoffs in performance, especially since you can get most of the performance improvements with turning off synchronous_commits. In some situations you might fly in the face of conventional wisdom and turn off fsync in production, but leave it on in development; imagine an architecture where you’ve built redundancy on top of ec2 (so a server crash means a lost node), but you are developing on a desktop machine where you don’t want to have to rebuild in the case of a power failure, and don’t want to run multiple nodes. Life is a series of tradeoffs between cost and efficiency, and Postgres tries to give you the flexibility you need to adjust to fit your particular situation. If you are setting up a new system, take a moment to think about the needs of your data. And before you replace Postgres with a new system, verify what durability guarantees that new system is giving you; it might be easier to set Postgres to something comparable. If you are trying to find the right balance on your own situation, please feel free to post your situation in the comments, and I’ll be happy to try to address it.