work_mem = 1024000              # min 64, size in KB
maintenance_work_mem = 1024000  # min 1024, size in KB
fsync = off                     # turns forced synchronization on or off
wal_buffers = 1024              # min 4, 8KB each
checkpoint_segments = 300       # in logfile segments, min 1, 16MB each
checkpoint_timeout = 3000       # range 30-3600, in seconds
log_destination = 'syslog'      # Valid values are combinations of
redirect_stderr = off           # Enable capturing of stderr into log

This got it down to about 10 hours to load a 30GB database, which quite frankly seemed awfully slow. More to the point, it was beyond the amount of downtime we could accept for this app, especially considering we still had to allow time to dump out the old database. We actually considered using Slony at one point, but that seemed like a dubious path for this system, since it would require upgrading the database (7.3.4 on Red Hat 7.3) via (questionably available) RPMs and then compiling Slony itself on that server to install it (tricky, since it has no C compiler). So rather than open up that can of worms, I decided to sniff around the docs for whatever I had overlooked, and finally a couple of things jumped out at me:

shared_buffers = 50000          # min 16 or max_connections*2, 8KB each
max_prepared_transactions = 0   # can be 0 or more
stats_start_collector = off

Hard to believe I overlooked shared_buffers, which is the standard first setting to change for good performance. The stats collector was at least counter-intuitive, since I normally run with that setting on in production. It's loading up now, so I don't have final numbers, but this should get it down to under 2 hours, which will fit into our downtime window. Yay! If you're doing database loads, I'd suggest changing these settings first. Hope this helps someone; if you have any other suggestions, please post them.
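For what it's worth, here's a rough sketch of how the load-window switch could be scripted with sed. The file contents, setting values, and the idea of working on a saved copy are from above; the demo edits a scratch copy of postgresql.conf rather than a live one, and the restart/load steps are left as comments since paths and commands will vary by install.

```shell
# Work on a scratch copy; on a real server you'd save a copy of
# $PGDATA/postgresql.conf first so you can restore it afterwards.
conf=$(mktemp)
cat > "$conf" <<'EOF'
fsync = on
#maintenance_work_mem = 16384
#checkpoint_segments = 3
stats_start_collector = on
EOF

# Flip the bulk-load overrides in place; the leading '#\?' also
# uncomments settings still at their commented-out defaults.
sed -i \
    -e 's/^#\?fsync *=.*/fsync = off/' \
    -e 's/^#\?maintenance_work_mem *=.*/maintenance_work_mem = 1024000/' \
    -e 's/^#\?checkpoint_segments *=.*/checkpoint_segments = 300/' \
    -e 's/^#\?stats_start_collector *=.*/stats_start_collector = off/' \
    "$conf"

cat "$conf"
# On the real box: restart postgres (e.g. pg_ctl restart), run the load
# with psql, then put the saved production conf back and restart again.
```

The point of keeping it to a handful of sed lines is that the same script, run against the saved production copy, undoes everything once the load finishes.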