Slides for Big Bad Upgraded Postgres Talk

Howdy folks! I finally got the slides up for the “Big Bad `Upgraded` Postgres” talk I gave at PGCon 2012 (and previously at PGDay DC). The talk walks through a multi-terabyte database upgrade project and discusses many of the problems and delays we encountered, both technical and non-technical. I think the slides stand up pretty well by themselves, but you can also find additional info on my co-worker Keith’s blog, where he has chronicled some of the fun times we’ve had along the way. He also has some posts on the benefits we’ve seen since upgrading. Anyway, the slides are on my Slideshare page; please have a look.

I Built a Node Site

Two weekends ago I was in need of a website. The local Postgres Users Group is putting on a one-day mini-conference (featuring some of the best speakers you can get, I might add; you should probably go) and we wanted to put up a site with information on the conference. We didn’t need anything fancy, just some static pages with basic info. We also don’t really have any money, so I wanted something simple that I could toss on-line and have hosted for free, with the caveat that I wanted something I could code (i.e. not a WYSIWYG template thing), because I have some predefined Postgres-related graphics and CSS I wanted to re-use.

After browsing around a little I ran across an interesting service I almost used called Static Cloud, which is designed to store HTML, CSS, and JavaScript files on-line. This seemed fine for such a simple site, but when I started tossing together the HTML, I realized I did have some shared content (header and footer type stuff) that I wanted to repeat across pages. There’s probably a way to do this with static files, but it took me out of my comfort zone, so I decided I should use a scripting language to do my dirty work. I looked at the various PHP, Ruby, and Python offerings, but sadly nothing seemed to fit what I wanted, mainly on account of them not being free. Then I stumbled upon Nodester, a node.js based hosting service which allows you to host node based apps on their servers for free. How friendly!

Now, I’ve looked at node before, probably 6+ months ago, and thought it was interesting, but didn’t really have too much use for it at the time. Since then OmniTI has used it for a couple of projects, including one recent project (still ongoing, actually) where we built a hefty section of the back-end for a large, asynchronous services system. And we did it in node.js. So, having seen some of that work, I thought why not give node.js another go. So, I built a site. It’s not fancy. It’s half a dozen pages that don’t need to do much.
Some files get processed, some pages get displayed. I mostly mention it here because when I started putting it together, I couldn’t actually find anything like this: a complete site that was more than just the most trivial example of how to plumb things together. This doesn’t go much beyond that, but if you are getting your feet wet with node, being able to check this site out, run “node services.js”, and have a real working site to look at, one where you can easily add or modify pages, might be handy. It also gives me a chance to write down a bunch of links I found useful so I can refer back to them. For starters, the code is on my github. (Yes, I should replumb the routes.)

I mentioned I used Nodester, so the first thing to check out is the Nodester page, which has a demo about having your app up and running in one minute. I hate that kind of demo, but it is really freakin’ easy. Here’s another link for wiring up your domain with Nodester. This was something I wanted, and FYI it also works fine for subdomains. Now, I have to give a warning about Nodester. They’ve been having service problems lately (obligatory monitoring graph here), and while they are responsive on Twitter, they aren’t proactive. If I were just doing occasional demos of my app for people, I’d still use them, but I needed the site to stay up, and I work at a company with massive hosting capabilities, so I did move the site. Sorry, Nodester. I did leave a copy of the app running there though.

The site itself is written in node, yes, but makes use of two npm modules, Express and Jade. (Minor note: I hit the “node env” error, in case you see it.) These seem to be the de facto web framework / template stack for node stuff, and they work well enough. Here’s the link on wiring up Express apps on Nodester. I also made use of this Express tutorial from the guys at Nodetuts.
I don’t think I actually watched the whole thing, but it was handy for getting me over the hump on a couple of things. For the Jade stuff, I mostly used the docs and some googling (which tended to end with questions on Stack Overflow). To be honest, I was tempted to scrap Jade and just use straight HTML, but in the end Jade seemed efficient enough to be worth the bother.
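To give a flavor of the problem that pushed me past static hosting, here’s a minimal sketch (not the actual services.js; the page names and strings are made up) of wrapping shared header/footer content around per-page bodies, the thing Express and Jade handle properly with layouts and templates:

```javascript
// Minimal hand-rolled sketch of shared header/footer around per-page
// content. (Hypothetical names; the real site uses Express + Jade.)
const header = '<header>PG Mini-Conference</header>';
const footer = '<footer>brought to you by the local PUG</footer>';

// Hypothetical page table; the real site has about half a dozen pages.
const pages = {
  '/': '<h1>Welcome</h1>',
  '/speakers': '<h1>Speakers</h1>',
};

function render(path) {
  const body = pages[path] || '<h1>Not Found</h1>';
  return '<html><body>' + header + body + footer + '</body></html>';
}

// To actually serve it:
// require('http').createServer((req, res) => {
//   res.end(render(req.url));
// }).listen(3000);
```

With Express you get the same effect from a shared layout template plus one route per page, which is what the real code does.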

Interest-Free (Technical) Debt Is Risky

Earlier today I read a post from Javier Salado that asked the question “If the interest rate is 0%, do you want to pay back your debt?”. In this case Javier was referring to technical debt, but I felt like the conclusion he reached rested on the same misunderstanding that people apply to regular debt. Let me back up a bit. In his post, Javier lays out the following scenario:

“Imagine you convince a bank (not likely) to grant you a loan with 0% interest rate until the end of time, would you pay back? I wouldn’t. It’s free money. Who doesn’t like free money?”

He then goes on to apply this thinking to technical debt.

“You have an application with, let’s say, $1,000,000 measured technical debt. It was developed 10 years ago when your organization didn’t have a fixed quality model nor coding standards for the particular technologies involved, hence the debt. Overtime, the application has been steadily provided useful functionality to users and what they have to say about it is mainly good. You have adapted to your organization’s new quality process, the maintenance cost is reasonable and any changes you have to make have an expected time-to-market that allows business growth. We could say the interest rate on your debt is close to 0%, why should I invest in reducing the debt?”

I think the answer to both questions is yes, and he makes the same mistake a lot of people do when it comes to taking on debt (technical or otherwise): calculating the cost of debt cannot be based on the interest rate alone; you must also factor in risk. In financial transactions, even a debt with 0% interest likely has some form of payment terms and collateral. (One might argue that Javier really meant a loan from a bank that was 0% interest, required no collateral, and had no terms for repayment. I’d argue that’s a gift, not a loan.) It turns out 0% interest loans aren’t just make-believe. A simple, real-world example is a 0% interest car loan. While this looks great from an interest point of view, it’s not so good from a risk assessment point of view: if you get into an accident, you now owe a bunch of money and no longer have the collateral to pay it off. It’s a double whammy if you figure you might also have to deal with fallout from the accident itself.

So the question is, does risk assessment carry over to the technical debt metaphor? I believe it does. In most cases technical debt lives in legacy code, which means the people who can work on it are all folks who have been around a long time. Rather than teach new people how to develop on the legacy system, you typically just have the “old timers” deal with it when needed. But of course, as time goes by, you probably have fewer and fewer people who can serve in this role. This is a risk. You also have to be aware that, while you carry that large amount of managed technical debt, it’s always possible that some new, unforeseen event will change the dynamics: perhaps a large client or market opens up to you, or a merger with another company is proposed. You now have to re-evaluate your technical situation, and in many cases that technical debt may come back to bite you.

In the end, I don’t think Javier was way off base with his recommendation, which was essentially to follow Elizabeth Naramore’s “D.E.B.T.” system (pdf/slides): measure your debt, then decide how and what needs to be paid off. But it’s important to remember that once you have identified your debt, even if the “interest” on it is low, it still represents risk within your organization (or your personal finances), and you would do best to eliminate as much of it as you can.

Monitoring for “E-Tailers”

With record online traffic and the holiday rush upon us, as people scramble to get their gifts ordered and shipped before time runs out, I recently wrote a piece for Retail Info Sys News on best practices for monitoring web operations during the holidays. The folks at Circonus asked me to expand on that, which I did in a guest post on the Circonus blog. If you run web operations, do e-commerce, or are just wondering what goes on behind the scenes, I’d encourage you to check it out.

Cloudy With a Chance of Scale

Recently I met with a company looking for some long-term advice on building out their database infrastructure. They had a pretty good mix of scaling vertically for the overall architecture while scaling horizontally by segmenting customers into their own schemas. They had a failover server in place, but as the business grew they were looking at ways to future-proof operations against growth and build more redundancy into the system, including multi-datacenter redundancy. After talking with them for a bit, I drew up a radical solution: “To The Cloud!”

I think I am generally considered a cloud skeptic. Most of how we are taught to scale systems and databases from a technical standpoint doesn’t work well in the cloud. I mean, if you have a good methodology for problem solving you can make a lot of improvements in any environment; we’ve certainly seen that with customers we’ve worked with at OmniTI. But if you are just looking at low-level numbers, or optimizing performance around disk I/O (generally the most common problem in databases), those methods just aren’t going to be as effective in the cloud. That’s not to say the cloud can’t work: if you are willing to embrace some of the properties that make for successful cloud operations, I think it can be a pretty successful strategy. One of the key factors I often see overlooked in “will the cloud work for me” discussions is whether or not your business lends itself to the way cloud operations work. In the case of this particular client, it’s a really good match. First, this company already segments their customer data, so there is a natural way to split up the database and operations. Second, they don’t do any significant amount of cross-customer queries, which means they don’t have to re-engineer those bits to make the switch.
Further, the customers have different dataset sizes, different access patterns, and different operational needs, and most importantly, they pay different rates based on desired levels of service. This matches up extremely well with a service like Heroku Postgres. Imagine that, instead of buying that next bigger server, instead of setting up cross-data-center WAL shipping, instead of buying machines in a different colo somewhere across the country, they could buy individual database servers, sized according to customer data size and performance needs. For smaller customers you start with minimal resources, and as the customer grows, you dial up the instance size. Furthermore, you get automated failover setups, and can also easily store backups in a different datacenter based on region. You can even work to match customers to different availability zones based on their users’ endpoints. And if you want to do performance testing or development work, you can create copies of the production databases and hack away. These are the kinds of services OmniTI has built on top of Solaris, Zones, and ZFS, and believe me, they will change the way you think about database operations.

Of course, it’s not all ponies and rainbows. You still have to move clients onto the new infrastructure, though that should be pretty manageable. You’d also need to build out some infrastructure for monitoring, and you’ll need to be able to juggle operational changes. Some of this is not significantly different; pushing DDL changes across schemas is pretty similar to doing it across servers, but you’ll probably want to create some toolsets around it. Also, you’re less likely to bear fruit from micro-optimizations; that doesn’t mean you throw away your pgfouine reports, but the return on performance improvements and query optimization will be much lower.
That said, if you can get good enough performance for your largest customers (and remember, you’ll have easy ways to distribute read loads), you end up with an extremely scalable system, not just technically, but from a business standpoint as well. If you aren’t building on top of Heroku’s Postgres service, the numbers will probably look different, but the idea that you’ve matched your infrastructure capabilities to a significant range of possible growth patterns should be compelling for both the suits and the people who maintain the systems.

Checkpoints, Buffers, and Graphs

Last night at BWPUG, Greg Smith gave his talk on “Managing High Volume Writes with Postgres”, which dives deep into the intersection of checkpoint behavior and shared buffers, and also into dealing with vacuum. One of the things I always like about Greg’s talks is that they’re a good way to measure what we’ve learned between reading code and running large-scale, highly loaded systems in the wild. Even in the cases where we disagree, it’s good to get a different point of view on things. If you manage Postgres systems and get the chance to see this talk, it’s worth taking a look (and I suspect he’ll post the slides somewhere this week, if they aren’t already available).

One of the other cool things that came out of the talk was one of the guys on my team again validating why we love working with Circonus. We have an unofficial slogan that with Circonus, “if you can write a query, you can make a graph”. Well, Keith noticed that we didn’t have any monitoring for the background writer info on one of our recently upgraded (8.3 to 9.1) multi-TB Postgres systems, so he jumped into Circonus, and just like that we had metrics and a graph faster than Greg could move off the slide. This will be awesome once we accumulate some more data, but here’s a screenshot I took last night while we were in the talk: [Circonus graph screenshot] Yay graphs!

Update: Shortly after posting, Keith mentioned that he had updated the graph to report in MB rather than buffers. So, here is an updated screenshot with friendlier output and more data. (Note that Phil, one of our other DBAs, also flipped the buffers-allocated metric to a right axis.) [Circonus graph screenshot]
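For the curious, the raw numbers come straight out of the pg_stat_bgwriter view, so the check really is just a query. Here’s a sketch of the kind of SELECT that could feed such a graph (the MB conversion assumes the default 8 kB block size; this is not Keith’s exact check):

```sql
-- Cumulative background writer counters, converted from buffers to MB.
-- Assumes the default 8 kB block size; a sketch, not the production query.
SELECT checkpoints_timed,
       checkpoints_req,
       buffers_checkpoint * 8 / 1024 AS checkpoint_mb,
       buffers_clean      * 8 / 1024 AS bgwriter_mb,
       buffers_backend    * 8 / 1024 AS backend_mb,
       buffers_alloc      * 8 / 1024 AS allocated_mb
  FROM pg_stat_bgwriter;
```

Since the counters are cumulative, a trending tool like Circonus turns them into rates for you; that’s the whole trick.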

Understanding Postgres Durability Options

Most people tend to think of Postgres as a very conservative piece of software, one designed to “Not Lose Your Data”. This reputation is probably warranted, but the other side of that coin is that Postgres also suffers when it comes to performance, because it chooses to be safe with your data out of the box. While a lot of systems side with being “fast by default”, leaving durability as an exercise for the user, the Postgres community takes the opposite approach. I think I heard it put once as “We care more about your data than our benchmarks”.

That said, Postgres does have several options that can be used for performance gains in the face of durability tradeoffs. While only you can know the right mix for your particular data needs, it’s worth reviewing and understanding the options available.

“by default” - OK, this isn’t a real setting, but you should understand that, by default, Postgres works to ensure full ACID guarantees, and more specifically that any data that is part of a COMMIT is immediately synced to disk. This is of course the slowest option you can choose, but given that it’s also the most popular code path, the Postgres devs have worked hard to optimize this scenario.
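For reference, the out-of-the-box state corresponds to these postgresql.conf settings, shown with their shipped defaults:

```
# postgresql.conf -- shipped defaults: full durability on COMMIT
fsync = on                 # synchronize WAL writes to disk
synchronous_commit = on    # COMMIT returns only after the WAL flush
```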

“synchronous commits” - By default synchronous_commit is turned on, meaning all commits are fsync’d to disk as they happen. The first trade-off of durability for performance should start here. Turning off synchronous commits introduces a window between when the client is notified of commit success and when the data is truly pushed to disk. In effect, it lets the database cheat a little. The key to this parameter is that, while you might introduce data loss, you will never introduce data corruption. Since it tends to produce significantly faster operation for write-heavy workloads, many people find it a durability tradeoff they are willing to make. As an added bonus, if you think most of your code could take advantage of this but some part of your system can’t afford the tradeoff, the setting can be changed per transaction, so you can ensure durability in the specific cases where you need it. That level of fine-grained control is pretty awesome.
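As a sketch of that per-transaction control (the table and data here are hypothetical), you can leave the server-wide default alone and relax durability only inside transactions that can tolerate a small loss window:

```sql
-- Relax durability for this one transaction only; the server-wide
-- setting is untouched. (Hypothetical table name.)
BEGIN;
SET LOCAL synchronous_commit = off;  -- reverts at COMMIT/ROLLBACK
INSERT INTO clickstream_events VALUES (now(), 'page_view');
COMMIT;  -- returns before the WAL flush; a crash right now could lose
         -- this row, but never corrupt the database
```

The inverse works too: run with synchronous_commit = off globally and SET LOCAL it back on for the transactions that truly matter.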

“delayed commits” - Similar in spirit to synchronous_commit, the “commit_siblings” and “commit_delay” settings try to provide “group commit”, meaning multiple transactions are committed with a single fsync() call. While this certainly has the possibility of increasing performance on a heavily loaded system, when the system is not loaded these settings will actually slow down commits, and that overall lack of granularity compared to synchronous_commit usually means you should favor turning off synchronous_commit and bypass these settings when trading durability for performance.

“non-synching” - fsync was the original parameter for durability vs. performance tradeoffs, and it can still be useful in some environments today. When turned off, Postgres throws out all logic of synchronizing write activity with command input. This means that running in this mode, in the event of hardware or server failure, you can end up with corrupt data: not just missing, but corrupt. In many cases this might not happen, or might happen in an area that doesn’t matter (say a corrupt index, which you can just REINDEX), but it could also happen within a system catalog, which can be disastrous. This leads many a Postgres DBA to tell you to never turn it off, but I’d say ignore that advice and evaluate things based on the durability vs. performance tradeoffs that are right for you. Consider this: if you have a standby set up (WAL based, Slony, Bucardo, etc.), and you are designing for a low MTTR, chances are that in most cases hardware failure on the primary will lead to a near-immediate switch to the standby anyway, so a corrupt database that you have already moved beyond is irrelevant to your operations. This assumes that you can afford to lose some data, but if you are using asynchronous replication, you’ve already come to that conclusion. Of course, you are giving up single-node durability, which might not be worth the tradeoff in performance, especially since you can get most of the performance improvement by turning off synchronous_commit. In some situations you might even fly in the face of conventional wisdom and turn off fsync in production but leave it on in development; imagine an architecture where you’ve built redundancy on top of EC2 (so a server crash means a lost node), but you are developing on a desktop machine where you don’t want to rebuild after a power failure and don’t want to run multiple nodes.
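As a concrete (hypothetical) rendering of that EC2 scenario, the disposable production nodes might run with the first fragment, while the lone dev desktop keeps the defaults:

```
# postgresql.conf on a disposable, replicated production node
# (a crash means the node is rebuilt from a standby/backup anyway)
fsync = off                # corruption possible on crash; node is throwaway
synchronous_commit = off

# postgresql.conf on the single dev desktop (no standby to fail over to)
fsync = on                 # keep the default; a power failure shouldn't
synchronous_commit = on    # force a rebuild of the dev database
```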

Life is a series of tradeoffs between cost and efficiency, and Postgres tries to give you the flexibility to adjust to your particular situation. If you are setting up a new system, take a moment to think about the needs of your data. And before you replace Postgres with a new system, verify what durability guarantees that new system actually gives you; it might be easier to set Postgres to something comparable. If you are trying to find the right balance for your own situation, feel free to post it in the comments, and I’ll be happy to try to address it.

PGDay Denver 2011 Wrap-Up

Last Friday was the first PGDay Denver, a regional one-day Postgres conference put on by Kevin Kempter and the folks who run the Denver Postgres User Group. We had between 50 and 75 people, which is pretty good turnout for a first-time event. I gave two talks, my “Essential PostgreSQL.conf” talk (slides here) and my “Advanced WAL File Management with OmniPITR” talk (slides here). It was my first time in Denver (outside of the airport at least), and I have to say that the city is very well laid out for conference-goers. The one tricky part was getting from the airport to downtown, but once you are downtown, there are plenty of good places to eat and drink, plenty of hotels, and the conference center itself is massive. After a couple of nights on the town, I was honestly surprised that I hadn’t been to a conference there before (maybe OSCon should swing through some year?) and hoping I’ll get the chance to go back. In any case, thanks to the PGDay Denver folks for putting together a nice event; hopefully we’ll see others follow their lead with more PGDays in their parts of the country.

Reminder: BWPUG Meeting Tomorrow, Sept 20th

Hey folks! Looks like we had a snafu with the Meetup site, where it was showing the meeting on our old schedule last week rather than the new one. We’re in the process of fixing that, but wanted to make sure everyone knew we are still going to meet on our new night, which is tomorrow, Tuesday, September 20th. This month, Theo will talk about application and systems performance measurement and why almost everyone does it wrong. It’s not hard to do right, but people often approach these things completely wrong. So we’ll look at some numbers, understand why they are misleading, and talk about the right way to approach these problems. And since we can’t always approach things the right way, we’ll also talk a bit about adding a tiny bit of value to the “wrong” approach.

When: September 20th, ~6:30PM
Where: 7070 Samuel Morse Dr, Columbia, MD, 21042
Host: OmniTI

As always we will have time for networking, we can do some more open Q&A, and we’ll likely hit one of the local restaurants after the meeting. BWPUG Meetup Page | BWPUG Mailing List

The Opportunity of Crises

Self-reflection and process analysis are two critical components of success that people all too often overlook. When things are going well, people think they are doing things correctly, so they don’t self-analyze. Worse, when things are going badly, they often try to rationalize the problem away. This misses a golden opportunity for most people, businesses, and teams. This past week I wrote a piece on the OmniTI Seeds blog discussing this topic; if you happen to lead such a group, you owe it to yourself and those around you to recognize The Opportunity of Crises.