Just got done with a slew of website updates and wanted to give a pointer to two new articles we added to techdocs. Both are by Carlos Crosetti: one is [http://www.geocities.com/ccrosetti/otl/index.html Programming OTL using PostgreSQL and unixODBC in the Solaris environment], the other [http://www.geocities.com/ccrosetti/otlw/index.html Programming OTL using PostgreSQL and ODBC in the Windows environment]. Both have a bunch of pretty pictures, always a bonus, so go check them out.
With the help of several members within the community, I’ve compiled a list of [http://www.postgresql.org/about/users well-known companies and organizations that are making use of PostgreSQL]. It includes both companies who use us as a traditional enterprise-style RDBMS (Afilias, The National Weather Service) and companies that have found PostgreSQL a good database on which to build their own products (Apple, OpenMFG). In some cases we’ve provided pointers to public information including articles, case studies, or on-line documentation so you can see how these organizations make use of PostgreSQL.
The list doesn’t include companies like CNN, ITN, Fox News, MTV, or Disney, who some of us in the PostgreSQL community know make use of PostgreSQL by way of third-party products and services they use. I based this decision on the thought that most of these companies probably don’t know they use PostgreSQL, so they probably aren’t the best candidates for this list. (Though maybe you should keep in mind that every company that has an Apple runs PostgreSQL and a BSD operating system; nothing to sneeze at.)
Incidentally, if you know of any “name” companies making use of PostgreSQL who don’t consider it a trade secret and would be willing to go on record, please do drop me a line. By “name” companies I mean companies that are [http://bwnt.businessweek.com/global_1000/2003/index.asp Global 1000], or [http://metamoney.com/globalListIndex.html Web 100], or have similar name recognition. We know they’re out there, we just need to convince them to speak up. [[image /xzilla/templates/default/img/emoticons/smile.png alt=”:-)” style=”display: inline; vertical-align: bottom;” class=”emoticon” /]]
Nat Torkington recently had a pointer to a rather serious topic, using Google to see people’s prejudices towards others, known as the [prejudice map](http://blog.outer-court.com/prejudice/). It is probably worth looking at on its own merits, but rather than focus on something mundane like improving humanity, I thought I’d try this technique on something much more important: geek religious wars! To that end, here are the results for some common thoughts in the open source database community:
PostgreSQL is known for: features, stability, reliability
MySQL is known for: simplicity, reliability, speed, scaling, data corruption and performance.
And SQLite? Apparently it’s known for not having any documents found [[image /xzilla/templates/default/img/emoticons/tongue.png alt=”:-P” style=”display: inline; vertical-align: bottom;” class=”emoticon” /]]
PostgreSQL core-team member Bruce Momjian was [http://bsdtalk.blogspot.com/2006/02/bsdtalk015-interview-with-postgresql.html recently interviewed] on the BSDTalk podcast. The overall podcast feel is pretty low-key, and the audio volume on Bruce’s segment is a little low, but if you turn up your volume it should be clear enough. Bruce hits on a number of topics including PostgreSQL on Windows, the PostgreSQL [http://www.bookpool.com/sm/1590595475 book] market, and in-place upgrades, so if you’re currently using PostgreSQL or just getting started, I’d say give it a twirl.
Jeremy Zawodny had [http://jeremy.zawodny.com/blog/archives/006199.html a pointer to a project called ajaxMyTop], which is an Ajax-ified version of his mytop program, which does process listing for MySQL. The [http://sourceforge.net/project/screenshots.php?group_id=156708 screenshots] looked cool, but I really wanted to play around with it, so I converted it to PostgreSQL. If you’d like to play with it, you can find the code on the [http://sourceforge.net/tracker/index.php?func=detail&aid=1422853&group_id=156708&atid=800870 ajaxmytop sourceforge patch page].
==ED NOTE==
I’ve re-uploaded the tarball to the sourceforge page so it should now be working. Probably forgot to check a box or something. Also it’s not a patch, it’s the full code needed, so you can just download and install.
If you go back about a year, and certainly not more than two, you’ll find the big commercial DB vendors all claiming to be ignoring the open source database offerings. Generally the claim was that you might use an open source database for really small work, but for anything serious the open source databases (and really they were focused on MySQL) just weren’t feature-complete enough to compete with their products. So with IBM now joining Microsoft and Oracle in [http://www.earthtimes.org/articles/show/5188.html offering a free version of their databases], is there something we can come away with? I think it’s this: the commercial databases cannot compete on features any more.
Basically databases tend to compete on price and features. For years these commercial vendors have stuck to higher prices justified by better features, but that’s just no longer the case. 2005 really marked the end of that trend for two reasons. First, MySQL 5.0 was released, which pretty well rounded out the check-mark feature list that most pointy hairs look at when deciding on a database technology. Second, PostgreSQL (which was already feature-complete) hit the mainstream tech press, with companies like Pervasive, Greenplum, and EnterpriseDB making use of PostgreSQL and helping push its name into the commercial tech press (and the rest of the public) in ways that [http://www.tmcnet.com/usubmit/2006/02/01/1332690.htm only commercial companies can do]. These two events meant that people in the tech world who weren’t really up on their open source databases were now realizing that there really was something to look at, and that the proprietary guys just weren’t offering any value for their higher prices. And so those proprietary vendors will now try to compete on price as well.
Eventually they will lose. As more people begin to use PostgreSQL, it will only get better and better. Right now the sweet spot of PostgreSQL can be summed up very nicely by looking at the free DB2 market: two processors, 4GB of RAM, and unlimited database size. I use machines like that and can get systems into the hundreds of transactions per second, which covers a lot of ground for most people. I’ll grant, if you have a couple of million dollars to throw at a database solution, you should investigate the big three and see how they stack up against the open source offerings, but if you’re in the sweet spot, PostgreSQL is your no-brainer choice. Why? Now that the competition is even on price and features (as evidenced by the new “free” DBs), PostgreSQL still offers you the advantage of real competition among vendors based on service, without the need to change technologies. Whether you’re hiring me or you’re hiring [http://www.sun.com/software/solaris/postgres.jsp Sun] or something in between, PostgreSQL gives you a true marketplace of support avenues, rather than a single-vendor solution. All else being equal, PostgreSQL still comes out ahead.
I’ve decided that my quest to port one of my systems from 7.3 -> 8.1 is turning into a case study on why I shouldn’t upgrade. It’s not just the [http://archives.postgresql.org/pgsql-performance/2006-01/msg00265.php query trouble], it’s the little quirks that trip you up as well. Today’s topic? Controlling Logging For Non-Superusers…
So the problem I face is very simple. One of the apps that hits the 7.3 database is an XML-RPC service that runs on about a half-dozen servers, doing a couple thousand transactions every minute. When something goes awry, it’s hard to debug, especially on 7.3. So one thing we do is include a debugging flag on each server that allows it to turn on SQL logging by issuing the proper SET command. Unfortunately I looked in my 8.1 log where we’re doing some testing and ran across this:
2006-01-30 14:03:03 EST : 11267 : 10.25.10.54 : ERROR: permission denied to set parameter "log_statement"
2006-01-30 14:03:03 EST : 11267 : 10.25.10.54 : STATEMENT: SET log_statement = on
Ugh. Apparently this behavior was changed somewhere after 7.3. I say somewhere because I didn’t see any mention of it in the release notes. (Bonus points for anyone who posts a link in the comments.) Luckily it’s not a complete catastrophe. Basically you need to write a function that allows non-superusers to turn the logging on and off. In my case, I don’t need to duplicate the full functionality of log_statement, just the basic idea of turning logging on or off, so I whipped out this little function:
CREATE OR REPLACE FUNCTION log_statement(boolean) RETURNS text AS $$
DECLARE
    param   text;
    retval  text;
    setsql  text;
    devnull text;
BEGIN
    IF $1 THEN
        param  := 'all';
        retval := 'on';
    ELSE
        param  := 'none';
        retval := 'off';
    END IF;
    -- doubled single quotes here, rather than backslash escapes,
    -- so the string stays valid regardless of escape-string settings
    setsql := 'SELECT set_config(''log_statement'',''' || param || ''',true)';
    RAISE DEBUG '%', setsql;
    EXECUTE setsql INTO devnull;
    RETURN retval;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
You could probably shorten the code a little, but it works as-is and should be pretty simple to read. If you really used this functionality a lot, you might want to do some type of overloaded version of set_config itself and make it SECURITY DEFINER so you could control all of the log_* parameters, but that’s a bit beyond what I needed. The only downside here is that we still need to touch application code to implement the change, but I guess you can’t have everything.
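For what it’s worth, assuming the function is created by a superuser (so SECURITY DEFINER lends the needed privilege) and your app role has EXECUTE on it, toggling logging from a non-superuser session looks something like this (the app=> prompt and role name are just illustrative):

```
app=> SELECT log_statement(true);   -- sets log_statement to 'all'
app=> -- ... run the statements you want captured in the log ...
app=> SELECT log_statement(false);  -- sets it back to 'none'
```

Note that since the function calls set_config with is_local = true, the change only lasts until the end of the current transaction, which is usually what you want for targeted debugging.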
More [http://people.planetpostgresql.org/devrim/index.php?/archives/35-Solaris-packages-for-PostgreSQL.html good news for Solaris users], we’ve added a new document to the PostgreSQL [http://techdocs.postgresql.org Techdocs] site, this one discussing some best practices for [http://techdocs.postgresql.org/techdocs/pg_solaris10_jdbc_v1.2.1.pdf deploying PostgreSQL 8.1 on Solaris] (pdf). I don’t use Solaris personally, but found it a nice read anyway, so even if you DBA on other platforms I’d recommend giving it a once-over. Many thanks to Chris Drawater for another fine contribution.
Probably the most well-known way to look at settings is by using the SHOW ALL command in psql. This command basically does what it says; it shows all the PostgreSQL variables as a result set on your screen. It works OK, but I’ve found when doing tuning it really gives me a lot of extraneous information that I don’t need, and doesn’t give me some of the information I do need. So instead I look at [http://www.postgresql.org/docs/current/static/view-pg-settings.html pg_settings].
By using pg_settings, you can query for specific information relevant to your current session. Want to see all of the values you’re using that aren’t defaults? Try this:
bjt=# select name, setting, source, category from pg_settings where source <> 'default';
name | setting | source | category
--------------------------------+-------------------------------------+--------------------+------------------------------------------------------
autovacuum | on | configuration file | Auto Vacuum
checkpoint_segments | 8 | configuration file | Write-Ahead Log / Checkpoints
checkpoint_timeout | 600 | configuration file | Write-Ahead Log / Checkpoints
config_file | /var/lib/pgsql/data/postgresql.conf | override | File Locations
data_directory | /var/lib/pgsql/data | override | File Locations
effective_cache_size | 90000 | configuration file | Query Tuning / Planner Cost Constants
enable_nestloop | on | session | Query Tuning / Planner Method Configuration
fsync | on | configuration file | Write-Ahead Log / Settings
geqo_effort | 5 | session | Query Tuning / Genetic Query Optimizer
hba_file | /var/lib/pgsql/data/pg_hba.conf | override | File Locations
ident_file | /var/lib/pgsql/data/pg_ident.conf | override | File Locations
join_collapse_limit | 25 | session | Query Tuning / Other Planner Options
lc_collate | C | override | Client Connection Defaults / Locale and Formatting
lc_ctype | C | override | Client Connection Defaults / Locale and Formatting
lc_messages | C | configuration file | Client Connection Defaults / Locale and Formatting
lc_monetary | C | configuration file | Client Connection Defaults / Locale and Formatting
lc_numeric | C | configuration file | Client Connection Defaults / Locale and Formatting
lc_time | C | configuration file | Client Connection Defaults / Locale and Formatting
listen_addresses | * | configuration file | Connections and Authentication / Connection Settings
log_destination | syslog | configuration file | Reporting and Logging / Where to Log
log_directory | pg_log | configuration file | Reporting and Logging / Where to Log
log_line_prefix | :%h: | configuration file | Reporting and Logging / What to Log
log_min_duration_statement | 5000 | configuration file | Reporting and Logging / When to Log
log_min_error_statement | error | configuration file | Reporting and Logging / When to Log
log_rotation_age | 1440 | configuration file | Reporting and Logging / Where to Log
log_rotation_size | 0 | configuration file | Reporting and Logging / Where to Log
log_statement | ddl | configuration file | Reporting and Logging / What to Log
log_truncate_on_rotation | on | configuration file | Reporting and Logging / Where to Log
maintenance_work_mem | 32768 | configuration file | Resource Usage / Memory
max_connections | 1120 | configuration file | Connections and Authentication / Connection Settings
max_fsm_pages | 200000 | configuration file | Resource Usage / Free Space Map
max_prepared_transactions | 0 | configuration file | Resource Usage
max_stack_depth | 8192 | configuration file | Resource Usage / Memory
port | 5432 | command line | Connections and Authentication / Connection Settings
random_page_cost | 3.5 | session | Query Tuning / Planner Cost Constants
redirect_stderr | off | configuration file | Reporting and Logging / Where to Log
server_encoding | SQL_ASCII | override | Client Connection Defaults / Locale and Formatting
shared_buffers | 50000 | configuration file | Resource Usage / Memory
statement_timeout | 2700000 | configuration file | Client Connection Defaults / Statement Behavior
stats_command_string | on | configuration file | Statistics / Query and Index Statistics Collector
stats_row_level | on | configuration file | Statistics / Query and Index Statistics Collector
stats_start_collector | on | configuration file | Statistics / Query and Index Statistics Collector
superuser_reserved_connections | 20 | configuration file | Connections and Authentication / Connection Settings
temp_buffers | 3200 | configuration file | Resource Usage / Memory
TimeZone | EST5EDT | command line | Client Connection Defaults / Locale and Formatting
transaction_isolation | read committed | override | Client Connection Defaults / Statement Behavior
transaction_read_only | off | override | Client Connection Defaults / Statement Behavior
wal_buffers | 1024 | configuration file | Write-Ahead Log / Settings
work_mem | 1024 | configuration file | Resource Usage / Memory
(49 rows)
Of course, even this might be too verbose, but the beauty is that you can make these queries as simple or as complex as you’d like. For example, if you’re doing performance tuning, you might want something fancy like:
bjt=# select name, setting, source, context from pg_settings where category ~ 'Tun|Memory' order
bjt-# by source='session' desc, source;
name | setting | source | context
---------------------------+---------+--------------------+------------
enable_nestloop | on | session | user
geqo_effort | 5 | session | user
join_collapse_limit | 25 | session | user
random_page_cost | 3.5 | session | user
effective_cache_size | 90000 | configuration file | user
maintenance_work_mem | 32768 | configuration file | user
max_stack_depth | 8192 | configuration file | superuser
shared_buffers | 50000 | configuration file | postmaster
temp_buffers | 3200 | configuration file | user
work_mem | 1024 | configuration file | user
constraint_exclusion | off | default | user
cpu_index_tuple_cost | 0.001 | default | user
cpu_operator_cost | 0.0025 | default | user
cpu_tuple_cost | 0.01 | default | user
default_statistics_target | 10 | default | user
enable_bitmapscan | on | default | user
enable_hashagg | on | default | user
enable_hashjoin | on | default | user
enable_indexscan | on | default | user
enable_mergejoin | on | default | user
enable_seqscan | on | default | user
enable_sort | on | default | user
enable_tidscan | on | default | user
from_collapse_limit | 8 | default | user
geqo | on | default | user
geqo_generations | 0 | default | user
geqo_pool_size | 0 | default | user
geqo_selection_bias | 2 | default | user
geqo_threshold | 12 | default | user
(29 rows)
That certainly gives you a quick overview of your current settings within your session when you’re doing performance tuning. And don’t forget you can create views of these types of queries if you want:
bjt=# create view "dba.tune" as select name, setting, source, context from pg_settings where
bjt-# category ~ 'Tun|Memory' order by source='session' desc, source;
CREATE VIEW
… and make use of the info as you need…
bjt=# select * from "dba.tune" where source <> 'default';
name | setting | source | context
----------------------+---------+--------------------+------------
enable_nestloop | on | session | user
geqo_effort | 5 | session | user
join_collapse_limit | 25 | session | user
random_page_cost | 3.5 | session | user
effective_cache_size | 90000 | configuration file | user
maintenance_work_mem | 32768 | configuration file | user
max_stack_depth | 8192 | configuration file | superuser
shared_buffers | 50000 | configuration file | postmaster
temp_buffers | 3200 | configuration file | user
work_mem | 1024 | configuration file | user
(10 rows)
John Drummond gives [http://www.johndrummond.com/?p=7 a breakdown of what has led him to look at PostgreSQL]. It’s interesting to think what a good combo C# and PostgreSQL would make for developers looking for a platform for the next 10 years. There are certainly a number of Windows users out there, and this also gives them an easy migration path between Win32 development and the LAMP stack (Linux/Apache/Mono/PostgreSQL) should they want it.