I do a lot of hacking on my laptop, which happens to run Ubuntu, whose postgresql package defaults to en_US.UTF-8 for the locale on new databases. This creates chronic problems when working with tsearch-enabled databases, which I do often enough when hacking on things like pagila or mediawiki. The problem is the notorious “could not find tsearch config by locale” error, which shows up because my default locale isn’t included in the tsearch defaults.
pagila=# select * from pg_ts_cfg;
ts_name | prs_name | locale
-----------------+----------+--------------
default | default | C
default_russian | default | ru_RU.KOI8-R
simple | default |
(3 rows)
pagila=# select to_tsvector('I wish this worked');
ERROR: could not find tsearch config by locale
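As an aside, the error comes from the one-argument form of to_tsvector, which has to look up a configuration by the server locale. A minimal sketch of sidestepping that lookup without touching pg_ts_cfg, assuming the stock tsearch2 contrib functions (the two-argument to_tsvector and set_curcfg) are installed in the database and that the 'default' configuration is the one you want:

```sql
-- Name the configuration explicitly, so no locale lookup is needed.
SELECT to_tsvector('default', 'I wish this worked');

-- Or pick the configuration once for the current session; later
-- one-argument calls in this session then use it.
SELECT set_curcfg('default');
SELECT to_tsvector('I wish this worked');
```

This only helps for queries you write yourself. Anything that calls the one-argument form on a fresh connection (an application, a trigger) would presumably still hit the error.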
I think the real fix for this is to create a new dictionary and then configure tsearch to use it with your locale, but that method is kind of cumbersome, and it's something I've not been able to do on Ubuntu (most directions talk about using ispell, but Ubuntu doesn't give you everything you need with ispell, using aspell instead). I walked through the steps with aspell once, but something didn't work out... I can't recall what now. So rather than figure out the right way to do it, I've started using this hack.
pagila=# show lc_collate;
lc_collate
-------------
en_US.UTF-8
(1 row)
pagila=# update pg_ts_cfg set locale = 'en_US.UTF-8' where ts_name = 'default';
UPDATE 1
pagila=# select to_tsvector('I wish this worked');
to_tsvector
-------------------
'wish':2 'work':4
(1 row)
While this works, and in theory should be correct-ish, since C and en are pretty darn close, I'm wondering if anyone can tell me the downsides of it. If there aren't any, this might be good to list as a simple solution for other people to use.
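For anyone who would rather leave the default row alone, here is an untested sketch of a variation: add a second configuration for the locale and copy the default token-to-dictionary map into it. The name default_en_utf8 is made up for illustration, and the column names for pg_ts_cfgmap (ts_name, tok_alias, dict_name) are an assumption based on the stock tsearch2 schema, so check them against your install first.

```sql
BEGIN;

-- Hypothetical config name; the locale should match what the server reports.
INSERT INTO pg_ts_cfg (ts_name, prs_name, locale)
VALUES ('default_en_utf8', 'default', 'en_US.UTF-8');

-- Reuse the dictionaries that 'default' maps to each token type.
INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name)
SELECT 'default_en_utf8', tok_alias, dict_name
FROM pg_ts_cfgmap
WHERE ts_name = 'default';

COMMIT;
```

The result should behave like the one-line update, since the same parser and dictionaries are used, but the C entry stays intact if the database is ever moved to a server running in the C locale.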