Basic control over query planner

2015-05-17T18:35:00 BST │   Comment

PostgreSQL's query planner is great. There's no doubt about it (at least in the community). But query planning sometimes goes wrong.

I'd like to kick my blog off with a slightly edited repost of one of my G+ posts.

In some other relational database management systems there is the notion of query hints. Provided they're properly used, they can be beneficial in some cases.

But Postgres doesn't have them - it's justified very clearly in a blog post as well as in the official wiki here and also here

So what can a DBA do when it's clear that something can be done better than the query planner thinks?

There are few possibilities. The most important one is tuning parameters like "enable_seqscan" or arbitrary cost values like "cpu_tuple_cost". One can also try to control explicit joins with two GUCs1 - "from_collaps_limit" and "join_collapse_limit" - those are especially usefull when dealing with very complex queries using numerous relations.

Whichever you choose to tune you should remember that you can make your changes session-wise or system-wise. While the former is a great tool for testing, you should be very careful with the latter.
When operating in session-wise manner one of the things you can do is to generate query plans using different settings; for example you can check how does the plan look like with some type of scan turned off. Spotting differences in plans generated this way can be extremely helpful.

Using this technique few weeks ago I found that one particular query that had a runtime of about 2 seconds because of index overuse could be simplified with CTE2. This is actually a pretty common way of doing "query hints" the Postgres way - thanks to them you can gain a bit more control over how the query is processed.
But the problem for me was that we're using ORM that doesn't support CTEs. Hah. What now? Of course one way is to bypass the ORM and embed the actual SQL query into the code. But wait? What's a CTE if not a temporary, ad-hoc view?

And there it was - materialized views. Using materialized view instead of CTE gave us a great way of simplifying the query without inelegant solutions like bypassing ORM.
The only thing now is to wait for REFRESH MATERIALIZED VIEW CONCURRENTLY.

  1. grand unified configuration ↩

  2. common table expression(s) ↩

If you like this article please consider sharing it with your friends.