The Database Administrator is going away for sure. The old-school person with magical SQL abilities and knowing all the caveats there are to know about RDBMSs is doomed in the era of "multipurpose" developers. The "DBA as in Database Advocate" idea carries an important weight, but I feel we should augment the notion a bit.
You know this good ol' IT saying: "There are two types of people: those how back up their data and those how are yet to start" Backing up databases is especially important since they usually contain information critical not only to day-to-day business operation, but also for business continuity. People leave, software fails, world changes. What remain is your data and if you have it, you can always start from the scratch - with new software, new people, in new world. Let's have a look at various backups strategies available for PostgreSQL databases so that we can prepare for the worst.
Time flies. It's been almost a year since I've published my What's wrong with DevOps? article. The attention it received came as quite a surprise to me. It seems that there is a number of like-minded individuals sharing my point of view. But what came as even more of a shock was that these individuals also seem to like my writing style. Awesome. No pressure then.
PostgreSQL's MVCC model provides excellent support for running multiple transactions operating on the same data set. One natural consequence of its design is the existence of so-called "database bloat". Judging by the amount of questions raised in the Internet it is quite a common problem and not many people seem to know, how to properly deal with it. I myself had the same issues and learnt one or two things that might be helpful, so in this article I'd like to shed some light on this notion - why it's there in the first place, how it can affect performance and finally how to monitor and manage it.
Proper monitoring is the key to keeping your system in good health and it's especially important in the database field. No matter if you have a busy systems processing a lot of transactions each second or you just want to keep tabs on your data warehouse handeling only few heavy queries - you want to graph most of the metrics in order to be able to see, how they're trending. There are many solutions for doing that, but let me tell you a secret - in this endeavour Grafana will be your best friend.
Databases come in different shapes and sizes and so do policies created by their administrators. One of the strictest and most original is the rule I came across recently: "No VIEWs allowed in production database". When tracking down slow queries and investigating their causes, VIEWs can make the task much harder. And even more complicated is the issue of ownership and responsiblity, especially when things take a wrong turn because of too many or too complex VIEWs. But usually the solution is simple and there's no need to go to such extremes as the rule mentioned. In this case it can be as easy as implementing MATERIALIZED VIEWs.
At the core of any good database is a great locking mechanism to ensure that data is quickly, safely and consistently updated. In my work as a DBA though, I've come across many misconceptions about locking and even some senior developers I talked with were unclear as to the internal workings of PostgreSQL. Understanding some of the concepts should go a long way in helping to create faster and more responsive applications.
You know ELK, right? Elastisearch + Logstash + Kibana stack is now probably a standard tool among system administrators and developers. But if you install ELK and think you're done with your centralized logging system - you're wrong.
"Hello, IT, did you try turning it off and on again?"
I've been asked some time ago by a colleague to tell something more about my adventure with Pentaho and how did I make it play nice with PostgreSQL. So here it goes, here's my story.
Imagine you're ill and you go to a doctor. But instead of letting her examine you, you start to tell her how she should do it, what tools should she use, what medical diagnosis should she give you.
Sounds ridiculous, doesn't it?
I was recently a key resource involved in migrating OLAP1 cubes from proprietary software to FLOSS2 setup utilizing PostgreSQL and Pentaho BI. I learned that OLAP isn't really a PostgreSQL domain but getting basic support for analytical queries will definitely make many tasks easier.
People familiar with MySQL are usually a bit confused with how replication is done in PostgreSQL. This is due to the fact that there are many ways to do this depending on what you want to achieve.
When you're a SysOp and you've just been given a new system to administer one of the first things you should do is to plug it into your monitoring stack. But there is one thing that you can - and should - do before that.
PostgreSQL's query planner is great. There's no doubt about it (at least in the community). But query planning sometimes goes wrong.