Percona Live Europe is a yearly conference on open source databases organized by Percona. We had the opportunity to attend this year’s conference in the beautiful city of Dublin. Besides enjoying the local brews and drafts, we attended several sessions, a few of which we highlight in this blog post.
MongoDB Shootout: MongoDB Atlas, Azure CosmosDB and Doing It Yourself
When running MongoDB in the cloud, you have several options. David Murphy compared MongoDB Atlas, CosmosDB and the good old DIY.
Using Atlas, you get monitoring, automation and the possibility to pay for backups. You pay per instance and you have a wide variety of instances and regions to choose from on AWS, GCP and Azure. The biggest downside is that the cost is about 44% more than running your own servers in the cloud. This means that, in contrast to DIY, you continuously pay more instead of writing off your initial investment and paying less in the end. The monitoring is really good, but if you have a polyglot environment with your own monitoring and alerting, you cannot integrate it with Atlas, so you end up with yet another tool. Upgrading is really easy, just a click of a button, thanks to the automation. Backups are taken continuously and are paid for per GB.
CosmosDB is offered by Microsoft on Azure and claims to be MongoDB compatible. This is not completely true because it has no support for the aggregation framework, so you can only use it for simple CRUD operations. The pricing is based on a pay-per-operation model, which means it’s really hard to figure out what your cost will be and how it will evolve over time. Here you also have the downside of continuously paying more than DIY. The monitoring is very basic and of no help when you run into problems or strange behaviour; it’s like a black box. Upgrading is done behind the scenes, which means you don’t have to worry about it, unless an upgrade makes your code incompatible, in which case you are stuck. Backups are very basic: they are taken approximately every four hours and only the latest two backups are stored.
DIY has the most power to offer IF you have a mature and complete DevOps team. With DIY you pay a high price up-front for hardware and people. You need to implement your own monitoring, e.g. with the Elastic stack or Percona’s PMM. Automation is also a big part of the effort. And last but not least, you need to implement your own backup strategy. The biggest upside is that you have full control over what you implement and how you do things. You choose the cloud service provider or the hardware you want to use. You choose what you want to monitor and how you alert. You choose how often and when you back up. But of course, it’s all up to you.
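To make the "continuously pay more" argument concrete, here is a minimal back-of-the-envelope sketch in Python. Only the 44% premium comes from the talk; the monthly and up-front amounts, and the function names, are made-up assumptions for illustration. The sketch also ignores the people cost of DIY, which can easily dominate.

```python
import math

# The 44% premium is the figure quoted in the talk; everything else is hypothetical.
MANAGED_PREMIUM = 0.44


def cumulative_costs(months, diy_monthly, diy_upfront):
    """Return (managed_total, diy_total) after the given number of months."""
    managed_total = diy_monthly * (1 + MANAGED_PREMIUM) * months
    diy_total = diy_upfront + diy_monthly * months
    return managed_total, diy_total


def break_even_month(diy_monthly, diy_upfront):
    """First whole month in which the managed total reaches the DIY total."""
    extra_per_month = diy_monthly * MANAGED_PREMIUM
    return math.ceil(diy_upfront / extra_per_month)


if __name__ == "__main__":
    # Hypothetical numbers: 1000 per month for servers, 10000 invested up-front.
    month = break_even_month(diy_monthly=1000, diy_upfront=10000)
    managed, diy = cumulative_costs(month, diy_monthly=1000, diy_upfront=10000)
    print(f"break-even in month {month}: managed={managed:.0f}, diy={diy:.0f}")
```

With these assumed numbers the managed service is cheaper at first and becomes the more expensive option once the up-front DIY investment has been written off.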
Visualize Your Data With Grafana
Grafana is used to build monitoring dashboards based on time series data. It supports a wide range of data sources from which it can pull the data for a dashboard. There are already a lot of pre-built dashboards you can use and customize for your own needs, and you can build your own dashboards out of panels. Each panel is fed with data from a data source based on a query. Grafana provides a query editor with support for the different data sources, such as PromQL for Prometheus, to make it easier to build the queries you need.
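As a rough illustration of how a panel ties a data source to a query, the Python sketch below builds a minimal dashboard definition and serializes it to JSON. The field names follow Grafana's dashboard JSON model as we understand it and may differ between Grafana versions (newer releases use a `timeseries` panel type instead of `graph`). The helper name, the data source name `Prometheus` and the PromQL expression are assumptions for the example.

```python
import json


def prometheus_panel(title, expr, panel_id, datasource="Prometheus"):
    """Build one panel that plots a single PromQL expression (hypothetical helper)."""
    return {
        "id": panel_id,
        "type": "graph",  # assumption: classic graph panel
        "title": title,
        "datasource": datasource,  # assumption: name of a configured data source
        "targets": [{"refId": "A", "expr": expr}],
        "gridPos": {"x": 0, "y": 0, "w": 12, "h": 8},
    }


dashboard = {
    "title": "MySQL overview (example)",
    "panels": [
        prometheus_panel(
            "Queries per second",
            # assumption: a metric exposed by a MySQL exporter for Prometheus
            "rate(mysql_global_status_queries[5m])",
            panel_id=1,
        )
    ],
}

dashboard_json = json.dumps(dashboard, indent=2)

if __name__ == "__main__":
    print(dashboard_json)
```

A definition like this can be imported through the Grafana UI or kept in source control next to the rest of your infrastructure code.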
Database Reliability Engineering: What, Why and How?
The DBA from the old days, hidden in a basement behind closets, performing magic on that mysterious thing called a database, is dead. Enter the Database Reliability Engineer (DBRE, loosely based on SRE), a person who embraces the new paradigms in the IT world. He is an advocate of how data should be treated and used: he teaches his colleagues, takes part in pair programming and is an active member of cross-functional teams. A DBRE’s knowledge is not confined to a single system; he can support polyglot persistence, both on premise and in the cloud. He automates as much as possible, uses the tools of the trade, including source control systems, and helps create infrastructure as code. The DBRE enables his organisation to apply known principles of the software engineering world to the database world. In this role he applies principles from Database Reliability Engineering, like designing for scale, availability, operations and performance. Visibility, alerting, and database change and release management are just a few of the other tasks involved. For more detailed information, make sure to check out the book Database Reliability Engineering, a must-read for everyone in the field.
MongoDB Security Checklist
MongoDB has been in the news lately due to ransomware attacks. This might make you wonder whether or not MongoDB is secure. Well, rest assured, it is very secure. But you need to turn security on, at least until the next major release, where security will be on by default. MongoDB has a plethora of security features in its community edition, and the commercial offering provides even more goodies like LDAP integration and baked-in encryption at rest. It starts with simple username/password authentication and moves on to x.509 certificate-based authentication. Once authenticated, you have authorization with either built-in or user-defined roles and privileges, so you can fine-tune which users have access to which database or collection and which actions they can perform on them. You can further lock down your MongoDB by restricting the network interface it listens on, so it’s not open to the internet, or by encrypting the communication between replica set or sharded cluster members. If you are running MongoDB, then reading the security checklist is a must!
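A few of these points can be checked mechanically. The Python sketch below inspects a mongod configuration that has already been parsed into a dictionary (for example from the YAML config file) and reports what is missing. The option names `security.authorization`, `net.bindIp`, `net.tls.mode` and the older `net.ssl.mode` are mongod configuration options; the function name, the wording of the findings and the decision to flag a missing `bindIp` are our own assumptions, and this is in no way a replacement for the official checklist.

```python
def security_findings(config):
    """Return human-readable findings for a parsed mongod config (hypothetical helper)."""
    findings = []
    security = config.get("security", {})
    net = config.get("net", {})

    # Access control has to be switched on explicitly.
    if security.get("authorization") != "enabled":
        findings.append("authorization is not enabled")

    # Assumption: we want an explicit, non-wildcard list of bind addresses.
    addresses = [a.strip() for a in str(net.get("bindIp", "")).split(",") if a.strip()]
    if not addresses or "0.0.0.0" in addresses or "::" in addresses:
        findings.append("bindIp is missing or listens on all interfaces")

    # Newer versions use net.tls.mode, older ones net.ssl.mode.
    tls_mode = net.get("tls", {}).get("mode") or net.get("ssl", {}).get("mode")
    if tls_mode not in ("requireTLS", "requireSSL"):
        findings.append("encrypted connections are not required")

    return findings


if __name__ == "__main__":
    hardened = {
        "security": {"authorization": "enabled"},
        "net": {"bindIp": "127.0.0.1,10.0.0.5", "tls": {"mode": "requireTLS"}},
    }
    print(security_findings({}))        # three findings for an empty config
    print(security_findings(hardened))  # no findings
```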
Improvements to MongoRocks in 2017
MongoRocks is MongoDB using RocksDB as the underlying storage engine. From the MongoRocks website:
"RocksDB is a key-value library based on Log Structured Merge Trees. It is maintained by the Facebook Database Engineering Team, and is based on LevelDB, by Sanjay Ghemawat and Jeff Dean at Google"
MongoRocks differs from WiredTiger in the way it stores data. WiredTiger uses a B-tree whereas MongoRocks uses an LSM-tree structure. Both have pros and cons of course. An LSM-tree structure favours space and insert efficiency over read efficiency. A B+ tree structure favours update and read efficiency over space efficiency. So depending on your workload, you can choose which one might suit you better. Of course, nothing beats measuring the effect of your workload on the performance of the chosen storage engine. Because, you know, silver bullets and such… So testing and measuring is key in deciding which engine you should choose. Nevertheless, MongoRocks is showing nice improvements over previous versions and has several interesting benefits over WiredTiger, especially when storage endurance is an issue or when your working set does not fit into memory.
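To get a feel for why an LSM-tree trades read efficiency for insert efficiency, here is a deliberately tiny Python sketch. It is a toy model under our own assumptions, not how RocksDB is implemented: writes go to an in-memory table that is flushed as an immutable sorted run, reads may have to look through several runs, and compaction merges the runs and drops stale versions.

```python
import bisect


class ToyLSM:
    """Toy LSM-tree: an in-memory table plus immutable sorted runs (newest first)."""

    def __init__(self, memtable_limit=4):
        self.memtable_limit = memtable_limit
        self.memtable = {}
        self.runs = []  # each run is a sorted list of (key, value) pairs

    def put(self, key, value):
        # Writes only touch memory; a full memtable is flushed in one go.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        self.runs.insert(0, sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        # Reads check the memtable first, then every run from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge oldest to newest so that newer values overwrite stale ones.
        merged = {}
        for run in reversed(self.runs):
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []


if __name__ == "__main__":
    db = ToyLSM(memtable_limit=2)
    for key, value in [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("d", 5)]:
        db.put(key, value)
    print(db.get("a"), len(db.runs))  # newest value wins; two runs on "disk"
    db.compact()
    print(db.get("a"), len(db.runs))  # same answer from a single merged run
```

A B-tree would instead update the key in place, which keeps reads to a single lookup path but turns every update into a random write.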
Automatic Database Management System Tuning Through Large-Scale Machine Learning
This was probably the most stunning talk of the conference. OtterTune is a tool developed by students and researchers at Carnegie Mellon to automatically tune your database. It does this by making clever use of data collected from previous tuning sessions and applying machine learning to it. The presented results showed that, for the given workload, OtterTune was on par with DBAs who had double-digit years of experience. Looking at this from the bright side, OtterTune would help DBAs to focus on areas other than figuring out which combination of the many settings they should use to tune their database. It would definitely help to do better Database Reliability Engineering.
This is just a small portion of the huge number of sessions at Percona Live, but of course, one needs to choose. It’s really great to see this conference putting open source in the foreground and displaying the wealth of choice and diversity of technologies in the open source database world. We see that this space is continuously expanding and that the future is looking even more promising than the present. Good times ahead!