intjonathan

7 December 2015

Databases & Dragons

by Jonathan Owens

Notes

Delivered at FutureStack 2015, the New Relic Developer Conference in San Francisco. This talk was aimed at teams running services at modest scale (50+ machines, 150+ engineers) and addresses how to choose the right data store for the right job.

Summary

Picking a data store is a lot like rolling the dice: vendor websites are full of buzzwords but tell you little about actual fit. This talk borrows the framing of tabletop RPG character sheets to give each data store a memorable identity built around its stats, strengths, weaknesses, and personality.

Always try to find a storage system that fits your data, not the other way around.

Data types and their stores

Forming a party

In tabletop RPGs you never play alone. By design, the characters must compensate for each other’s weaknesses to have a successful experience. The same applies to data stores. The Synthetics team at New Relic ran Insights (event), S3 (binary), and Postgres (relational) together: each system owned its domain, failures were isolated, and each could be scaled independently.

Rules for a good party of data stores:

  1. Pick stores that compensate for each other’s weaknesses.
  2. Pick stores well-suited to their task. Don’t jam binary data into SQL just because you already have SQL: you’ll burden your database and never scale your binaries. Choose the right tool.
  3. Pick stores you understand. Ten years of MySQL experience beats zero days of Postgres. Most of the widely used storage systems are incredibly versatile, and it takes years to master them. If you’re not sure what to do, reach for what you know.
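
As a rough illustration of the party idea, here is a minimal Python sketch in which each kind of data is routed to the one store that owns it, and an outage in one store leaves the others untouched. Everything in it is hypothetical: `InMemoryStore`, `Party`, and the store names are in-memory stand-ins invented for this example, not the Synthetics team's actual setup or any real client library.

```python
from dataclasses import dataclass, field


class StoreUnavailable(Exception):
    """Raised by a stand-in store to simulate an outage."""


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for a real backend (event, binary, or relational)."""
    name: str
    healthy: bool = True
    items: list = field(default_factory=list)

    def write(self, item):
        if not self.healthy:
            raise StoreUnavailable(self.name)
        self.items.append(item)


@dataclass
class Party:
    """Routes each kind of data to the one store that owns that domain."""
    stores: dict  # maps a data kind to the store that owns it

    def save(self, kind, item):
        store = self.stores[kind]  # KeyError if no store owns this kind
        try:
            store.write(item)
            return True
        except StoreUnavailable:
            # An outage in one store does not block writes to the others.
            return False


party = Party({
    "event": InMemoryStore("event-store"),
    "binary": InMemoryStore("blob-store"),
    "relational": InMemoryStore("sql-store"),
})

party.stores["binary"].healthy = False  # simulate a blob-store outage
results = {
    "event": party.save("event", {"check": "ping", "ms": 42}),
    "binary": party.save("binary", b"\x89PNG..."),
    "relational": party.save("relational", ("monitor", 1, "homepage")),
}
print(results)  # {'event': True, 'binary': False, 'relational': True}
```

The point of the sketch is the shape, not the classes: one owner per kind of data, and a failure that stays inside the store where it happened.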

A strong point of leverage here is to have services own their data. Other services should talk to a service's API, not reach into its database directly. This prevents the “Death Star architecture” where everything is connected to everything and nothing can move. It also gives you the freedom to choose a data store suited to the purpose rather than compromising among the many users of a centralized system.
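
A minimal sketch of that ownership boundary, assuming a hypothetical `MonitorService` that is the only code holding a handle to its database and a hypothetical `AlertService` that consumes it. In a real deployment the API would most likely be a network call (HTTP or RPC); a plain method call and an in-memory SQLite database stand in for both here.

```python
import sqlite3


class MonitorService:
    """Hypothetical service that is the sole owner of its database."""

    def __init__(self):
        # The connection is private: no other service gets a handle to it.
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE monitors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    # Public API: the only supported way to read or write monitor data.
    def create_monitor(self, name):
        cur = self._db.execute("INSERT INTO monitors (name) VALUES (?)", (name,))
        self._db.commit()
        return cur.lastrowid

    def get_monitor(self, monitor_id):
        row = self._db.execute(
            "SELECT id, name FROM monitors WHERE id = ?", (monitor_id,)
        ).fetchone()
        return None if row is None else {"id": row[0], "name": row[1]}


class AlertService:
    """Hypothetical consumer: depends on the API, not on the schema."""

    def __init__(self, monitors):
        self._monitors = monitors

    def describe(self, monitor_id):
        monitor = self._monitors.get_monitor(monitor_id)
        if monitor is None:
            return "unknown monitor"
        return f"alerting on {monitor['name']}"


monitors = MonitorService()
monitor_id = monitors.create_monitor("homepage")
alerts = AlertService(monitors)
print(alerts.describe(monitor_id))  # alerting on homepage
```

Because `AlertService` only ever sees the dictionaries returned by the API, the owning service could swap SQLite for a different store without its consumers changing.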

Q&A highlights
