
Benchmarking Postgres

By Benjamin Dicken

Today we launched PlanetScale for Postgres. For the past several months, we've been laser-focused on building the best Postgres experience on the planet, performance included.

To ensure we met our high standard for database performance, we needed a way to measure and compare other options with a standardized, repeatable, and fair methodology. We built an internal tool, "Telescope", as our go-to system for creating, running, and assessing benchmarks.

Telescope has given our engineers quick feedback on the evolution of our product's performance as we built and tuned it. We decided to share our findings with the world, and to give others the tools to reproduce them.

If you'd like to skip straight to the benchmark results, they are here:

And here is a quick overview of how PlanetScale performs compared to other Postgres vendors:

In every aspect of our business we aim for excellence and continual improvement. We encourage you to reach out at benchmarks@planetscale.com if you see any mistakes in our methodology or cost calculations.

What is benchmarking?

Benchmarking is often used deceptively. This applies to all technologies, not just databases.

So let's be clear: benchmarking of any kind has its shortcomings. Every OLTP workload at every organization is unique, and no single benchmark can capture the performance characteristics of all such databases. Data size, hot:cold ratios, QPS variability, schema structure, indexes, and 100 other factors determine the requirements of your relational database setup.

You cannot look at a benchmark and know for certain that your workload will perform the same way, even when all other factors match.

However, when a benchmark is conducted well, it is quite useful for answering the following questions:

  1. How quickly can I reach my database? (latency)
  2. How does the database perform under a "typical" OLTP load? (TPS, QPS, etc)
  3. How does the database perform under high read / write pressure? (IOPS, caching)
  4. How much does it cost to achieve some bar of performance relative to other options? (price:performance ratio)

These are the types of questions we set out to answer with our benchmarking. Therefore, we chose three benchmarks as our primary measurement tools:

  • Latency: A simple query-path latency benchmark. It repeatedly runs SELECT 1; statements against the database from another instance in the same region. Basic, but effective for determining query-path latency (a minimal sketch of such a script appears after this list).
  • TPC-C: We used the industry-standard TPC-C benchmark to help answer questions 2-4. More details on the configuration are provided later in this post.
  • OLTP Read-only: For a selection of the top performers, we also ran an OLTP Read-only sysbench workload. This is useful for isolating read performance, as most OLTP workloads are 80+% reads.
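To make the latency benchmark concrete, here is a minimal sketch of such a script in Python. It assumes the psycopg2 driver and a placeholder connection string (your host, credentials, and TLS settings will differ); it simply times 200 consecutive SELECT 1; round trips from a client in the same region:

```python
import statistics
import time

import psycopg2  # assumes the psycopg2 Postgres driver is installed

# Placeholder connection string -- substitute your own database's details.
DSN = "host=db.example.com port=5432 dbname=postgres user=bench password=secret sslmode=require"

conn = psycopg2.connect(DSN)
conn.autocommit = True
latencies_ms = []

with conn.cursor() as cur:
    for _ in range(200):  # 200 round trips, matching the methodology described later
        start = time.perf_counter()
        cur.execute("SELECT 1;")
        cur.fetchone()
        latencies_ms.append((time.perf_counter() - start) * 1000)

conn.close()
latencies_ms.sort()
print(f"p50: {statistics.median(latencies_ms):.3f} ms")
print(f"p99: {latencies_ms[int(len(latencies_ms) * 0.99) - 1]:.3f} ms")
```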

Fairness

In this process, we compared our product to a long list of other cloud Postgres providers. We wanted the comparisons to be as fair as possible.

Every benchmark we are releasing publicly was compared to PlanetScale running on an i8g M-320. That's 4 vCPUs, 32 GB RAM, and 929 GB of NVMe SSD storage. We chose an instance size and cluster configuration that represents what would be typical for a real production application with at least a moderate amount of usage: for example, a database that can serve several thousand QPS while maintaining low latency and high availability.

At PlanetScale, we give you a primary and two replicas spread across 3 availability zones (AZs) by default. A multi-AZ configuration is critical for a highly available database, and the replicas can also be used to handle significant read load.

For the other products we compared to, we kept things simple by creating single-instance databases that matched or exceeded the vCPUs and RAM of the PlanetScale primary. To match the true capacity and availability of PlanetScale, each would also need to add replicas. We account for this when discussing pricing.

Compute

For each competitor, we either matched or exceeded both the vCPU count and RAM. Amazon Aurora, Google AlloyDB, and CrunchyData support memory-optimized RAM:CPU ratios of 8:1, so we were able to match the PlanetScale configuration exactly.

Supabase, TigerData, and Neon only support 4:1 RAM:CPU ratios. For these, we opted to match the RAM, giving them double the CPU count used by PlanetScale. This is an unfair advantage in their favor, but as you'll see, PlanetScale still significantly outperforms them with fewer resources.

Storage

All of the products we compared with use network-attached storage for the underlying drives. Some of these, such as Aurora, Neon, and AlloyDB, do not allow the specific IOPS to be configured.

Several do, however. For Supabase and TigerData, we boosted the IOPS settings above their defaults.

Methodology

In the interest of transparency, we provide full details of how we conducted our benchmarking. Every benchmark was run under the following conditions:

  • All databases and benchmark machine resources were run from within the same cloud region. For all but the Google products, this meant running in us-east-1. For Google, us-central1.
  • Except for the Latency benchmarks, we make no guarantees down to the availability-zone level. Not all platforms let you specify (or even see) which AZ a database node resides in, so it is impractical to guarantee this for every provider.
  • All AWS-based benchmarks were run from a c6a.xlarge (4 vCPUs, 8 GB Memory) in us-east-1. All GCP-based benchmarks were run from an e2-standard-4 (4 vCPUs, 16 GB Memory) in us-central1.
  • All Postgres configuration options are left at each platform's defaults. The one exception is connection limits and timeouts, which we may adjust to facilitate benchmarking.
  • For the TPC-C benchmark, the data was generated using the Percona TPC-C scripts with TABLES=20 and SCALE=250. This produces a ~500 gigabyte Postgres database. We provide instructions to reproduce this yourself, and a rough sketch of the invocations appears after this list.
  • The OLTP benchmark uses sysbench's built-in oltp_read_only workload. We provide instructions to reproduce this yourself as well.
  • The latency benchmark is quite simple: we run SELECT 1; 200 times in a row and measure each query's round-trip time. You can easily write scripts to measure such latencies, like the sketch shown earlier in this post.
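For those who want to reproduce the TPC-C load and the read-only run, here is a rough Python sketch of how the invocations could be driven. It assumes the Percona-Lab sysbench-tpcc scripts and sysbench are installed; the connection parameters, thread counts, table sizes, and durations shown are placeholders rather than the exact values behind our published results:

```python
import subprocess

# Placeholder connection parameters -- substitute your own database's values.
PG = [
    "--db-driver=pgsql",
    "--pgsql-host=db.example.com",
    "--pgsql-user=bench",
    "--pgsql-password=secret",
    "--pgsql-db=benchmark",
]

# Load the TPC-C dataset (~500 GB at TABLES=20, SCALE=250) using the
# Percona-Lab sysbench-tpcc scripts (assumes ./tpcc.lua is in the working directory).
subprocess.run(
    ["./tpcc.lua", *PG, "--tables=20", "--scale=250", "--threads=32",
     "--use_fk=0", "prepare"],
    check=True,
)

# Run sysbench's built-in oltp_read_only workload. The sbtest tables must first
# be created with a matching `prepare` step; sizes and durations here are illustrative.
subprocess.run(
    ["sysbench", "oltp_read_only", *PG, "--tables=10", "--table-size=10000000",
     "--threads=32", "--time=300", "--report-interval=10", "run"],
    check=True,
)
```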

Additional details are provided in each of the results pages, linked earlier.

An invitation

We have no intention of misrepresenting others. Our aim is to show the world the excellent performance to be gained by running Postgres on PlanetScale Metal. In our testing, Postgres on PlanetScale Metal is by far the most performant option.

We invite other vendors to provide feedback. If you see anything wrong in our benchmarking methodology, let us know at benchmarks@planetscale.com.