
Benchmarking Postgres 17 vs 18

By Ben Dicken

Postgres 18 was released a few weeks ago, and there's plenty of hype around the improvements it brings. Most notably, Postgres 18 introduces the io_method configuration option, giving users more control over how disk I/O is handled.

Setting this to sync preserves the behavior of 17 and earlier versions: all I/O happens via synchronous requests.

Postgres 18 introduces two alternatives: worker and io_uring. worker (the new default) has Postgres use dedicated background worker processes to handle I/O operations. io_uring is the change many are excited about for performance reasons, as it uses the Linux io_uring interface to issue disk reads asynchronously. The hope is that this can lead to significantly better I/O performance.
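
As a quick illustration, here's roughly how the setting is chosen in postgresql.conf (changing io_method requires a server restart, and io_uring is only available on Linux when Postgres is built with io_uring support):

# postgresql.conf (Postgres 18)
io_method = worker      # default in 18; alternatives: io_uring, sync
#io_workers = 3         # background I/O worker count; only used with io_method = worker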

We conducted detailed benchmarks comparing the performance of Postgres 17 and 18. Let's see if these improvements are all they're hyped up to be.

Benchmark configurations

sysbench was used for executing the benchmarks. The io_uring improvements only apply to reads, so the focus here is the oltp_read_only benchmark. This includes both point selects and queries that do range scans and aggregations, with the range length controlled by the --range_size argument. Though it would be interesting to benchmark write and read/write performance as well, sticking to read-only keeps this discussion focused. We set TABLES=100 and SCALE=13000000, which produces a ~300 GB database (100 tables with 13 million rows each).
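
For reference, a rough sketch of the data-load step. The post specifies the size as TABLES=100 and SCALE=13000000; the sysbench flags below are our mapping of those values, and the connection details are placeholders:

sysbench oltp_read_only \
  --db-driver=pgsql \
  --pgsql-host=127.0.0.1 --pgsql-user=sbtest --pgsql-db=sbtest \
  --tables=100 --table-size=13000000 \
  prepare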

The benchmarks were conducted on four different EC2 instance configurations:

Instance       vCPUs   RAM      Disk       Disk type   IOPS      Throughput
r7i.2xlarge    8       64 GB    700 GB     gp3         3,000     125 MB/s
r7i.2xlarge    8       64 GB    700 GB     gp3         10,000    500 MB/s
r7i.2xlarge    8       64 GB    700 GB     io2         16,000    -
i7i.2xlarge    8       64 GB    1,875 GB   NVMe        300,000   -

All these instances run on the same (or extremely similar) Intel CPUs. We include the i7i instance to show what Postgres is capable of with fast, local NVMe drives. This is the hardware we use for PlanetScale Metal, where we have seen amazing performance results for Postgres 17. Many other cloud providers only offer a form of network-attached storage, whereas we offer both options.

Each server is warmed with 10 minutes of query load prior to benchmarking.

On each of these instance configurations, we ran the sysbench oltp_read_only benchmark for 5 minutes with each of the following settings (see the run-command sketch after this list):

  • Single connection and --range_size = 100
  • 10 connections and --range_size = 100
  • 50 connections and --range_size = 100
  • Single connection and --range_size = 10,000
  • 10 connections and --range_size = 10,000
  • 50 connections and --range_size = 10,000
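
A rough sketch of a single run, using the same placeholder connection details as the load step (shown here: 50 connections, --range_size=100, a 5-minute duration, and a 10-second reporting interval):

sysbench oltp_read_only \
  --db-driver=pgsql \
  --pgsql-host=127.0.0.1 --pgsql-user=sbtest --pgsql-db=sbtest \
  --tables=100 --table-size=13000000 \
  --threads=50 --range_size=100 \
  --time=300 --report-interval=10 \
  run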

This leads to a total of 24 unique 5-minute benchmark runs. These 24 configurations were run four times each: once on Postgres 17, and once each on Postgres 18 with io_method=worker, io_method=io_uring, and io_method=sync. This makes a total of 96 benchmark runs. Go ahead, tell me I'm crazy!

This is an extremely I/O-intensive workload. The data size (300 GB) far exceeds the RAM size (64 GB), so the queries executed here involve significant disk access.

Single connection

Though a single connection is an unrealistic production workload, it offers a baseline for how the different I/O settings affect straight-line performance. Let's assess the QPS we can achieve here.

Below is the average QPS for all single-connection runs where --range_size is set to its default of 100. This means the read workload is a mix of point-select queries and queries that scan / aggregate 100-row ranges.
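
For context, each oltp_read_only transaction is built from query shapes roughly like the following (by default, ten point selects plus one of each range query; names follow sysbench's sbtest schema, and the specific ranges are illustrative):

SELECT c FROM sbtest1 WHERE id = 5001;                            -- point select
SELECT c FROM sbtest1 WHERE id BETWEEN 5001 AND 5100;             -- simple range scan
SELECT SUM(k) FROM sbtest1 WHERE id BETWEEN 5001 AND 5100;        -- range sum
SELECT c FROM sbtest1 WHERE id BETWEEN 5001 AND 5100 ORDER BY c;  -- ordered range
SELECT DISTINCT c FROM sbtest1 WHERE id BETWEEN 5001 AND 5100;    -- distinct range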

QPS benchmark results on 1 connection and scan=100

A few things are clear:

  1. On network-attached storage (gp3, io2), Postgres 18 in sync and worker modes performs noticeably better than both 17 and 18 with io_uring. I'll admit, this surprised me! My expectation was that io_uring would perform as well as, if not better than, all of these options.
  2. The latency of gp3, and even io2, is clearly a factor in this difference. On an instance with a low-latency local NVMe drive, all options are much more evenly matched.
  3. The latency / IOPS of gp3 and even the very expensive io2 drive are a limiting factor. Local disks outperform in all configurations.
  4. For straight-line performance with short-lived queries, Postgres 18 is much faster. Welcome improvements!

Here is the same test except with --range_size=10000. The workload now has much larger scans / aggregates, which means more sequential I/O and lower QPS:

QPS benchmark results on 1 connection and scan=10000

Local disks still clearly outperform, but the difference between the other three options is less stark. This is due to a combination of (a) more sequential I/O and, more importantly, (b) more CPU work (aggregating 10k rows is more CPU-intensive than aggregating 100 rows). Additionally, the delta between Postgres 17 and 18 is much smaller.

Below is an interactive visual comparing the Postgres 17 results for all instance types with the best performer on Postgres 18, workers. Click on the thumbnails to add or remove lines from the graph and compare various combinations. This is based on a 10-second sample rate.

High concurrency

In real-world scenarios, we have many simultaneous connections and many reads happening at once. Let's look at how each of these servers handles the same benchmark but with much higher load across 50 connections.

Of course, oltp_read_only does not capture a realistic OLTP workload, especially since it does not include writes, but we use it as a proxy for workloads with high read demands. Below, we show the average QPS for all of the 50-connection oltp_read_only runs with --range_size=100.

QPS benchmark results on 50 connections and scan=100

Now with a high level of parallelism and increased I/O demand, several additional things are clear:

  1. IOPS and throughput are clear bottlenecks for each of the EBS-backed instances. The different versions / I/O settings don't make a huge difference in such cases.
  2. As we increase the EBS capabilities, the QPS grows in lockstep, and the local-NVMe instance outperforms them all.
  3. Postgres 18 with sync and worker has the best performance on all of the EBS-backed instances, by a small margin.

Again, the same benchmark but with --range_size=10000.

QPS benchmark results on 50 connections and scan=10000

The gp3-10k and io2-16k instances get much closer to local-disk performance. However, this is because we have made the benchmark much more CPU-bound than I/O-bound, so the low latency of local disks gives less of an advantage (though it is still the best!). But importantly, we finally have a scenario where io_uring wins! On the NVMe instance, it slightly outperforms the other options.

Below we again compare these results for Postgres 17 and Postgres 18, workers. Click on the thumbnails to add or remove lines from the graph and compare various combinations.

Moderate concurrency

These same benchmarks were also executed with 10 concurrent connections. The results are pretty similar so they will not all be shown, but I do want to point out this graph where --range_size=100:

QPS benchmark results on 10 connections and scan=100

Look carefully at the first bar group (for gp3-3k). The io_uring setting performed significantly worse than the rest. But if you look at that same part of the graph when there were 50 connections, io_uring performs only slightly worse than the rest. To me, this indicates that io_uring performs well when there's lots of I/O concurrency, but in low-concurrency scenarios it isn't as beneficial.

Cost

Cost should always be a consideration when comparing infrastructure setups. Here are the on-demand costs of each server configuration in AWS:

  • r7i with gp3 3k IOPS and 125 MB/s: $442.32/mo
  • r7i with gp3 10k IOPS and 500 MB/s: $492.32/mo
  • r7i with io2 16k IOPS: $1,513.82/mo
  • i7i with local NVMe (no EBS): $551.15/mo

And keep in mind, the first three only have 700 GB of storage, whereas the i7i has a 1.8 TB volume! The server with a local NVMe disk is the clear price-performance winner.

Why isn't io_uring the winner?

Given my excitement over the new io_uring capabilities of Postgres 18, I was expecting it to win in many more scenarios. So what's going on here?

For one, this is a very specific type of workload. It is read-only, and does a combination of point selects, range scans, and range aggregations. io_uring surely has other workloads where it would shine. It's also possible that with different postgresql.conf tunings, we'd see improvements from io_uring.

While writing this, I stumbled across Tomas Vondra's excellent blog discussing the new io_method options, how to tune them, and the pros and cons of each. He makes several good points regarding why workers would outperform io_uring, and I recommend you read it. In short:

  • Index scans don't (yet) use AIO.
  • Though the I/O happens in the background with io_uring, the checksums / memcpy can still be a bottleneck.
  • workers allows better parallelism for I/O from the perspective of a single process.

So there are legitimate cases where io_uring won't perform better! I'd love to see further benchmarks from others, spanning other workload types and configurations. You can find many of the configs used for these tests in the appendix.

Conclusions

Though narrow, this was a fun experiment to compare performance of Postgres versions and I/O settings. My key takeaways are:

  • Postgres 18 brings nice I/O improvements and configuration flexibility. Great job to the maintainer team!
  • Local disks are the clear winner. When you have low-latency I/O and immense IOPS, the rest matters less. This is why PlanetScale Metal makes for best-in-class database performance.
  • Using io_method=worker was a good choice as the new default. It comes with a lot of the "asynchronous" benefits of io_uring without relying on that specific kernel interface, and can be tuned by setting io_workers=X (see the sketch after this list).
  • There's no one-size-fits-all best I/O configuration.
  • Though network-attached storage scenarios do benefit, the new worker I/O configuration doesn't help them as much as one might hope.
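
As a minimal sketch of that tuning (the value below is purely illustrative; the runs in this post left io_workers at its default of 3, per the appendix):

io_method = worker
io_workers = 8    # illustrative value; the default is 3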

What else do you want to see benchmarked? Reach out to let us know.

Appendix: configuration

Here is a selection of the critical custom-tuned Postgres settings used for this benchmark:

shared_buffers = 16GB          # 25% of RAM
effective_cache_size = 48GB    # 75% of RAM
work_mem = 64MB
maintenance_work_mem = 2GB

wal_level = replica
max_wal_size = 16GB
min_wal_size = 2GB
wal_buffers = 16MB
checkpoint_completion_target = 0.9

random_page_cost = 1.1
effective_io_concurrency = 200
default_statistics_target = 100

max_worker_processes = 8
max_parallel_workers_per_gather = 4
max_parallel_workers = 8
max_parallel_maintenance_workers = 4

bgwriter_delay = 200ms
bgwriter_lru_maxpages = 100
bgwriter_lru_multiplier = 2.0

autovacuum = on
autovacuum_max_workers = 4
autovacuum_naptime = 10s
autovacuum_vacuum_scale_factor = 0.05
autovacuum_analyze_scale_factor = 0.025

logging_collector = on
...more log configs...

shared_preload_libraries = 'pg_stat_statements'
track_activity_query_size = 2048
track_io_timing = on

jit = on

# io_workers left at default = 3