We've seen that as Rails applications grow, there are a few common issues that teams run into with deleting data.
In this post, you'll learn a few strategies you can use to mitigate the risks of cleaning up data on a high scale Rails application.
Rails applications at scale generally run into issues when deleting many records at once. This happens most commonly in models with many associations.
The standard way to delete associated data in Rails is to let ActiveRecord handle it via
dependent: :destroy. In the following example, when the parent model (
author) is deleted, all data in the dependent models will get deleted by ActiveRecord as well.
class Author < ApplicationRecord has_many :books, dependent: :destroy end class Book < ApplicationRecord belongs_to :author end
The database schema looks like this:
ActiveRecord::Schema[7.1].define(version: 2022_06_06_171750) do create_table "authors", force: true do |t| t.string "name" t.datetime "created_at" t.datetime "updated_at" end create_table "books", force: true do |t| t.string "name" t.text "description" t.bigint "author_id", null: false t.datetime "created_at" t.datetime "updated_at" t.index ["author_id"], name: "index_books_on_author_id" end end
There is an indexed foreign key, but no foreign key constraint. ActiveRecord is responsible for deleting the data.
Now that we've covered the typical way to delete associated data, let's look at some tips to improve this.
As of Rails 6.1,
dependent: :destroy_async was added to ActiveRecord. It works similarly to
dependent: :destroy, except that it will run the deletion via a background job rather than happening in request.
This protects you from triggering a large number of deletes within a single transaction. As a Rails application grows, it can be very easy to unintentionally delete a parent record and trigger a cascade of thousands of deletions. Having all of this happen within a request can lead to timeouts and added strain on your database.
Replace any usage of Foreign Key
ON DELETE CASCADE
Foreign key constraints are used in databases to manage referential integrity between tables. Specifically, developers will use
ON DELETE CASCADE to delete all associated records when the parent record is deleted. This is an option some Rails applications will use rather than the standard
This works well when the child data is limited. It becomes a problem when deleting a large number of child records. A simple delete can suddenly turn into a massive operation deleting thousands of records across multiple tables. This results in the users
DELETE request taking several seconds to respond or timing out. In the database, this can lead to excessive locking, increase replication lag, and more issues that will have impact on other parts of the application.
We recommend replacing any usage of foreign key constraints with
:destroy_async for safer deletes at scale.
Look out for failing validations
One issue to look out for with
destroy_async is the risk of validations failing in a child model when deleting data. Since it's happening asynchronously, the user will be unaware of any errors and the job will end up in your error queue. If any child records have validations on delete, we recommend running them from the parent model. This will stop the deletion from occurring and alert the user of the issue. This is an important area to add test coverage to protect from any regressions.
We have two primary methods for deleting data,
destroy, as well as their related
destroy_all on ActiveRecord relations.
destroy— Deletes the record while also triggering the models callbacks
delete— Skips the callbacks and deletes the record directly from the database
If you have callbacks setup, then you'll generally want to always use
destroy so that they are called. It's important though to be aware of all the activity that could be caused by those callbacks, especially when destroying a large number of records. For example, a cron job for cleaning up old data would be better suited for using
delete_all to skip callbacks.
When there is no use for data anymore, it's a common practice to archive or delete it.
For large busy tables, deleting a large number of records at once can lock the table and have unintended consequences to the rest of the application. The safe way is to continuously run deletes in small batches.
Here is an example Sidekiq job that can be scheduled by a cron to run once per hour:
# frozen_string_literal: true class DeleteOldDataJob < BaseJob # We only want 1 instance of this job running at a time sidekiq_options unique_for: 1.hour, unique_until: :start, queue: :background, retry: false def perform(limit: 500) # Deletes 500 records deleted = Model.where("created_at < ?", 3.months.ago).limit(limit).delete_all # If more records to delete, requeue itself and run again if deleted == limit self.class.perform_async end end end
This example is making use of Sidekiq's unique jobs. This protects us from having several of these jobs running concurrently (which could result in deadlocks). If you are using a job system without uniqueness, an alternative is setting up a queue with a concurrency of 1 and running the cleanup job there.
How to test it
This job is a good place to add test coverage to ensure you're deleting the correct data. Here's an example pattern you can use.
# frozen_string_literal: true require "test_helper" class DeleteOldDataJobTest < ActiveJob::TestCase test "deletes data over 3 months old" do expired = create(:data, minute: 3.months.ago - 1.hour) retained = create(:data, minute: 3.months.ago + 1.hour) DeleteOldDataJob.new.perform assert Data.where(id: expired.id).empty? assert Data.where(id: retained.id).exists? end test "requeues if more to delete" do create(:data, minute: 3.months.ago - 1.hour) create(:data, minute: 3.months.ago - 1.hour) assert_enqueued_sidekiq_jobs(1, only: DeleteOldDataJob) do DeleteOldDataJob.new.perform(limit: 1) end end end
If you do make a mistake while deleting data or making schema changes, PlanetScale offers some solutions. With Rewind, you can instantly revert a bad schema change. We also throw a warning in the dashboard if you're dropping a table that was recently queried, so you can hopefully catch any mistakes before they happen.
To get started with Rails and PlanetScale, check out the Rails quickstart.