
Postgres Imports - AWS DMS

AWS Database Migration Service (DMS) is a managed service for complex database migrations, with built-in monitoring, error handling, and data validation. It is well suited to large or complex databases that need more robust tooling than a one-shot dump and restore.

Overview

This migration method involves:

  1. Setting up AWS DMS replication instance and endpoints
  2. Configuring source and target database connections
  3. Creating and running migration tasks with full load and change data capture (CDC)
  4. Monitoring migration progress and performing cutover

Note

This method requires an AWS account and will incur AWS DMS charges. Review AWS DMS pricing before proceeding.

Prerequisites

Before starting the migration:

  • Active AWS account with appropriate DMS permissions
  • Source PostgreSQL database accessible from AWS (consider VPC configuration)
  • Connection details for your PlanetScale for Postgres database from the console
  • Understanding of your data transformation requirements (if any)
  • Network connectivity between AWS and both source and target databases

Step 1: Set Up AWS DMS

Create DMS Replication Instance

  1. Navigate to the AWS DMS Console
  2. Click "Create replication instance"
  3. Configure the instance:
Name: planetscale-postgres-migration
Description: Migration to PlanetScale for Postgres
Instance class: dms.t3.medium (adjust based on your needs)
Engine version: Latest available
VPC: Select appropriate VPC
Multi-AZ: No (for cost savings, Yes for production)
Publicly accessible: Yes (if needed for connectivity)
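
If you script your infrastructure, the same instance can be created with the AWS CLI. A minimal sketch using the example values above; the storage size is an assumption to adjust for your data volume:

aws dms create-replication-instance \
    --replication-instance-identifier planetscale-postgres-migration \
    --replication-instance-class dms.t3.medium \
    --allocated-storage 50 \
    --no-multi-az \
    --publicly-accessible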

Configure Security Groups

Ensure your replication instance can connect to:

  • Source PostgreSQL database (port 5432)
  • PlanetScale for Postgres (port 5432)
  • Outbound internet access, if PlanetScale is reached over its public endpoint

Step 2: Create Source Endpoint

Configure PostgreSQL source endpoint:

  1. In DMS Console, go to "Endpoints" > "Create endpoint"
  2. Configure source endpoint:
Endpoint type: Source endpoint
Endpoint identifier: postgres-source
Source engine: postgres
Server name: your-postgres-host
Port: 5432
Database name: your-database-name
Username: your-username
Password: your-password

Advanced settings for PostgreSQL:

Extra connection attributes:
pluginName=test_decoding;
slotName=dms_slot_planetscale;
captureDDLs=false;
maxFileSize=32768;
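
CDC requires logical decoding on the source database. Before going further, a quick sanity check with psql, using the placeholder connection values above:

# wal_level must be 'logical'; the slot and sender limits must leave headroom for DMS
psql "host=your-postgres-host dbname=your-database-name user=your-username" \
    -c "SHOW wal_level;" \
    -c "SHOW max_replication_slots;" \
    -c "SHOW max_wal_senders;"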

Step 3: Create Target Endpoint

Configure PlanetScale for Postgres target endpoint:

  1. Create target endpoint with PlanetScale connection details:
Endpoint type: Target endpoint
Endpoint identifier: planetscale-target
Target engine: postgres
Server name: [from PlanetScale console]
Port: [from PlanetScale console]
Database name: [from PlanetScale console]
Username: [from PlanetScale console]
Password: [from PlanetScale console]

SSL Configuration:

SSL mode: require
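
The equivalent AWS CLI call, as a sketch; substitute the real connection values from your PlanetScale console for the placeholders:

aws dms create-endpoint \
    --endpoint-identifier planetscale-target \
    --endpoint-type target \
    --engine-name postgres \
    --server-name your-planetscale-host \
    --port 5432 \
    --database-name your-database-name \
    --username your-username \
    --password your-password \
    --ssl-mode require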

Step 4: Test Endpoints

  1. Select your source endpoint and click "Test connection"
  2. Select your target endpoint and click "Test connection"
  3. Ensure both tests pass before proceeding
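
The same tests can be run from the CLI; the ARNs below are placeholders for the resources you created:

aws dms test-connection \
    --replication-instance-arn your-instance-arn \
    --endpoint-arn your-endpoint-arn

# Poll for the result
aws dms describe-connections \
    --filters Name=endpoint-arn,Values=your-endpoint-arn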

Step 5: Create Migration Task

Configure the migration task:

  1. Go to "Database migration tasks" > "Create task"
  2. Configure task settings:
Task identifier: postgres-to-planetscale
Replication instance: planetscale-postgres-migration
Source database endpoint: postgres-source
Target database endpoint: planetscale-target
Migration type: Migrate existing data and replicate ongoing changes

Task Settings

{
  "TargetMetadata": {
    "TargetSchema": "",
    "SupportLobs": true,
    "FullLobMode": false,
    "LobChunkSize": 0,
    "LimitedSizeLobMode": true,
    "LobMaxSize": 32,
    "InlineLobMaxSize": 0,
    "LoadMaxFileSize": 0,
    "ParallelLoadThreads": 0,
    "ParallelLoadBufferSize": 0,
    "BatchApplyEnabled": false,
    "TaskRecoveryTableEnabled": false,
    "ParallelApplyThreads": 0,
    "ParallelApplyBufferSize": 0,
    "ParallelApplyQueuesPerThread": 0
  },
  "FullLoadSettings": {
    "TargetTablePrepMode": "DROP_AND_CREATE",
    "CreatePkAfterFullLoad": false,
    "StopTaskCachedChangesApplied": false,
    "StopTaskCachedChangesNotApplied": false,
    "MaxFullLoadSubTasks": 8,
    "TransactionConsistencyTimeout": 600,
    "CommitRate": 10000
  },
  "Logging": {
    "EnableLogging": true,
    "LogComponents": [
      {
        "Id": "TRANSFORMATION",
        "Severity": "LOGGER_SEVERITY_DEFAULT"
      },
      {
        "Id": "SOURCE_UNLOAD",
        "Severity": "LOGGER_SEVERITY_DEFAULT"
      },
      {
        "Id": "TARGET_LOAD",
        "Severity": "LOGGER_SEVERITY_DEFAULT"
      }
    ]
  },
  "ControlTablesSettings": {
    "historyTimeslotInMinutes": 5,
    "ControlSchema": "",
    "HistoryTimeslotInMinutes": 5,
    "HistoryTableEnabled": false,
    "SuspendedTablesTableEnabled": false,
    "StatusTableEnabled": false,
    "FullLoadExceptionTableEnabled": false
  }
}

Step 6: Configure Table Mappings

Basic table mapping (migrate all tables):

{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "1",
      "object-locator": {
        "schema-name": "public",
        "table-name": "%"
      },
      "rule-action": "include",
      "filters": []
    }
  ]
}

Advanced table mapping with a schema rename transformation (replace new_schema_name with the schema name you want on the target):

{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "1",
      "object-locator": {
        "schema-name": "public",
        "table-name": "%"
      },
      "rule-action": "include"
    },
    {
      "rule-type": "transformation",
      "rule-id": "2",
      "rule-name": "2",
      "rule-target": "schema",
      "object-locator": {
        "schema-name": "public"
      },
      "rule-action": "rename",
      "value": "public"
    }
  ]
}
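
If you save the task settings and table mappings above as local JSON files, the task can also be created from the CLI. A sketch with placeholder file names and ARNs:

aws dms create-replication-task \
    --replication-task-identifier postgres-to-planetscale \
    --source-endpoint-arn your-source-endpoint-arn \
    --target-endpoint-arn your-target-endpoint-arn \
    --replication-instance-arn your-instance-arn \
    --migration-type full-load-and-cdc \
    --table-mappings file://table-mappings.json \
    --replication-task-settings file://task-settings.json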

Step 7: Start Migration Task

  1. Review all task configurations
  2. Click "Create task" to start the migration
  3. Monitor the task status in the DMS console
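
If you created the task without starting it, you can start it from the CLI (the ARN is a placeholder):

aws dms start-replication-task \
    --replication-task-arn your-task-arn \
    --start-replication-task-type start-replication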

Step 8: Monitor Migration Progress

Key metrics to monitor:

  • Full load progress: Percentage of tables loaded
  • CDC lag: Latency between source and target
  • Error count: Any migration errors
  • Throughput: Records per second

Using CloudWatch:

Set up CloudWatch alarms for:

  • High CDC latency
  • Migration errors
  • Task failures

# CLI command to check task status
aws dms describe-replication-tasks \
    --filters Name=replication-task-id,Values=your-task-id
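
Per-table progress comes from the task's table statistics, and CDC latency can be pulled from CloudWatch. The dimension values below are assumptions based on the identifiers used earlier, and the date arithmetic uses GNU date syntax:

# Per-table load progress and error counts
aws dms describe-table-statistics \
    --replication-task-arn your-task-arn

# Average source-capture latency over the last hour
aws cloudwatch get-metric-statistics \
    --namespace AWS/DMS \
    --metric-name CDCLatencySource \
    --dimensions Name=ReplicationInstanceIdentifier,Value=planetscale-postgres-migration \
                 Name=ReplicationTaskIdentifier,Value=your-task-resource-id \
    --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
    --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    --period 60 \
    --statistics Average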

Step 9: Verify Data Migration

Check table counts and data integrity:

-- Run on both source and target databases
-- Note: pg_stat_user_tables holds statistics estimates; use count(*) for exact totals
SELECT
    schemaname,
    relname AS tablename,
    n_live_tup AS estimated_live_rows,
    n_tup_ins AS inserted_rows,
    n_tup_upd AS updated_rows,
    n_tup_del AS deleted_rows
FROM pg_stat_user_tables
ORDER BY schemaname, relname;

Validate specific data:

-- Compare checksums for critical tables (run on both databases)
SELECT count(*),
       md5(string_agg(t::text, '' ORDER BY primary_key))
FROM your_important_table AS t;
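
To compare counts mechanically, run the same query against both databases and diff the output. A minimal sketch assuming psql connectivity to both; the connection strings are placeholders:

SOURCE="host=your-postgres-host dbname=your-database-name user=your-username"
TARGET="host=your-planetscale-host dbname=your-database-name user=your-username sslmode=require"
QUERY="SELECT relname, n_live_tup FROM pg_stat_user_tables ORDER BY relname;"

# Empty output means the estimated counts match; verify any differences with count(*)
diff <(psql "$SOURCE" -At -c "$QUERY") <(psql "$TARGET" -At -c "$QUERY")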

Step 10: Prepare for Cutover

Monitor CDC lag:

Ensure CDC latency is minimal (under 5 seconds) before cutover. Watch the CDCLatencySource and CDCLatencyTarget metrics for the task in the DMS console or CloudWatch (see the CLI example in Step 8).

If you enabled data validation on the task, also check for validation failures:

-- Check DMS validation failures (this table exists only when validation is enabled)
SELECT * FROM awsdms_validation_failures_v1;

Test application connectivity:

  1. Create a read-only connection to PlanetScale for Postgres
  2. Test critical application queries
  3. Verify performance is acceptable
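
A quick read-only smoke test from the shell, with a placeholder connection string and table name:

psql "postgresql://your-username:your-password@your-planetscale-host:5432/your-database-name?sslmode=require" \
    -c "SELECT count(*) FROM your_important_table;"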

Step 11: Perform Cutover

When ready to switch to PlanetScale for Postgres:

  1. Stop application writes to source database
  2. Wait for CDC to catch up (monitor lag in DMS console)
  3. Verify final data consistency
  4. Update application connection strings to point to PlanetScale
  5. Start application with new connections
  6. Stop DMS task once satisfied with cutover

Stop the migration task:

aws dms stop-replication-task \
    --replication-task-arn arn:aws:dms:region:account:task:task-id

Step 12: Cleanup

After successful cutover:

  1. Delete DMS task
  2. Delete replication instance (to stop charges)
  3. Remove source and target endpoints
  4. Clean up security groups if created specifically for migration

# Cleanup commands
aws dms delete-replication-task --replication-task-arn your-task-arn
aws dms delete-replication-instance --replication-instance-arn your-instance-arn
aws dms delete-endpoint --endpoint-arn your-source-endpoint-arn
aws dms delete-endpoint --endpoint-arn your-target-endpoint-arn

Troubleshooting

Common Issues:

Connectivity problems:

  • Check security groups and network ACLs
  • Verify endpoint configurations
  • Test network connectivity from replication instance

Performance issues:

  • Increase replication instance size
  • Adjust parallel load settings
  • Monitor source database performance

Data type mapping and LOB issues:

DMS treats large or unbounded PostgreSQL types such as text, bytea, json, and jsonb as LOBs, so the limited LOB mode in the task settings above (LobMaxSize: 32) truncates values over 32 KB. If you see truncated data, switch the task to full LOB mode:

{
  "TargetMetadata": {
    "SupportLobs": true,
    "FullLobMode": true,
    "LobChunkSize": 32768,
    "LimitedSizeLobMode": false
  }
}

Performance Optimization:

  1. Parallel loading: Increase MaxFullLoadSubTasks
  2. Batch apply: Enable for better target performance
  3. Memory allocation: Increase replication instance size
  4. Network optimization: Use placement groups for better network performance
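
Task settings can be changed without recreating the task, but only while it is stopped. A sketch of raising the parallel load and batch apply settings from the CLI; the values are illustrative:

aws dms stop-replication-task --replication-task-arn your-task-arn

aws dms modify-replication-task \
    --replication-task-arn your-task-arn \
    --replication-task-settings '{
        "TargetMetadata": { "BatchApplyEnabled": true },
        "FullLoadSettings": { "MaxFullLoadSubTasks": 16 }
    }'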

Cost Optimization

  • Instance sizing: Start with smaller instances and scale up if needed
  • Multi-AZ: Disable for dev/test migrations
  • Task lifecycle: Delete resources immediately after successful migration
  • Data transfer: Consider AWS region placement to minimize transfer costs

Schema Considerations

Before migration, review:

  • Tables without primary keys: DMS needs a primary key or unique index for reliable CDC (see the query below)
  • Sequences: DMS does not migrate sequence values, so reset them on the target after cutover
  • Secondary indexes, foreign keys, and triggers: with DROP_AND_CREATE table prep, DMS creates only tables and primary keys, so apply the rest of the schema separately
  • Extensions: confirm that any PostgreSQL extensions your schema depends on are available in PlanetScale for Postgres
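
For the primary key review, a catalog query against the source, run here through psql with the placeholder connection values used earlier:

# List ordinary tables in the public schema that have no primary key
psql "host=your-postgres-host dbname=your-database-name user=your-username" -c "
    SELECT n.nspname, c.relname
    FROM pg_class c
    JOIN pg_namespace n ON n.oid = c.relnamespace
    WHERE c.relkind = 'r'
      AND n.nspname = 'public'
      AND NOT EXISTS (
          SELECT 1 FROM pg_constraint con
          WHERE con.conrelid = c.oid AND con.contype = 'p'
      );"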

Next Steps

After successful migration:

  1. Set up monitoring and alerting for PlanetScale for Postgres
  2. Optimize application performance with the new database
  3. Plan for ongoing maintenance and backup strategies
  4. Consider implementing additional PlanetScale features

For simpler migrations, consider pg_dump/restore or WAL streaming methods.

If you encounter issues during migration, please reach out to support for assistance.
