The PlanetScale Discovery Tool analyzes your existing MySQL-compatible database and cloud infrastructure to help plan your migration to PlanetScale Vitess. It collects metadata about your database configuration, schema structure, performance characteristics, replication topology, security settings, feature usage, and cloud resources. It never reads or stores actual table data. The tool produces a structured JSON report that PlanetScale uses to provide migration guidance tailored to your environment.
The Discovery CLI also supports PostgreSQL discovery. See the Postgres Discovery Tool guide for PlanetScale Postgres-specific details.
The discovery tool is open source and available on GitHub. The documentation below covers the essentials. See the full documentation in the repository for advanced usage, troubleshooting, and detailed reference.

What it discovers

Database analysis:
  • MySQL version, distribution, configuration, server variables, and cloud platform
  • Schema structure, including databases, tables, columns, indexes, constraints, views, stored routines, triggers, and partitioning
  • Performance statistics such as global status counter rates, process list summaries, InnoDB lock counters, and deadlock detection
  • Replication configuration, including replica status, binary log inventory, binary log retention settings, and binary log format
  • Security configuration, including SSL/TLS status, authentication plugin distribution, password policy settings, and aggregate privilege summaries
  • Feature usage: full-text indexes, geospatial data types, foreign key constraints, table partitioning, InnoDB compression, XA transactions, prepared statements, Galera Cluster, and more
Cloud infrastructure analysis:
  • Database instances, clusters, and their configurations
  • RDS instances, Aurora clusters, and Cloud SQL instances
  • VPC networking, subnets, security groups, firewall rules, and private connectivity
  • Performance metrics from cloud monitoring services
  • High availability and replica configurations

Installation

The discovery tool requires Python 3.9 or later.
1. Download and extract

Download the latest release from GitHub and extract it:
tar -xzf ps-discovery-*.tar.gz
cd ps-discovery

2. Run setup

The setup script verifies your Python version, creates a virtual environment, installs dependencies, and prompts you to install MySQL and cloud provider support:
./setup.sh

3. Configure credentials

Copy the sample configuration file and edit it to include your database and cloud provider credentials:
cp sample-config.yaml config.yaml
At a minimum, you need to configure your MySQL connection. See Configuration below for the full format.
Alternatively, you can install with pipx for a cleaner setup:
# Install with MySQL support
pipx install -e ".[mysql]"

# Or install with MySQL, AWS, and GCP support
pipx install -e ".[mysql,aws,gcp]"
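The guide states a minimum of Python 3.9. If you take the pipx route, the version check described for setup.sh may not have run, so a quick check like the sketch below can confirm the interpreter first. The helper name python_is_supported is hypothetical and is not part of the discovery tool.

```python
import sys

MIN_VERSION = (3, 9)  # minimum Python version stated in this guide


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True when the (major, minor) version meets the documented minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if python_is_supported() else "too old for the discovery tool"
    print(f"Python {found}: {status}")
```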

Database user setup

Create a dedicated read-only user for the discovery tool. Connect to your MySQL-compatible database as a privileged user and run the following:
-- Create a dedicated user for database discovery
CREATE USER 'planetscale_discovery'@'%' IDENTIFIED BY 'secure_password_here';

-- Grant read access for schema analysis
GRANT SELECT ON *.* TO 'planetscale_discovery'@'%';

-- Grant process privilege for performance analysis and SHOW PROCESSLIST
GRANT PROCESS ON *.* TO 'planetscale_discovery'@'%';

-- Grant replication client for binary log and replica status analysis
GRANT REPLICATION CLIENT ON *.* TO 'planetscale_discovery'@'%';

-- Apply changes
FLUSH PRIVILEGES;
On Amazon RDS, Aurora MySQL, Google Cloud SQL for MySQL, MariaDB, and Percona Server, create the user while connected as your administrative database user. Some managed services restrict access to certain system tables. In that case, the discovery tool reports the gaps and continues with the data it can collect.
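The statements above use a placeholder password. As a minimal sketch, the following renders the same CREATE USER and GRANT statements with a randomly generated password. The function name build_user_sql and its host parameter are hypothetical; only the user name and the three privileges come from this guide.

```python
import secrets

DISCOVERY_USER = "planetscale_discovery"  # user name used in this guide
PRIVILEGES = ("SELECT", "PROCESS", "REPLICATION CLIENT")  # grants listed above


def build_user_sql(password, host="%"):
    """Return the CREATE USER and GRANT statements for the discovery user."""
    for value in (password, host):
        # Refuse characters that would need escaping inside a quoted SQL string.
        if "'" in value or "\\" in value:
            raise ValueError("password and host must not contain quotes or backslashes")
    account = f"'{DISCOVERY_USER}'@'{host}'"
    statements = [f"CREATE USER {account} IDENTIFIED BY '{password}';"]
    statements += [f"GRANT {priv} ON *.* TO {account};" for priv in PRIVILEGES]
    return statements


if __name__ == "__main__":
    # token_urlsafe emits only letters, digits, '-' and '_', so no escaping is needed.
    # Note that this prints the password to the terminal.
    print("\n".join(build_user_sql(secrets.token_urlsafe(24))))
```

Store the generated password somewhere safe, since the same value goes into the mysql.password field of your configuration file.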

PlanetScale and Vitess credentials

For PlanetScale databases, your existing branch credentials are sufficient. The discovery tool automatically detects PlanetScale and Vitess environments and adapts its queries.
ps-discovery database --engine mysql \
  --host aws.connect.psdb.cloud \
  --username your_branch_username \
  -W \
  --ssl-mode required
Or via config:
engine: mysql

mysql:
  host: aws.connect.psdb.cloud
  port: 3306
  username: your_branch_username
  password: your_branch_password
  ssl_mode: required
PlanetScale and Vitess-specific behavior:
  • The tool detects scoped information_schema in Vitess and automatically falls back to per-database iteration
  • System databases such as _vt, mysql, and performance_schema are excluded
  • Features not supported by Vitess are detected and reported

MySQL cleanup

After MySQL discovery is complete, remove the planetscale_discovery user from your database. This user has read access to your schema and system metadata and should not be left in place.
DROP USER IF EXISTS 'planetscale_discovery'@'%';

Configuration

The discovery tool uses a YAML configuration file. Here is an example with the most common options:
engine: mysql

modules:
  - database # Run database analysis
  - cloud    # Run cloud infrastructure analysis (optional)

mysql:
  host: your-db-host.example.com
  port: 3306
  database: "" # Leave empty to discover all databases
  username: planetscale_discovery
  password: secure_password_here
  ssl_mode: required

providers:
  aws:
    enabled: true
    regions:
      - us-east-1
  gcp:
    enabled: false

output:
  output_dir: ./mysql_discovery_output
Set database to a specific database name to focus analysis on one database:
engine: mysql

mysql:
  host: your-db-host.example.com
  port: 3306
  database: my_application_db
  username: planetscale_discovery
  password: secure_password_here
  ssl_mode: required
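Before a run, it can help to sanity-check the configuration. The sketch below assumes the YAML file has already been loaded into a Python dictionary (for example with a YAML library such as PyYAML). The function name check_mysql_config and the choice of which keys to treat as expected are assumptions based on the examples above, not the tool's own validation rules.

```python
# Keys that appear in every MySQL example in this guide (an assumption, not the tool's schema).
EXPECTED_MYSQL_KEYS = ("host", "username", "password")
PLACEHOLDER_PASSWORD = "secure_password_here"  # placeholder used in the examples


def check_mysql_config(config):
    """Return a list of readable problems; an empty list means none were found."""
    problems = []
    if config.get("engine") != "mysql":
        problems.append("engine should be 'mysql' for MySQL discovery")
    mysql = config.get("mysql") or {}
    for key in EXPECTED_MYSQL_KEYS:
        if not mysql.get(key):
            problems.append(f"mysql.{key} is missing or empty")
    port = mysql.get("port")
    if port is not None and not (isinstance(port, int) and 1 <= port <= 65535):
        problems.append("mysql.port should be an integer between 1 and 65535")
    if mysql.get("password") == PLACEHOLDER_PASSWORD:
        problems.append("mysql.password still holds the placeholder value")
    return problems


if __name__ == "__main__":
    example = {
        "engine": "mysql",
        "mysql": {
            "host": "your-db-host.example.com",
            "port": 3306,
            "database": "my_application_db",
            "username": "planetscale_discovery",
            "password": "secure_password_here",
            "ssl_mode": "required",
        },
    }
    for problem in check_mysql_config(example):
        print(problem)
```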

Running discovery

Run MySQL database-only analysis:
./ps-discovery database --engine mysql --config config.yaml
Run both database and cloud analysis:
./ps-discovery both --engine mysql --config config.yaml
Run only specific MySQL analyzers:
# Schema and features only
./ps-discovery database --engine mysql --config config.yaml \
  --analyzers schema,features

# Performance and replication only
./ps-discovery database --engine mysql --config config.yaml \
  --analyzers performance,replication
Available MySQL analyzers are config, schema, performance, replication, security, and features. The tool produces a planetscale_discovery_results.json file in your configured output directory. Share this report with PlanetScale for migration planning assistance.
Once MySQL discovery is complete, remember to clean up the planetscale_discovery user you created on your source database.
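If you script your discovery runs, a small helper can assemble the command and reject misspelled analyzer names before the tool starts. This is a sketch only: the function build_discovery_command is hypothetical, and it uses just the subcommand, flags, and analyzer names shown above.

```python
import shlex

# Analyzer names listed in this guide.
KNOWN_ANALYZERS = ("config", "schema", "performance", "replication", "security", "features")


def build_discovery_command(config_path, analyzers=None, executable="./ps-discovery"):
    """Build the argument list for a MySQL database-only run."""
    command = [executable, "database", "--engine", "mysql", "--config", config_path]
    if analyzers:
        unknown = [name for name in analyzers if name not in KNOWN_ANALYZERS]
        if unknown:
            raise ValueError(f"unknown analyzers: {', '.join(unknown)}")
        command += ["--analyzers", ",".join(analyzers)]
    return command


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(build_discovery_command("config.yaml", ["schema", "features"])))
```

The returned list can be passed to subprocess.run(command, check=True) once you are happy with it.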

Cloud provider setup

Each cloud provider requires specific credentials and permissions. Below is a summary of what you need for each. For detailed instructions including IAM policies and API enablement steps, see the provider documentation.

AWS (RDS / Aurora)

The tool discovers RDS instances, Aurora clusters, VPC networking, security groups, and CloudWatch metrics.
Authentication (choose one):
  • AWS profile
  • IAM instance profile when running on EC2
  • Access key and secret key
  • IAM role assumption for cross-account access
Required permissions:
  • RDS: DescribeDBInstances, DescribeDBClusters, DescribeDBSubnetGroups, DescribeDBClusterParameterGroups, DescribeDBParameterGroups, DescribeOptionGroups
  • EC2: DescribeVpcs, DescribeSubnets, DescribeSecurityGroups, DescribeRouteTables, DescribeInternetGateways, DescribeNatGateways, DescribeVpcEndpoints
  • CloudWatch: GetMetricStatistics, ListMetrics
  • STS: GetCallerIdentity
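The permissions above map onto IAM action names of the form service:Action. As an illustration, the sketch below assembles them into a read-only policy document. The function name build_policy, the Sid value, and the use of "Resource": "*" are assumptions; the provider documentation in the repository remains the reference for the exact policy.

```python
import json

# Actions copied from the permission list above, keyed by IAM service prefix.
REQUIRED_ACTIONS = {
    "rds": [
        "DescribeDBInstances", "DescribeDBClusters", "DescribeDBSubnetGroups",
        "DescribeDBClusterParameterGroups", "DescribeDBParameterGroups",
        "DescribeOptionGroups",
    ],
    "ec2": [
        "DescribeVpcs", "DescribeSubnets", "DescribeSecurityGroups",
        "DescribeRouteTables", "DescribeInternetGateways", "DescribeNatGateways",
        "DescribeVpcEndpoints",
    ],
    "cloudwatch": ["GetMetricStatistics", "ListMetrics"],
    "sts": ["GetCallerIdentity"],
}


def build_policy(actions=REQUIRED_ACTIONS):
    """Return an IAM policy document that allows only the listed read-only actions."""
    flat = [f"{service}:{name}" for service, names in actions.items() for name in names]
    statement = {
        "Sid": "PlanetScaleDiscoveryReadOnly",  # hypothetical statement ID
        "Effect": "Allow",
        "Action": flat,
        "Resource": "*",  # assumption: no resource-level scoping
    }
    return {"Version": "2012-10-17", "Statement": [statement]}


if __name__ == "__main__":
    print(json.dumps(build_policy(), indent=2))
```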
Configuration:
providers:
  aws:
    enabled: true
    regions:
      - us-east-1
      - us-west-2
    discover_all: true
    # Authentication - choose one approach:
    # Option 1: Use an AWS profile
    credentials:
      profile: migration-discovery
    # Option 2: Assume a role
    # credentials:
    #   role_arn: "arn:aws:iam::123456789012:role/PlanetScaleDiscoveryRole"
    #   external_id: "unique-external-id"
You can also focus discovery on specific AWS resources:
providers:
  aws:
    enabled: true
    discover_all: false
    resources:
      rds_instances:
        - production-db-1
        - staging-db-1
      aurora_clusters:
        - prod-cluster
    regions:
      - us-east-1

Google Cloud (Cloud SQL)

The tool discovers Cloud SQL instances, VPC networks, firewall rules, and Cloud Monitoring metrics.
Authentication (choose one):
  • Service account key file
  • Application Default Credentials
  • Environment variables
Required APIs (must be enabled in your project):
  • Cloud SQL Admin API
  • Compute Engine API
  • Cloud Monitoring API
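One way to enable these APIs is the gcloud CLI. The sketch below only builds the command so you can review it before running it. The function name enable_apis_command is hypothetical, and the mapping from API names to service identifiers is an assumption drawn from Google Cloud's naming, not from the discovery tool.

```python
import shlex

# Service identifiers assumed for the three APIs listed above.
REQUIRED_SERVICES = {
    "Cloud SQL Admin API": "sqladmin.googleapis.com",
    "Compute Engine API": "compute.googleapis.com",
    "Cloud Monitoring API": "monitoring.googleapis.com",
}


def enable_apis_command(project_id):
    """Return the gcloud argument list that enables the required APIs in one project."""
    if not project_id:
        raise ValueError("project_id is required")
    return ["gcloud", "services", "enable", *REQUIRED_SERVICES.values(), "--project", project_id]


if __name__ == "__main__":
    print(shlex.join(enable_apis_command("your-project-id")))
```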
Configuration:
providers:
  gcp:
    enabled: true
    project_id: your-project-id
    regions:
      - us-central1
      - us-east1
    discover_all: true
    credentials:
      service_account_key: /path/to/ps-discovery-key.json
You can also focus discovery on specific Google Cloud resources:
providers:
  gcp:
    enabled: true
    project_id: your-project-id
    discover_all: false
    resources:
      cloud_sql_instances:
        - prod-db-instance
        - staging-db-instance
    regions:
      - us-central1

Performance and safety

The default database analyzers are safe to run against production databases. They use read-only queries against system catalogs and statistics views, with very low performance impact.
The discovery tool can query metadata and statistics across every accessible database when the database field is empty. For environments with many databases or very large schemas, consider targeting one database at a time or running the tool against a replica.

Privacy and security

The discovery tool runs entirely on your infrastructure. No data is sent to external services during analysis.
Collected: schema metadata, database configuration, usage statistics, replication metadata, infrastructure topology, aggregate security information, and feature usage.
Not collected: table contents, row data, query text, slow query log entries, passwords, secrets, connection strings, application code, or individual grant details.
Passwords are used only to establish the database connection and are never included in the output.
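If you want to confirm this for yourself before sharing a report, a sketch like the following walks the JSON file and counts occurrences of strings you consider sensitive, such as the discovery user's password. It assumes nothing about the report's structure beyond it being valid JSON. The function names iter_strings and count_secret_hits are hypothetical.

```python
import json
from pathlib import Path


def iter_strings(node):
    """Yield every string found in a parsed JSON value, including dictionary keys."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield key
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def count_secret_hits(report_path, secrets_to_check):
    """Count how many strings in the report contain any of the given secrets."""
    report = json.loads(Path(report_path).read_text(encoding="utf-8"))
    needles = [secret for secret in secrets_to_check if secret]
    return sum(1 for text in iter_strings(report) if any(n in text for n in needles))


if __name__ == "__main__":
    # Path assumes the output_dir from the configuration example above.
    path = Path("mysql_discovery_output") / "planetscale_discovery_results.json"
    if path.exists():
        print(count_secret_hits(path, ["secure_password_here"]), "strings contain a secret")
```

A result of zero is the expected outcome. Anything else is worth investigating before the report leaves your environment.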

Next steps

Once you have your discovery report, share it with us if you want tailored migration guidance. You can also follow one of our migration guides on your own:

  • Database import workflow
  • Migrate from AWS RDS
  • Migrate from Amazon Aurora
  • Migrate from Google Cloud SQL

Need help?

Get help from the PlanetScale Support team, or join our Discord community to see how others are using PlanetScale.