# Discovery Tool

> Assess your PostgreSQL database and cloud infrastructure before migrating to PlanetScale Postgres.

export const HerokuIcon = () => <svg viewBox="0 0 128 128" fill="none" xmlns="http://www.w3.org/2000/svg">
    <path fill="currentColor" d="M102.1 2H25.9C19.3 2 14 7.3 14 13.9v100.3c0 6.6 5.3 11.9 11.9 11.9h76.3c6.6 0 11.9-5.3 11.9-11.9V13.9C114 7.3 108.7 2 102.1 2zM37 108.7V80.5l14.1 14.1L37 108.7zm53 .3H76.9l.1-.2V59.4s3.1-11.8-39.7 4.8c-.1.2-.2-45.7-.2-45.7l13.9-.1v29.4s39-15.4 39 11.7V109zm-5.2-73H70c5.3-6 10.2-17 10.2-17h15.3s-2.6 7-10.7 17z" />
  </svg>;

The PlanetScale Discovery Tool analyzes your existing PostgreSQL database and cloud infrastructure to help plan your migration to PlanetScale Postgres. It collects metadata about your database configuration, schema structure, performance characteristics, security settings, and cloud resources. It does not store table data, and the default analyzers never read it.

The tool produces a structured JSON report that PlanetScale uses to provide migration guidance tailored to your environment.

<Note>
  The discovery tool is open source and available on [GitHub](https://github.com/planetscale/ps-discovery). The documentation below covers the essentials. See the [full documentation](https://github.com/planetscale/ps-discovery/tree/main/docs) in the repository for advanced usage, troubleshooting, and detailed reference.
</Note>

## What it discovers

**Database analysis:**

* PostgreSQL version, configuration, and installed extensions
* Schema structure, including tables, columns, indexes, constraints, and sizes
* Performance statistics such as cache hit ratios, table/index usage, and active locks
* Security configuration: roles, permissions, and SSL settings
* Feature usage: foreign data wrappers, partitioning, PostGIS, and more

**Cloud infrastructure analysis:**

* Database instances, clusters, and their configurations
* VPC networking, subnets, security groups, and firewall rules
* Performance metrics from cloud monitoring services
* High availability and replica configurations

## Installation

The discovery tool requires Python 3.9 or later.
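The setup script checks this for you, but you can confirm in advance which interpreter is on your path. The following is a minimal sketch; the helper name `meets_minimum` is illustrative and not part of the discovery tool.

```python
import sys

MINIMUM = (3, 9)  # minimum version stated above


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "OK" if meets_minimum() else "too old"
    print(sys.version.split()[0], status)
```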

<Steps>
  <Step title="Download and extract">
    Download the latest release from [GitHub](https://github.com/planetscale/ps-discovery/releases) and extract it:

    ```bash theme={null}
    tar -xzf ps-discovery-*.tar.gz
    cd ps-discovery
    ```
  </Step>

  <Step title="Run setup">
    The setup script verifies your Python version, creates a virtual environment, and installs dependencies:

    ```bash theme={null}
    ./setup.sh
    ```
  </Step>

  <Step title="Configure credentials">
    Copy the sample configuration file and edit it to include your database and cloud provider credentials:

    ```bash theme={null}
    cp sample-config.yaml config.yaml
    ```

    At a minimum, you need to configure your database connection. See [Configuration](#configuration) below for the full format.
  </Step>
</Steps>

Alternatively, install with `pipx` from the extracted `ps-discovery` directory, which keeps the tool in its own isolated environment:

```bash theme={null}
# Install with support for a specific cloud provider
pipx install -e ".[aws]"

# Or install with all provider support
pipx install -e ".[all]"
```

## Database user setup

Create a dedicated read-only user for the discovery tool. Connect to your database as a superuser or privileged role and run the following:

```sql theme={null}
-- Create a dedicated user for database discovery
CREATE USER planetscale_discovery WITH PASSWORD 'secure_password_here';

-- Grant basic connection and usage permissions
GRANT CONNECT ON DATABASE your_database TO planetscale_discovery;
GRANT USAGE ON SCHEMA public TO planetscale_discovery;
GRANT USAGE ON SCHEMA information_schema TO planetscale_discovery;

-- Grant read access to all tables and views
GRANT SELECT ON ALL TABLES IN SCHEMA public TO planetscale_discovery;
GRANT SELECT ON ALL TABLES IN SCHEMA information_schema TO planetscale_discovery;
GRANT SELECT ON ALL TABLES IN SCHEMA pg_catalog TO planetscale_discovery;

-- Grant permissions for system catalogs and statistics
GRANT SELECT ON pg_stat_database TO planetscale_discovery;
GRANT SELECT ON pg_stat_user_tables TO planetscale_discovery;
GRANT SELECT ON pg_stat_user_indexes TO planetscale_discovery;
GRANT SELECT ON pg_stat_activity TO planetscale_discovery;
GRANT SELECT ON pg_stat_replication TO planetscale_discovery;
GRANT SELECT ON pg_settings TO planetscale_discovery;
GRANT SELECT ON pg_database TO planetscale_discovery;
GRANT SELECT ON pg_user TO planetscale_discovery;
GRANT SELECT ON pg_roles TO planetscale_discovery;
GRANT SELECT ON pg_user_mappings TO planetscale_discovery;

-- For foreign data wrapper analysis
GRANT SELECT ON pg_foreign_server TO planetscale_discovery;
GRANT SELECT ON pg_foreign_data_wrapper TO planetscale_discovery;

-- For advanced performance analysis (if pg_stat_statements is enabled)
GRANT SELECT ON pg_stat_statements TO planetscale_discovery;

-- For replication analysis
GRANT SELECT ON pg_stat_wal_receiver TO planetscale_discovery;
GRANT SELECT ON pg_stat_subscription TO planetscale_discovery;

-- For PostgreSQL 10+ enhanced privileges (recommended)
GRANT pg_read_all_stats TO planetscale_discovery;
GRANT pg_read_all_settings TO planetscale_discovery;
```

<Note>
  If your database has additional schemas beyond `public`, repeat the `GRANT USAGE ON SCHEMA` and `GRANT SELECT ON ALL TABLES IN SCHEMA` statements for each schema you want analyzed.
</Note>
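If you have many schemas, generating the repeated statements can be less error-prone than typing them. The sketch below prints the two per-schema `GRANT` statements for each name you pass in. The function name `schema_grants` and the schema names `sales` and `analytics` are placeholders, and the output is meant to be reviewed before you run it.

```python
def schema_grants(schemas, role="planetscale_discovery"):
    """Build the per-schema GRANT statements for each schema name."""
    statements = []
    for schema in schemas:
        # Double-quote identifiers and escape embedded quotes so that
        # mixed-case or unusual schema names are handled safely.
        quoted = '"' + schema.replace('"', '""') + '"'
        statements.append(f"GRANT USAGE ON SCHEMA {quoted} TO {role};")
        statements.append(
            f"GRANT SELECT ON ALL TABLES IN SCHEMA {quoted} TO {role};"
        )
    return statements


if __name__ == "__main__":
    # "sales" and "analytics" are placeholder schema names.
    print("\n".join(schema_grants(["sales", "analytics"])))
```

Because the names are double-quoted, pass each schema name exactly as it is stored in the catalog, including case.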

### Cleanup

<Warning>
  After discovery is complete, remove the `planetscale_discovery` user from your database. This user has read access to your schema and system catalogs and should not be left in place.
</Warning>

```sql theme={null}
DROP USER IF EXISTS planetscale_discovery;
```

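`DROP USER` fails with a dependency error while the role still owns objects or holds granted privileges, which includes the grants from the setup above. In that case, run the following in each database where the user was granted access. Replace `postgres` with an existing role if your provider does not expose one by that name: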

```sql theme={null}
REASSIGN OWNED BY planetscale_discovery TO postgres;
DROP OWNED BY planetscale_discovery;
DROP USER planetscale_discovery;
```

## Configuration

The discovery tool uses a YAML configuration file. Here is an example with the most common options:

```yaml theme={null}
modules:
  - database # Run database analysis
  - cloud    # Run cloud infrastructure analysis (optional)

database:
  host: your-db-host.example.com
  port: 5432
  database: your_database
  username: planetscale_discovery
  password: secure_password_here
  sslmode: require

providers:
  aws:
    enabled: true
    regions:
      - us-east-1
  gcp:
    enabled: false
  supabase:
    enabled: false
  heroku:
    enabled: false

output:
  output_dir: ./discovery_output
```
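Before running discovery, it can help to confirm that the values in the `database` section actually connect. As a rough sketch, the hypothetical helper below turns those settings into a libpq connection URI that you can pass to `psql`. The password is left out on purpose so that it does not end up in shell history.

```python
from urllib.parse import quote


def connection_uri(host, port, database, username, sslmode="require"):
    """Build a libpq connection URI without embedding the password."""
    # Percent-encode the user and database names in case they contain
    # characters that are reserved in URIs.
    user = quote(username, safe="")
    dbname = quote(database, safe="")
    return f"postgresql://{user}@{host}:{port}/{dbname}?sslmode={sslmode}"


if __name__ == "__main__":
    # Values mirror the example configuration above.
    print(
        connection_uri(
            host="your-db-host.example.com",
            port=5432,
            database="your_database",
            username="planetscale_discovery",
        )
    )
```

Running `psql` with the printed URI prompts for the password if the server requires one.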

## Running discovery

Run database-only analysis:

```bash theme={null}
./ps-discovery database --config config.yaml
```

Run both database and cloud analysis:

```bash theme={null}
./ps-discovery both --config config.yaml
```

The tool produces a `planetscale_discovery_results.json` file in your configured output directory. Share this report with PlanetScale for migration planning assistance.

<Note>
  Once the discovery is complete, remember to [clean up](#cleanup) the `planetscale_discovery` user you created on your source database.
</Note>
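Before sharing the report, you may want a quick look at what it contains. The sketch below loads the JSON file and lists its top-level keys and size. The report's internal structure is not described on this page, so the code makes no assumptions about specific key names. The path assumes the `output_dir` from the example configuration, and `summarize_report` is an illustrative helper.

```python
import json
from pathlib import Path


def summarize_report(path):
    """Return (sorted top-level keys, file size in bytes) for a JSON report."""
    report_path = Path(path)
    data = json.loads(report_path.read_text(encoding="utf-8"))
    keys = sorted(data) if isinstance(data, dict) else []
    return keys, report_path.stat().st_size


if __name__ == "__main__":
    # Path assumes the output_dir from the example configuration.
    report = Path("discovery_output/planetscale_discovery_results.json")
    if report.exists():
        keys, size = summarize_report(report)
        print(f"{size} bytes; top-level sections: {', '.join(keys)}")
    else:
        print(f"No report found at {report}")
```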

## Cloud provider setup

Each cloud provider requires specific credentials and permissions. Below is a summary of what you need for each. For detailed instructions including IAM policies and API enablement steps, see the [provider documentation](https://github.com/planetscale/ps-discovery/tree/main/docs/providers).

### AWS (RDS / Aurora)

The tool discovers RDS instances, Aurora clusters, VPC networking, security groups, and CloudWatch metrics.

**Authentication** (choose one):

* IAM instance profile (recommended when running on EC2)
* Access key and secret key
* IAM role assumption (for cross-account access)

**Required permissions:**

* RDS: `DescribeDBInstances`, `DescribeDBClusters`, `DescribeDBSubnetGroups`
* EC2: `DescribeVpcs`, `DescribeSubnets`, `DescribeSecurityGroups`, `DescribeRouteTables`
* CloudWatch: `GetMetricStatistics`, `ListMetrics`
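As a rough illustration, the permissions above can be expressed as a single read-only IAM policy. The sketch below builds that policy document and prints it as JSON. The statement ID and the `build_policy` helper are illustrative, and the policies in the provider documentation remain the authoritative versions.

```python
import json

# Actions from the permission list above, with IAM service prefixes.
REQUIRED_ACTIONS = [
    "rds:DescribeDBInstances",
    "rds:DescribeDBClusters",
    "rds:DescribeDBSubnetGroups",
    "ec2:DescribeVpcs",
    "ec2:DescribeSubnets",
    "ec2:DescribeSecurityGroups",
    "ec2:DescribeRouteTables",
    "cloudwatch:GetMetricStatistics",
    "cloudwatch:ListMetrics",
]


def build_policy(actions=REQUIRED_ACTIONS):
    """Return a read-only IAM policy document as a dict."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PlanetScaleDiscoveryReadOnly",  # illustrative name
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_policy(), indent=2))
```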

**Configuration:**

```yaml theme={null}
providers:
  aws:
    enabled: true
    regions:
      - us-east-1
      - us-west-2
    # Authentication - choose one approach:
    # Option 1: Instance profile or environment variables (recommended).
    #           Omit the two keys below.
    # Option 2: Explicit credentials
    access_key_id: AKIA...
    secret_access_key: ...
```

### Google Cloud (Cloud SQL / AlloyDB)

The tool discovers Cloud SQL instances, AlloyDB clusters, VPC networks, firewall rules, and Cloud Monitoring metrics.

**Authentication** (choose one):

* Application Default Credentials (recommended)
* Service account key file

**Required APIs** (must be enabled in your project):

* Cloud SQL Admin API
* Compute Engine API
* Cloud Monitoring API
* AlloyDB API (if using AlloyDB)
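The APIs above are enabled per project. One way to do that is the `gcloud services enable` command, and the sketch below assembles that command line from the corresponding service identifiers. The helper name and the `your-project-id` value are placeholders, and the code only prints the command rather than running it.

```python
import shlex

# Service identifiers for the APIs listed above.
REQUIRED_SERVICES = [
    "sqladmin.googleapis.com",    # Cloud SQL Admin API
    "compute.googleapis.com",     # Compute Engine API
    "monitoring.googleapis.com",  # Cloud Monitoring API
]
ALLOYDB_SERVICE = "alloydb.googleapis.com"  # AlloyDB API


def enable_services_command(project_id, include_alloydb=False):
    """Build a `gcloud services enable` command line for the project."""
    services = REQUIRED_SERVICES + ([ALLOYDB_SERVICE] if include_alloydb else [])
    args = ["gcloud", "services", "enable", *services, f"--project={project_id}"]
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    print(enable_services_command("your-project-id", include_alloydb=True))
```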

**Configuration:**

```yaml theme={null}
providers:
  gcp:
    enabled: true
    project_id: your-project-id
    # Optional: path to service account key
    credentials_file: /path/to/service-account-key.json
```

### Supabase

The tool discovers project metadata, database configuration, PgBouncer settings, and connection details.

**Authentication:**

* [Personal Access Token](https://supabase.com/dashboard/account/tokens) (recommended, read-only)

**Configuration:**

```yaml theme={null}
providers:
  supabase:
    enabled: true
    access_token: sbp_...
```

### Heroku

The tool discovers Heroku Postgres add-ons across all your apps, including plan details, database sizes, replica configurations, and connection pooling.

**Authentication:**

* API key from the [Heroku dashboard](https://dashboard.heroku.com/account)
* Or a Heroku CLI authorization token

**Configuration:**

```yaml theme={null}
providers:
  heroku:
    enabled: true
    api_key: your-heroku-api-key
```

## Performance and safety

The default database analyzers are safe to run against production databases. They use read-only queries against system catalogs and statistics views, with very low performance impact.

<Warning>
  The optional **data size analyzer** performs sampling queries against actual tables and can have a significant performance impact on large databases. If you need this analysis, consider running it against a read replica and starting with a low sampling percentage. This analyzer is disabled by default and must be explicitly opted into via configuration.
</Warning>

## Privacy and security

The discovery tool runs entirely on your infrastructure. No data is sent to external services during analysis.

**Collected:** Schema metadata, database configuration, usage statistics, infrastructure topology, and role names.

**Not collected:** Table contents, row data, application queries, passwords, or secrets. Passwords are used only to establish the database connection and are never included in the output.
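If you want to verify this yourself before sharing a report, a simple check is to search the output for the credential values you configured. The following is a minimal sketch with an illustrative helper name. It looks for exact substring matches only, so a secret containing characters that JSON escapes (such as quotes or backslashes) could go undetected.

```python
from pathlib import Path


def find_leaked_secrets(report_text, secrets):
    """Return the labels of any secret values found verbatim in the text."""
    return [
        label
        for label, value in secrets.items()
        if value and value in report_text  # skip empty values
    ]


if __name__ == "__main__":
    # Path assumes the output_dir from the example configuration.
    report = Path("discovery_output/planetscale_discovery_results.json")
    # Placeholder value from the example configuration; substitute your own.
    secrets = {"database password": "secure_password_here"}
    if report.exists():
        leaked = find_leaked_secrets(report.read_text(encoding="utf-8"), secrets)
        print("Found in report:", leaked if leaked else "none")
    else:
        print(f"No report found at {report}")
```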

## Next steps

Once you have your discovery report, [share it with us](https://planetscale.com/contact) if you want tailored migration guidance. You can also follow one of our migration guides on your own:

<Columns cols={2}>
  <Card title="Migrate using pg_dump/restore" icon="recycle" horizontal href="/postgres/imports/postgres-migrate-dumprestore" />

  <Card title="Migrate using WAL streaming" icon="laptop" horizontal href="/postgres/imports/postgres-migrate-walstream" />

  <Card title="Migrate using Amazon DMS" icon="aws" horizontal href="/postgres/imports/postgres-migrate-dms" />

  <Card title="Migrate from Heroku" icon={<HerokuIcon />} horizontal href="/postgres/imports/heroku" />
</Columns>

## Need help?

Get help from [the PlanetScale Support team](https://planetscale.com/contact?initial=support), or join our [Discord community](https://pscale.link/community) to see how others are using PlanetScale.
