Skip to main content

What is logical replication?

Logical replication is a PostgreSQL feature that replicates data objects and their changes at the logical level, rather than the physical level. It works by:
  1. WAL Level: The database must be configured with wal_level = logical to capture logical changes
  2. Publication: Creates a logical replication stream on the source database
  3. Replication Slot: Maintains position in the WAL stream and ensures data consistency
  4. Subscription/Consumer: External tools consume the logical replication stream
This setup enables:
  • Replication of individual transactions and row changes
  • Selective replication of specific tables or databases
  • Cross-version replication between different PostgreSQL versions
  • CDC integration with external tools and data pipelines

Configuration requirements

PlanetScale cluster parameters

To enable logical replication on your PlanetScale for Postgres cluster, configure these parameters in the Clusters > Parameters tab:
ParameterRequired ValueDescription
wal_levellogicalSetting the wal_level to logical enables logical replication, which captures row-level changes in a format that can be flexibly replayed on target systems.
max_replication_slots2 x replicasSet max_replication_slots to twice the number of replicas or subscribers. Each replica uses one slot, with the extra slots reserved for operations like failover.
max_wal_senders2 x replicasLikewise, set max_wal_senders to twice the number of replicas or targets, and not less than max_replication_slots.
max_slot_wal_keep_size> 4GBThe value of max_slot_wal_keep_size should be tuned to ensure you are keeping WAL files long enough for subscribers to consume them, while being sure that the source disk is not overrun by files. A reasonable starting point is > 4GB, and monitor your replication lag, your database’s change rate (inserts, updates, deletes), and available disk space.
In addition, for production environments, configure the following to ensure that your CDC stream is maintained during any switchover or failover. Without these settings, manual intervention will be required to restore data pipelines after these events.
ParameterRequired ValueDescription
sync_replication_slotsonSet to on to enable synchronization of replication slots to the subcribers.
hot_standby_feedbackonSet to on to prevent query conflicts during replication.
Also, in the Cluster configuration > Parameters tab of the dashboard UI, under the Failover section, add a comma-delimited list of the replication slot(s) you will create to preserve during any switchover or failover events. Be sure to apply the queue of configuration changes before proceeding.

Verify configuration

After setting these configuration parameters in the dashboard, you can verify them in the CLI. For example, to verify the WAL level:
SHOW wal_level;
The result should show logical:
 wal_level
-----------
 logical

CDC tool configuration

Ensure your CDC tool is configured properly:
  • Airbyte: Ensure replication slots are created with failover support (Setup Guide)
  • AWS DMS: Manually create failover-enabled replication slots before configuring DMS (Setup Guide)
  • ClickHouse: See ClickPipes documentation for PlanetScale configuration (Setup Guide)
  • Debezium: Configure connector to use failover-enabled replication slots (Setup Guide)
  • Fivetran: Create your own replication slot with failover = true (Setup Guide)
Some CDC tools create replication slots automatically. You must verify that any auto-created slots have failover = true enabled, or manually create the slots yourself with the proper configuration.

Create and manage users

For production CDC deployments, login as the default user and create a dedicated replication user with minimal privileges:
-- Create dedicated CDC user
CREATE USER cdc_user WITH REPLICATION PASSWORD 'strong_password';
The WITH REPLICATION clause allows the user to connect to the server using the replication protocol, to create and user replication slots, to stream WAL files, and to perform the logical decoding operations. You will configure this user to connect from your subscriber/consumer side. Because of the edge connection settings, to login as this user, add the branch ID after the username, like this:
cdc_user.12345678
Some target systems, like Fivetran, will need to match on the exact login username for some operations after it has logged in. As a workaround for those cases, also create a user with the name + branch ID, like this:
-- Create dedicated CDC user
CREATE USER "cdc_user.12345678" WITH REPLICATION PASSWORD 'strong_password';
To find the branch ID, look at the Settings > Roles page and observe the branch ID in roles that were created in the UI.

Create and manage replication streams

Create a replication slot

Using the dedicated replication role, create logical replication slots with the failover option enabled to preserve the slots during any switchover or failover events:
SELECT pg_create_logical_replication_slot(
  'my_cdc_slot',            -- slot_name
  'pgoutput',               -- plugin
  false,                    -- temporary
  false,                    -- two_phase
  true                      -- failover = true (REQUIRED)
);

Create initial publication

Some CDC tools require you to create publications to specify which tables to replicate. You will need to do this as the owner of the tables or the superuser. This example uses the default PlanetScale superuser.
CREATE PUBLICATION my_cdc_publication FOR TABLE table1, table2;

Add tables to a publication

Currently, tables must be added to the publication individually or as a comma-delimited list. Remember to update your publication when adding new tables that should be replicated.
ALTER PUBLICATION my_cdc_publication ADD TABLE table3;

Replica identity configuration

For complete change tracking of both row values before and after changes (as well as to support any tables without a primary key), set the replica identity to FULL:
-- Enable full replica identity for complete change tracking
ALTER TABLE table1 REPLICA IDENTITY FULL;
ALTER TABLE table2 REPLICA IDENTITY FULL;
ALTER TABLE table3 REPLICA IDENTITY FULL;

Verify publications

Issue the following to see active publications with tables. Do this as the default user.
SELECT p.pubname,
       c.relname AS tablename
FROM pg_publication p
JOIN pg_publication_rel pr ON p.oid = pr.prpubid
JOIN pg_class c ON pr.prrelid = c.oid;

Monitoring and troubleshooting

PlanetScale metrics for CDC monitoring

PlanetScale provides built-in metrics that are essential for monitoring your CDC setup. Access these through your Metrics dashboard to track replication health and performance:
Metric CategoryKey Indicators for CDCWhat to Monitor
WAL archival rateSuccess/Failed countsMonitor failed WAL archival attempts that could impact CDC streams
WAL archive ageSeconds behindAge of oldest unarchived WAL - should be under 60 seconds for healthy CDC
WAL storageStorage usage in MBTrack WAL disk usage; high usage may indicate CDC consumers falling behind
Replication lagLag in secondsMonitor delay between primary and replicas; high lag may indicate CDC consumer performance issues
Transaction rateTransactions per secondTrack database workload intensity affecting CDC processing
MemoryRSS and Memory mappedMonitor memory pressure that could impact logical decoding performance
Primary Storage UsageMB disk utilizationMonitor disk utilization to be sure WAL files are being consumed quickly enough
For detailed information about interpreting these metrics, see the Cluster Metrics documentation.

Monitoring replication lag

Check replication slot lag. The replication_lag column shows how much WAL data the publisher is keeping because the subscriber has not confirmed or processed it yet. This value should be kept well below max_wal_size.
SELECT
    slot_name,
    database,
    active,
    restart_lsn,
    confirmed_flush_lsn,
    pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS replication_lag
FROM pg_replication_slots
WHERE slot_type = 'logical';

WAL retention and disk usage

Monitor WAL retention to prevent disk space issues. This is another way to see similar information, and will include any PlanetScale HA replicas.
SELECT
    slot_name,
    pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;

Common issues

Issue: WAL disk space growing rapidly
Cause: Inactive or slow CDC consumers
Solution: Remove unused slots or troubleshoot slow consumers
Issue: Failover breaks CDC stream
Cause: Replication slot not properly synchronized
Solution: Verify failover configuration and slot synchronization status

Best practices

  1. Always enable failover: Never deploy CDC to production without failover = true on replication slots and proper PlanetScale cluster configuration
  2. Verify configuration: Double-check that both your CDC tool and PlanetScale settings are properly configured before going live
  3. Test failover scenarios: Test actual failover events in staging environments to ensure your configuration works
  4. Regular monitoring: Monitor replication lag, WAL retention, and slot synchronization status
  5. Slot cleanup: Remove unused logical replication slots to prevent WAL accumulation
  6. CDC client resilience: Ensure CDC clients can handle connection interruptions gracefully

Security considerations

  • Logical replication exposes table data - ensure proper access controls
  • Use dedicated database users with minimal required privileges for CDC
  • Consider network security when streaming to external systems
  • Monitor for unauthorized replication slots
For more information about cluster configuration parameters, see the Cluster configuration parameters documentation.