MOLT Replicator continuously replicates changes from a source database to CockroachDB as part of a database migration. It supports minimal-downtime migrations and enables backfill from CockroachDB to your source database, preserving a rollback option during the migration window.
MOLT Replicator consumes change data from PostgreSQL logical replication streams, MySQL GTID-based replication, Oracle LogMiner, and CockroachDB changefeeds (for failback). For details, refer to How it works.
Terminology
- Checkpoint: The position in the source database's transaction log from which replication begins or resumes: LSN (PostgreSQL), GTID (MySQL), or SCN (Oracle).
- Staging database: A CockroachDB database used by Replicator to store replication metadata, checkpoints, and buffered mutations. Specified with --stagingSchema and automatically created with --stagingCreateSchema. For details, refer to Staging database.
- Forward replication: Replicate changes from a source database (PostgreSQL, MySQL, or Oracle) to CockroachDB during a migration. For usage details, refer to Forward replication with initial load.
- Failback: Replicate changes from CockroachDB back to the source database. Used for migration rollback or to maintain data consistency on the source during migration. For usage details, refer to Failback to source database.
Prerequisites
Supported databases
MOLT Replicator supports the following source and target databases:
- PostgreSQL 11-16
- MySQL 5.7, 8.0 and later
- Oracle Database 19c (Enterprise Edition) and 21c (Express Edition)
- CockroachDB (all currently supported versions)
Database configuration
The source database must be configured for replication:
| Database | Configuration Requirements | Details |
|---|---|---|
| PostgreSQL source | | Configure PostgreSQL for replication |
| MySQL source | | Configure MySQL for replication |
| Oracle source | | Configure Oracle for replication |
| CockroachDB source (failback) | | Configure CockroachDB for replication |
User permissions
The SQL user running MOLT Replicator requires specific privileges on both the source and target databases:
| Database | Required Privileges | Details |
|---|---|---|
| PostgreSQL source | | Create PostgreSQL migration user |
| MySQL source | | Create MySQL migration user |
| Oracle source | | Create Oracle migration user<br>Create sentinel table<br>Grant LogMiner privileges |
| CockroachDB target (forward replication) | | Create CockroachDB user |
| PostgreSQL, MySQL, or Oracle target (failback) | | Grant PostgreSQL user permissions<br>Grant MySQL user permissions<br>Grant Oracle user permissions |
Installation
To install MOLT, download the binary that matches your architecture and source database:
| Operating System | Architecture | PostgreSQL/MySQL | Oracle (Preview) |
|---|---|---|---|
| Windows | AMD 64-bit | Download | N/A |
| Windows | ARM 64-bit | Download | N/A |
| Linux | AMD 64-bit | Download | Download |
| Linux | ARM 64-bit | Download | N/A |
| Mac | AMD 64-bit | Download | N/A |
| Mac | ARM 64-bit | Download | N/A |
The following binaries are included:
- molt
- replicator
For ease of use, keep both molt and replicator in your current working directory.
To display the current version of each binary, run molt --version and replicator --version.
For previous binaries, refer to the MOLT version manifest. For release details, refer to the MOLT changelog.
Docker images
MOLT Replicator
MOLT Replicator is also available as a standalone Docker image:
docker pull cockroachdb/replicator
To pull a specific version (for example, v1.1.1):
docker pull cockroachdb/replicator:v1.1.1
Docker usage
Performance
MOLT Fetch, Verify, and Replicator are likely to run more slowly in a Docker container than on a local machine. To improve performance, increase the memory or compute resources, or both, on your Docker container.
Local connection strings
When testing locally, specify the host as follows:
- For macOS, use host.docker.internal. For example:

  --sourceConn 'postgres://postgres:postgres@host.docker.internal:5432/molt?sslmode=disable'
  --targetConn "postgres://root@host.docker.internal:26257/molt?sslmode=disable"

- For Linux and Windows, use 172.17.0.1. For example:

  --sourceConn 'postgres://postgres:postgres@172.17.0.1:5432/molt?sslmode=disable'
  --targetConn "postgres://root@172.17.0.1:26257/molt?sslmode=disable"
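The following is a minimal sketch of running the Replicator image against local databases. It assumes the image's entrypoint is the replicator binary and uses the macOS host alias shown above; adjust the subcommand, connection strings, and flags for your environment:

docker run --rm -it \
  cockroachdb/replicator \
  pglogical \
  --sourceConn 'postgres://postgres:postgres@host.docker.internal:5432/molt?sslmode=disable' \
  --targetConn 'postgres://root@host.docker.internal:26257/molt?sslmode=disable' \
  --slotName molt_slot \
  --stagingSchema _replicator \
  --stagingCreateSchema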
How it works
MOLT Replicator supports forward replication from PostgreSQL, MySQL, and Oracle, and failback from CockroachDB:
- PostgreSQL source (pglogical): MOLT Replicator uses PostgreSQL logical replication, which is based on publications and replication slots. You create a publication for the target tables, and a slot marks consistent replication points. MOLT Replicator consumes this logical feed directly and applies the data in sorted batches to the target.
- MySQL source (mylogical): MOLT Replicator relies on MySQL GTID-based replication to read change data from MySQL binlogs. It works with MySQL versions that support GTID-based replication and applies transactionally consistent feeds to the target. Binlog features that do not use GTIDs are not supported.
- Oracle source (oraclelogminer): MOLT Replicator uses Oracle LogMiner to capture change data from Oracle redo logs. Both Oracle Multitenant (CDB/PDB) and single-tenant Oracle architectures are supported. Replicator periodically queries LogMiner-populated views and processes transactional data in ascending SCN windows for reliable throughput while maintaining consistency.
- Failback from CockroachDB (start): MOLT Replicator acts as an HTTP webhook sink for a single CockroachDB changefeed. Replicator receives mutations from source cluster nodes, can optionally buffer them in a CockroachDB staging cluster, and then applies time-ordered transactional batches to the target database. Mutations are applied as UPSERT or DELETE statements while respecting foreign-key and table dependencies.
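Before starting replication, you can confirm that each source meets these requirements. The checks below are a sketch only; the client invocations, placeholders, and the $SOURCE and $SOURCE_CRDB variables are illustrative assumptions:

# PostgreSQL: logical replication requires wal_level = logical.
psql "$SOURCE" -c 'SHOW wal_level;'

# MySQL: GTID-based replication requires gtid_mode = ON.
mysql --host={host} --user={user} --password -e 'SELECT @@GLOBAL.gtid_mode;'

# CockroachDB (failback): changefeeds require rangefeeds to be enabled.
cockroach sql --url "$SOURCE_CRDB" -e 'SHOW CLUSTER SETTING kv.rangefeed.enabled;'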
Consistency modes
MOLT Replicator supports three consistency modes for balancing throughput and transactional guarantees:
- Consistent (failback mode only, default for CockroachDB sources): Preserves per-row order and source transaction atomicity. Concurrent transactions are controlled by --parallelism.
- BestEffort (failback mode only): Relaxes atomicity across tables that do not have foreign key constraints between them (maintains coherence within FK-connected groups). Enable with --bestEffortOnly, or allow auto-entry by setting --bestEffortWindow to a positive duration (such as 1s). Note: For independent tables (with no foreign key constraints), BestEffort mode applies changes immediately as they arrive, without waiting for the resolved timestamp. This provides higher throughput for tables that have no relationships with other tables.
- Immediate (default for PostgreSQL, MySQL, and Oracle sources): Applies updates as they arrive to Replicator with no buffering or waiting for resolved timestamps. For CockroachDB sources, provides highest throughput but requires no foreign keys on the target schema.
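For example, the following failback invocations illustrate how these modes map to flags. This is a sketch only; other required flags are omitted (shown as ...):

# Consistent (default for CockroachDB sources): no mode flag; tune concurrency with --parallelism.
replicator start --parallelism 16 ...

# BestEffort: stay in best-effort mode for the entire run...
replicator start --bestEffortOnly ...
# ...or enter best-effort mode automatically when replication lag exceeds one second.
replicator start --bestEffortWindow 1s ...

# Immediate: bypass staging and write directly to the target (no foreign keys on the target schema).
replicator start --immediate ...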
Commands
MOLT Replicator provides the following commands:
| Command | Description |
|---|---|
| pglogical | Replicate from PostgreSQL source to CockroachDB target using logical replication. |
| mylogical | Replicate from MySQL source to CockroachDB target using GTID-based replication. |
| oraclelogminer | Replicate from Oracle source to CockroachDB target using Oracle LogMiner. |
| start | Replicate from CockroachDB source to PostgreSQL, MySQL, or Oracle target (failback mode). Requires a CockroachDB changefeed with rangefeeds enabled. |
| make-jwt | Generate JWT tokens for authorizing changefeed connections in failback scenarios. Supports signing tokens with RSA or EC keys, or generating claims for external JWT providers. For details, refer to JWT authentication. |
| version | Display version information and Go module dependencies with checksums. For details, refer to Supply chain security. |
For command-specific flags and examples, refer to Usage and Common workflows.
Flags
Global flags
| Flag | Type | Description |
|---|---|---|
| --applyTimeout | DURATION | The maximum amount of time to wait for an update to be applied. Default: 30s |
| --dlqTableName | IDENT | The name of a table in the target schema for storing dead-letter entries. Default: replicator_dlq |
| --enableParallelApplies | BOOL | Enable parallel application of independent table groups during replication. By default, applies are synchronous. When enabled, this increases throughput at the cost of higher target pool usage and memory usage. Default: false |
| --flushPeriod | DURATION | Flush queued mutations after this duration. Default: 1s |
| --flushSize | INT | Ideal batch size to determine when to flush mutations. Default: 1000 |
| --gracePeriod | DURATION | Allow background processes to exit. Default: 30s |
| --logDestination | STRING | Write logs to a file. If not specified, write logs to stdout. |
| --logFormat | STRING | Choose log output format: "fluent", "text". Default: "text" |
| --maxRetries | INT | Maximum number of times to retry a failed mutation on the target (for example, due to contention or a temporary unique constraint violation) before treating it as a hard failure. Default: 10 |
| --metricsAddr | STRING | A host:port on which to serve metrics and diagnostics. The metrics endpoint is http://{host}:{port}/_/varz. |
| --parallelism | INT | The number of concurrent database transactions to use. Default: 16 |
| --quiescentPeriod | DURATION | How often to retry deferred mutations. Default: 10s |
| --retireOffset | DURATION | How long to delay removal of applied mutations. Default: 24h0m0s |
| --retryInitialBackoff | DURATION | Initial delay before the first retry attempt when applying a mutation to the target database fails due to a retryable error, such as contention or a temporary unique constraint violation. Default: 25ms |
| --retryMaxBackoff | DURATION | Maximum delay between retry attempts when applying mutations to the target database fails due to retryable errors. Default: 2s |
| --retryMultiplier | INT | Multiplier that controls how quickly the backoff interval increases between successive retries of failed applies to the target database. Default: 2 |
| --scanSize | INT | The number of rows to retrieve from the staging database used to store metadata for replication. Default: 10000 |
| --schemaRefresh | DURATION | How often a watcher will refresh its schema. If this value is zero or negative, refresh behavior will be disabled. Default: 1m0s |
| --sourceConn | STRING | The source database's connection string. When replicating from Oracle, this is the connection string of the Oracle container database (CDB). Refer to Oracle replication flags. |
| --stageDisableCreateTableReaderIndex | BOOL | Disable the creation of partial covering indexes to improve read performance on staging tables. Set to true if creating indexes on existing tables would cause a significant operational impact. Default: false |
| --stageMarkAppliedLimit | INT | Limit the number of mutations to be marked applied in a single statement. Default: 100000 |
| --stageSanityCheckPeriod | DURATION | How often to validate staging table apply order (-1 to disable). Default: 10m0s |
| --stageSanityCheckWindow | DURATION | How far back to look when validating staging table apply order. Default: 1h0m0s |
| --stageUnappliedPeriod | DURATION | How often to report the number of unapplied mutations in staging tables (-1 to disable). Default: 1m0s |
| --stagingConn | STRING | The staging database's connection string. |
| --stagingCreateSchema | | Automatically create the staging schema if it does not exist. |
| --stagingIdleTime | DURATION | Maximum lifetime of an idle connection. Default: 1m0s |
| --stagingJitterTime | DURATION | The time over which to jitter database pool disconnections. Default: 15s |
| --stagingMaxLifetime | DURATION | The maximum lifetime of a database connection. Default: 5m0s |
| --stagingMaxPoolSize | INT | The maximum number of staging database connections. Default: 128 |
| --stagingSchema | STRING | Name of the CockroachDB schema that stores replication metadata. Required each time replicator is rerun after being interrupted, as the schema contains a checkpoint table that enables replication to resume from the correct transaction. Default: _replicator.public |
| --targetApplyQueueSize | INT | Size of the apply queue that buffers mutations before they are written to the target database. Larger values can improve throughput, but increase memory usage. This flag applies only to CockroachDB and PostgreSQL (pglogical) sources, and replaces the deprecated --copierChannel and --stageCopierChannelSize flags. |
| --targetConn | STRING | The target database's connection string. |
| --targetIdleTime | DURATION | Maximum lifetime of an idle connection. Default: 1m0s |
| --targetJitterTime | DURATION | The time over which to jitter database pool disconnections. Default: 15s |
| --targetMaxLifetime | DURATION | The maximum lifetime of a database connection. Default: 5m0s |
| --targetMaxPoolSize | INT | The maximum number of target database connections. Default: 128 |
| --targetSchema | STRING | The SQL database schema in the target cluster to update. |
| --targetStatementCacheSize | INT | The maximum number of prepared statements to retain. Default: 128 |
| --taskGracePeriod | DURATION | How long to allow for task cleanup when recovering from errors. Default: 1m0s |
| --timestampLimit | INT | The maximum number of source timestamps to coalesce into a target transaction. Default: 1000 |
| --userscript | STRING | The path to a TypeScript configuration script. For example, --userscript 'script.ts'. |
| -v, --verbose | COUNT | Increase logging verbosity. Use -v for debug logging or -vv for trace logging. |
pglogical replication flags
The following flags are used when replicating from a PostgreSQL source database.
| Flag | Type | Description |
|---|---|---|
| --publicationName | STRING | The publication within the source database to replicate. |
| --slotName | STRING | The replication slot in the source database. Default: "replicator" |
| --standbyTimeout | DURATION | How often to report WAL progress to the source server. Default: 5s |
mylogical replication flags
The following flags are used when replicating from a MySQL source database.
| Flag | Type | Description |
|---|---|---|
| --defaultGTIDSet | STRING | Default GTID set, in the format source_uuid:min(interval_start)-max(interval_end). Required the first time replicator is run, as the GTID set provides a replication marker for streaming changes. |
| --fetchMetadata | | Fetch column metadata explicitly, for older versions of MySQL that do not support binlog_row_metadata. |
| --replicationProcessID | UINT32 | The replication process ID to report to the source database. Default: 10 |
oraclelogminer replication flags
The following flags are used when replicating from an Oracle source database.
| Flag | Type | Description |
|---|---|---|
| --sourceSchema | STRING | Required. Source schema name on Oracle where tables will be replicated from. |
| --scn | INT | The snapshot System Change Number (SCN) from the initial data load. Required the first time replicator is run, as the SCN provides a replication marker for streaming changes. |
| --backfillFromSCN | INT | The SCN of the earliest active transaction at the time of the initial snapshot. Ensures no transactions are skipped when starting replication from Oracle. |
| --sourcePDBConn | STRING | Connection string for the Oracle pluggable database (PDB). Only required when using an Oracle multitenant configuration. --sourceConn must be included. |
| --oracle-application-users | STRING | List of Oracle usernames responsible for DML transactions in the PDB schema. Enables replication from the latest-possible starting point. Usernames are case-sensitive and must match the internal Oracle usernames (e.g., PDB_USER). |
start failback flags
The following flags are used for failback from CockroachDB.
| Flag | Type | Description |
|---|---|---|
| --assumeIdempotent | | Disable the extra staging table queries that debounce non-idempotent redelivery in changefeeds. |
| --bestEffortOnly | | Eventually-consistent mode; useful for high-throughput, skew-tolerant schemas with foreign keys. |
| --bestEffortWindow | DURATION | Use an eventually-consistent mode for initial backfill or when replication is behind; 0 to disable. Default: 1h0m0s |
| --bindAddr | STRING | The network address to bind to. Default: ":26258" |
| --disableAuthentication | | Disable authentication of incoming Replicator requests; not recommended for production. |
| --enableCheckpointStream | | Enable checkpoint streaming (use an internal changefeed from the staging table for real-time updates), rather than checkpoint polling (query the staging table for periodic updates), for failback replication. Default: false (use checkpoint polling) |
| --discard | | Dangerous: Discard all incoming HTTP requests; useful for changefeed throughput testing. Not intended for production. |
| --discardDelay | DURATION | Adds additional delay in discard mode; useful for gauging the impact of changefeed round-trip time (RTT). |
| --healthCheckTimeout | DURATION | The timeout for the health check endpoint. Default: 5s |
| --httpResponseTimeout | DURATION | The maximum amount of time to allow an HTTP handler to execute. Default: 2m0s |
| --immediate | | Bypass staging tables and write directly to target; recommended only for KV-style workloads with no foreign keys. |
| --limitLookahead | INT | Limit number of checkpoints to be considered when computing the resolving range; may cause replication to stall completely if older mutations cannot be applied. |
| --ndjsonBufferSize | INT | The maximum amount of data to buffer while reading a single line of ndjson input; increase when source cluster has large blob values. Default: 65536 |
| --tlsCertificate | STRING | A path to a PEM-encoded TLS certificate chain. |
| --tlsPrivateKey | STRING | A path to a PEM-encoded TLS private key. |
| --tlsSelfSigned | | If true, generate a self-signed TLS certificate valid for localhost. |
make-jwt flags
The following flags are used with the make-jwt command to generate JWT tokens for changefeed authentication.
| Flag | Type | Description |
|---|---|---|
| -a, --allow | STRING | One or more database.schema identifiers. Can be repeated for multiple schemas. |
| --claim | | If true, print a minimal JWT claim instead of signing. |
| -k, --key | STRING | The path to a PEM-encoded private key to sign the token with. |
| -o, --out | STRING | A file to write the token to. |
Usage
Replicator commands
MOLT Replicator provides four commands for different replication scenarios. For detailed workflows, refer to Common workflows.
Use pglogical to replicate from PostgreSQL to CockroachDB:
replicator pglogical
Use mylogical to replicate from MySQL to CockroachDB:
replicator mylogical
Use oraclelogminer to replicate from Oracle to CockroachDB:
replicator oraclelogminer
Use start to replicate from CockroachDB to PostgreSQL, MySQL, or Oracle (failback):
replicator start
Source connection strings
Follow the security recommendations in Connection security and credentials.
--sourceConn specifies the connection string of the source database for forward replication.
The source connection string must point to the primary instance of the source database. Replicas cannot provide the necessary replication checkpoints and transaction metadata required for ongoing replication.
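To confirm that you are connected to a primary rather than a replica, you can run a quick check before starting replication. This is a sketch only; the clients, placeholders, and $SOURCE variable are illustrative assumptions:

# PostgreSQL: returns f (false) on the primary and t on a replica.
psql "$SOURCE" -c 'SELECT pg_is_in_recovery();'

# MySQL: returns 0 on a writable primary; 1 usually indicates a read-only replica.
mysql --host={host} --user={user} --password -e 'SELECT @@GLOBAL.read_only;'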
PostgreSQL connection string:
--sourceConn 'postgresql://{username}:{password}@{host}:{port}/{database}'
MySQL connection string:
--sourceConn 'mysql://{username}:{password}@{protocol}({host}:{port})/{database}'
Oracle connection string:
--sourceConn 'oracle://{username}:{password}@{host}:{port}/{service_name}'
For Oracle Multitenant databases, --sourcePDBConn specifies the pluggable database (PDB) connection. --sourceConn specifies the container database (CDB):
--sourceConn 'oracle://{username}:{password}@{host}:{port}/{cdb_service_name}'
--sourcePDBConn 'oracle://{username}:{password}@{host}:{port}/{pdb_service_name}'
For failback, --stagingConn specifies the CockroachDB connection string:
--stagingConn 'postgresql://{username}:{password}@{host}:{port}/{database}'
Target connection strings
--targetConn specifies the connection string of the target CockroachDB database for forward replication:
--targetConn 'postgresql://{username}:{password}@{host}:{port}/{database}'
For failback, --targetConn specifies the original source database (PostgreSQL, MySQL, or Oracle). For details, refer to Failback to source database.
Replication checkpoints
MOLT Replicator requires a checkpoint value to start replication from the correct position in the source database's transaction log.
For PostgreSQL, use --slotName to specify the replication slot created during the data load. The slot automatically tracks the LSN (Log Sequence Number):
--slotName molt_slot
For MySQL, use --defaultGTIDSet with the GTID set from the MOLT Fetch output:
--defaultGTIDSet '4c658ae6-e8ad-11ef-8449-0242ac140006:1-29'
For Oracle, use --scn and --backfillFromSCN with the SCN values from the MOLT Fetch output:
--scn 26685786
--backfillFromSCN 26685444
Staging database
The staging database stores replication metadata, checkpoints, and buffered mutations. Specify the staging database with --stagingSchema and create it automatically with --stagingCreateSchema:
--stagingSchema _replicator
--stagingCreateSchema
The staging database is used to:
- Store checkpoints that enable resuming from the correct point after interruptions.
- Buffer mutations before applying them to the target in transaction order.
- Maintain consistency for time-ordered transactional batches while respecting table dependencies.
- Provide restart capabilities after failures.
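To see what Replicator stores, you can inspect the staging database directly. The following is a sketch, assuming the staging schema is named _replicator and that the $STAGING connection string points at the staging cluster:

# List the metadata tables that Replicator has created in the staging database.
cockroach sql --url "$STAGING" -e 'SHOW TABLES FROM _replicator'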
Security
Cockroach Labs strongly recommends the following:
Connection security and credentials
To keep your database credentials out of shell history and logs, follow these best practices when specifying your source and target connection strings:
- Avoid plaintext connection strings. Provide your connection strings as environment variables. For example:

  export SOURCE="postgres://migration_user:a%2452%26@localhost:5432/molt?sslmode=verify-full"
  export TARGET="postgres://root@localhost:26257/molt?sslmode=verify-full"

  Afterward, reference the environment variables in MOLT commands:

  --sourceConn $SOURCE --targetConn $TARGET

  If possible, use an external secrets manager to load the environment variables from stored secrets (see the sketch after this list).

- Use TLS-enabled connection strings to encrypt data in transit from MOLT to the database. When using TLS certificates, ensure certificate files are accessible to the MOLT binary on the same machine. For example, a PostgreSQL connection string with TLS certificates:

  postgresql://migration_user@db.example.com:5432/appdb?sslmode=verify-full&sslrootcert=/etc/molt/certs/ca.pem&sslcert=/etc/molt/certs/client.crt&sslkey=/etc/molt/certs/client.key

- URL-encode connection strings for the source database and CockroachDB so special characters in passwords are handled correctly. Given a password a$52&, pass it to the molt escape-password command with single quotes:

  molt escape-password --password 'a$52&'

  Use the encoded password in your connection string. For example:

  postgres://migration_user:a%2452%26@localhost:5432/replicationload

- Remove sslmode=disable from production connection strings.
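For example, the following sketch loads the connection strings from AWS Secrets Manager rather than hard-coding them. The secret IDs are hypothetical, and any secrets manager works equally well:

# Hypothetical secret IDs; each secret stores a full connection string.
export SOURCE=$(aws secretsmanager get-secret-value --secret-id molt/source-conn --query SecretString --output text)
export TARGET=$(aws secretsmanager get-secret-value --secret-id molt/target-conn --query SecretString --output text)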
CockroachDB changefeed security
For failback scenarios, secure the connection from CockroachDB to MOLT Replicator using TLS certificates. Generate TLS certificates using self-signed certificates, certificate authorities like Let's Encrypt, or your organization's certificate management system.
TLS from CockroachDB to Replicator
Configure MOLT Replicator with server certificates using the --tlsCertificate and --tlsPrivateKey flags to specify the certificate and private key file paths. For example:
replicator start \
--tlsCertificate ./certs/server.crt \
--tlsPrivateKey ./certs/server.key \
...
These server certificates must correspond to the client certificates specified in the changefeed webhook URL to ensure proper TLS handshake.
Encode client certificates for changefeed webhook URLs:
- Webhook URLs: Use both URL encoding and base64 encoding:

  base64 -i ./client.crt | jq -R -r '@uri'

- Non-webhook contexts: Use base64 encoding only:

  base64 -w 0 ca.cert
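As an illustration, the encoded values can be embedded as query parameters in the changefeed's webhook sink URI. This is a sketch only; the file names are hypothetical, and you should confirm the parameter names (ca_cert, client_cert, client_key) and the webhook path format against the CockroachDB changefeed sink documentation for your version:

# Encode the client certificate, key, and CA certificate for use in a webhook URL.
CLIENT_CERT=$(base64 -i ./client.crt | jq -R -r '@uri')
CLIENT_KEY=$(base64 -i ./client.key | jq -R -r '@uri')
CA_CERT=$(base64 -i ./ca.crt | jq -R -r '@uri')

# Pass the encoded values as query parameters on the webhook sink URI when creating the changefeed.
SINK="webhook-https://replicator-host:30004/migration_schema/public?client_cert=${CLIENT_CERT}&client_key=${CLIENT_KEY}&ca_cert=${CA_CERT}"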
JWT authentication
You can use JSON Web Tokens (JWT) to authorize incoming changefeed connections and restrict writes to a subset of SQL databases or user-defined schemas in the target cluster.
Replicator supports JWT claims that allow writes to specific databases, schemas, or all of them. JWT tokens must be signed using RSA or EC keys. HMAC and None signatures are automatically rejected.
To configure JWT authentication:
- Add PEM-formatted public signing keys to the _replicator.jwt_public_keys table in the staging database.
- To revoke a specific token, add its jti value to the _replicator.jwt_revoked_ids table in the staging database.
The Replicator process re-reads these tables every minute to pick up changes.
To pass the JWT token from the changefeed to the Replicator webhook sink, use the webhook_auth_header option:
CREATE CHANGEFEED ... WITH webhook_auth_header='Bearer <encoded_token>';
Token quickstart
The following example uses OpenSSL to generate keys, but any PEM-encoded RSA or EC keys will work.
# Generate an EC private key using OpenSSL.
openssl ecparam -out ec.key -genkey -name prime256v1
# Write the public key components to a separate file.
openssl ec -in ec.key -pubout -out ec.pub
# Upload the public key for all instances of Replicator to find it.
cockroach sql -e "INSERT INTO _replicator.jwt_public_keys (public_key) VALUES ('$(cat ec.pub)')"
# Reload configuration, or wait one minute.
killall -HUP replicator
# Generate a token which can write to the ycsb.public schema.
# The key can be decoded using the debugger at https://jwt.io.
# Add the contents of out.jwt to the CREATE CHANGEFEED command:
# WITH webhook_auth_header='Bearer {out.jwt}'
replicator make-jwt -k ec.key -a ycsb.public -o out.jwt
External JWT providers
The make-jwt command also supports a --claim flag, which prints a JWT claim that can be signed by your existing JWT provider. The PEM-formatted public key or keys for that provider must be inserted into the _replicator.jwt_public_keys table. The iss (issuers) and jti (token id) fields will likely be specific to your auth provider, but the custom claim must be retained in its entirety.
You can repeat the -a flag to create a claim for multiple schemas:
replicator make-jwt -a 'database.schema' --claim
{
"iss": "replicator",
"jti": "d5ffa211-8d54-424b-819a-bc19af9202a5",
"https://github.com/cockroachdb/replicator": {
"schemas": [
[
"database",
"schema"
]
]
}
}
For details on the make-jwt command flags, refer to make-jwt flags.
Production considerations
- Avoid the --disableAuthentication and --tlsSelfSigned flags in production environments. These flags should only be used for testing or development purposes.
Supply chain security
Use the version command to verify the integrity of your MOLT Replicator build and identify potential upstream vulnerabilities.
replicator version
The output includes:
- Module name
- go.mod checksum
- Version
Use this information to determine if your build may be subject to vulnerabilities from upstream packages. Cockroach Labs uses Dependabot to automatically upgrade Go modules, and the team regularly merges Dependabot updates to address security issues.
Common workflows
Forward replication with initial load
To start replication after an initial data load with MOLT Fetch, use the pglogical command:
replicator pglogical
To start replication after an initial data load with MOLT Fetch, use the mylogical command:
replicator mylogical
To start replication after an initial data load with MOLT Fetch, use the oraclelogminer command:
replicator oraclelogminer
Specify the source and target database connections. For connection string formats, refer to Source connection strings and Target connection strings:
--sourceConn $SOURCE
--targetConn $TARGET
For Oracle Multitenant databases, also specify the PDB connection:
--sourcePDBConn $SOURCE_PDB
Specify the source Oracle schema to replicate from:
--sourceSchema migration_schema
To replicate from the correct position, specify the appropriate checkpoint value.
Use --slotName to specify the slot created during the data load, which automatically tracks the LSN (Log Sequence Number) checkpoint:
--slotName molt_slot
Use --defaultGTIDSet from the cdc_cursor field in the MOLT Fetch output:
--defaultGTIDSet '4c658ae6-e8ad-11ef-8449-0242ac140006:1-29'
Use the --scn and --backfillFromSCN values from the MOLT Fetch output:
--scn 26685786
--backfillFromSCN 26685444
Use --stagingSchema to specify the staging database. Use --stagingCreateSchema to create it automatically on first run:
--stagingSchema _replicator
--stagingCreateSchema
At minimum, the replicator command should include the following flags:
replicator pglogical \
--sourceConn $SOURCE \
--targetConn $TARGET \
--slotName molt_slot \
--stagingSchema _replicator \
--stagingCreateSchema
For detailed steps, refer to Load and replicate.
replicator mylogical \
--sourceConn $SOURCE \
--targetConn $TARGET \
--defaultGTIDSet '4c658ae6-e8ad-11ef-8449-0242ac140006:1-29' \
--stagingSchema _replicator \
--stagingCreateSchema
For detailed steps, refer to Load and replicate.
replicator oraclelogminer \
--sourceConn $SOURCE \
--sourcePDBConn $SOURCE_PDB \
--sourceSchema migration_schema \
--targetConn $TARGET \
--scn 26685786 \
--backfillFromSCN 26685444 \
--stagingSchema _replicator \
--stagingCreateSchema
For detailed steps, refer to Load and replicate.
Resume after interruption
When resuming replication after an interruption, MOLT Replicator automatically uses the stored checkpoint to resume from the correct position.
Rerun the same replicator command used during forward replication, specifying the same --stagingSchema value as before. Omit --stagingCreateSchema and any checkpoint flags. For example:
replicator pglogical \
--sourceConn $SOURCE \
--targetConn $TARGET \
--slotName molt_slot \
--stagingSchema _replicator
For detailed steps, refer to Resume replication.
replicator mylogical \
--sourceConn $SOURCE \
--targetConn $TARGET \
--stagingSchema _replicator
For detailed steps, refer to Resume replication.
replicator oraclelogminer \
--sourceConn $SOURCE \
--sourcePDBConn $SOURCE_PDB \
--sourceSchema migration_schema \
--targetConn $TARGET \
--stagingSchema _replicator
For detailed steps, refer to Resume replication.
Failback to source database
When replicating from CockroachDB back to the source database, MOLT Replicator acts as a webhook sink for a CockroachDB changefeed.
Use the start command for failback:
replicator start
Specify the target database connection (the database you originally migrated from). For connection string formats, refer to Target connection strings:
--targetConn $TARGET
Specify the CockroachDB connection string. For details, refer to Connect using a URL:
--stagingConn $STAGING
Specify the staging database name. This should be the same staging database created during Forward replication with initial load:
--stagingSchema _replicator
Specify a webhook endpoint address for the changefeed to send changes to. For example:
--bindAddr :30004
Specify TLS certificate and private key file paths for secure webhook connections:
--tlsCertificate ./certs/server.crt
--tlsPrivateKey ./certs/server.key
At minimum, the replicator command should include the following flags:
replicator start \
--targetConn $TARGET \
--stagingConn $STAGING \
--stagingSchema _replicator \
--bindAddr :30004 \
--tlsCertificate ./certs/server.crt \
--tlsPrivateKey ./certs/server.key
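With Replicator listening on the webhook endpoint, create the changefeed on the source CockroachDB cluster and point it at that address. The following is a minimal sketch; the $SOURCE_CRDB variable, table name, database/schema path, and certificate placeholders are illustrative, and the full setup is covered in Migration failback and CockroachDB changefeed security:

cockroach sql --url "$SOURCE_CRDB" -e "
  CREATE CHANGEFEED FOR TABLE migration_schema.public.employees
  INTO 'webhook-https://replicator-host:30004/migration_schema/public?client_cert={encoded_cert}&client_key={encoded_key}&ca_cert={encoded_ca}'
  WITH updated, resolved = '1s';
"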
For detailed steps, refer to Migration failback.
Monitoring
Metrics
MOLT Replicator can export Prometheus metrics by setting the --metricsAddr flag to a port (for example, --metricsAddr :30005). Metrics are not enabled by default. When enabled, metrics are available at the path /_/varz. For example: http://localhost:30005/_/varz.
For a list of recommended metrics to monitor during replication, refer to:
- Forward replication metrics (PostgreSQL, MySQL, and Oracle sources)
- Failback replication metrics (CockroachDB source)
You can use the Replicator Grafana dashboard to visualize the metrics. For Oracle-specific metrics, import the Oracle Grafana dashboard.
To check MOLT Replicator health when metrics are enabled, run curl http://localhost:30005/_/healthz (replacing the port with your --metricsAddr value). This returns a status code of 200 if Replicator is running.
Logging
By default, MOLT Replicator writes two streams of logs: operational logs to stdout (including warning, info, trace, and some errors) and final errors to stderr.
Redirect both streams to ensure all logs are captured for troubleshooting:
# Merge both streams to console
./replicator ... 2>&1
# Redirect both streams to a file
./replicator ... > output.log 2>&1
# Merge streams to console while saving to file
./replicator > >(tee replicator.log) 2>&1
# Use logDestination flag to write all logs to a file
./replicator --logDestination replicator.log ...
Enable debug logging with -v. For more granularity and system insights, enable trace logging with -vv. Pay close attention to warning- and error-level logs, as these indicate when Replicator is misbehaving.
Best practices
Test and validate
To verify that your connections and configuration work properly, run MOLT Replicator in a staging environment before replicating any data in production. Use a test or development environment that closely resembles production.
Optimize performance
Configure the following replicator flags to optimize replication throughput and resource usage. Test different combinations in a pre-production environment to find the optimal balance of stability and performance for your workload.
The following parameters apply to PostgreSQL, Oracle, and CockroachDB (failback) sources.
| Flag | Description |
|---|---|
| --parallelism | Control the maximum number of concurrent target transactions. Higher values increase throughput but require more target connections. Start with a conservative value and increase based on target database capacity. |
| --flushSize | Balance throughput and latency. Controls how many mutations are batched into each query to the target. Increase for higher throughput at the cost of higher latency. |
| --targetApplyQueueSize | Control memory usage during operation. Increase to allow higher throughput at the expense of memory; decrease to apply backpressure and limit memory consumption. |
| --targetMaxPoolSize | Set larger than --parallelism by a safety factor to avoid exhausting target pool connections. Replicator enforces setting parallelism to 80% of this value. |
| --collapseMutations | Reduce the number of queries to the target by combining multiple mutations on the same primary key within each batch. Disable only if exact mutation order matters more than end state. |
| --enableParallelApplies | Improve apply throughput for independent tables and table groups that share foreign key dependencies. Increases memory and target connection usage, so ensure you increase --targetMaxPoolSize or reduce --parallelism. |
| --flushPeriod | Set to the maximum allowable time between flushes (for example, 10s if data must be applied within 10 seconds). Works with --flushSize to control when buffered mutations are committed to the target. |
| --quiescentPeriod | Lower this value if constraint violations resolve quickly on your workload to make retries more frequent and reduce latency. Do not lower if constraint violations take time to resolve. |
| --scanSize | Applies to failback (replicator start) scenarios only. Balance memory usage and throughput. Increase to read more rows at once from the CockroachDB staging cluster for higher throughput, at the cost of memory pressure. Decrease to reduce memory pressure and increase stability. |
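For example, a forward-replication invocation that trades memory and target connections for throughput might look like the following. The values are illustrative starting points rather than recommendations; tune them against your own workload:

replicator pglogical \
  --sourceConn $SOURCE \
  --targetConn $TARGET \
  --slotName molt_slot \
  --stagingSchema _replicator \
  --parallelism 32 \
  --targetMaxPoolSize 64 \
  --flushSize 2000 \
  --flushPeriod 5s \
  --enableParallelApplies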
Troubleshooting
Forward replication issues
Performance troubleshooting
If MOLT Replicator appears hung or performs poorly:
- Enable trace logging with -vv to get more visibility into the replicator's state and behavior.
- If MOLT Replicator is in an unknown, hung, or erroneous state, collect performance profiles to include with support tickets. Replace {host} and {metrics-port} with your Replicator host and the port specified by --metricsAddr:

  curl '{host}:{metrics-port}/debug/pprof/trace?seconds=15' > trace.out
  curl '{host}:{metrics-port}/debug/pprof/profile?seconds=15' > profile.out
  curl '{host}:{metrics-port}/debug/pprof/goroutine?seconds=15' > gr.out
  curl '{host}:{metrics-port}/debug/pprof/heap?seconds=15' > heap.out

- Monitor lag metrics and adjust performance parameters as needed.
Unable to create publication or slot
This error occurs when logical replication is not supported.
Resolution: If you are connected to a replica, connect to the primary instance instead. Replicas cannot create or manage logical replication slots or publications.
Verify that the source database supports logical replication by checking the wal_level parameter on PostgreSQL:
SHOW wal_level;
If wal_level is not set to logical, update it and restart PostgreSQL:
ALTER SYSTEM SET wal_level = 'logical';
Replication slot already exists
ERROR: replication slot "molt_slot" already exists
Resolution: Either create a new slot with a different name, or drop the existing slot to start fresh:
SELECT pg_drop_replication_slot('molt_slot');
Dropping a replication slot can be destructive and delete data that is not yet replicated. Only use this if you want to restart replication from the current position.
Publication does not exist
run CREATE PUBLICATION molt_fetch FOR ALL TABLES;
Resolution: Create the publication on the source database. Ensure you also create the replication slot:
CREATE PUBLICATION molt_publication FOR ALL TABLES;
SELECT pg_create_logical_replication_slot('molt_slot', 'pgoutput');
Could not connect to PostgreSQL
could not connect to source database: failed to connect to `user=migration_user database=source_database`
Resolution: Verify the connection details including user, host, port, and database name. Ensure the database name in your --sourceConn connection string matches exactly where you created the publication and slot. Verify you're connecting to the same host and port where you ran the CREATE PUBLICATION and SELECT pg_create_logical_replication_slot() commands. Check if TLS certificates need to be included in the connection URI.
Wrong replication slot name
run SELECT pg_create_logical_replication_slot('molt_slot', 'pgoutput'); in source database
Resolution: Create the replication slot or verify the correct slot name:
SELECT pg_create_logical_replication_slot('molt_slot', 'pgoutput');
Repeated binlog syncing restarts
If Replicator repeatedly restarts binlog syncing or starts replication from an unexpectedly old location, this indicates an invalid or purged GTID. When an invalid GTID is provided, the binlog syncer will fall back to the first valid GTID.
Resolution: Verify the GTID set is valid and not purged:
-- Check if GTID is in executed set
SELECT GTID_SUBSET('your-gtid-set', @@GLOBAL.gtid_executed) AS in_executed;
-- Check if GTID is purged
SELECT GTID_SUBSET('your-gtid-set', @@GLOBAL.gtid_purged) AS in_purged;
Interpret the results as follows:
- If in_executed returns 1 and in_purged returns 0, the GTID is valid for replication.
- If in_purged returns 1, the GTID has been purged and you must find a newer consistent point.
- If both return 0, the GTID doesn't exist in the records and is invalid.
If the GTID is purged or invalid, follow these steps:
1. Increase binlog retention by configuring binlog_expire_logs_seconds in MySQL:

   -- Increase binlog retention (example: 7 days = 604800 seconds)
   SET GLOBAL binlog_expire_logs_seconds = 604800;

   Note: For managed MySQL services (such as Amazon RDS, Google Cloud SQL, or Azure Database for MySQL), binlog retention is typically configured through the provider's console or CLI. Consult your provider's documentation for how to adjust binlog retention settings.

2. Get a current GTID set to restart replication:

   -- For MySQL < 8.0:
   SHOW MASTER STATUS;
   -- For MySQL 8.0+:
   SHOW BINARY LOG STATUS;

   +---------------+----------+--------------+------------------+-------------------------------------------+
   | File          | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                         |
   +---------------+----------+--------------+------------------+-------------------------------------------+
   | binlog.000005 | 197      |              |                  | 77263736-7899-11f0-81a5-0242ac120002:1-38 |
   +---------------+----------+--------------+------------------+-------------------------------------------+

   Use the Executed_Gtid_Set value for the --defaultGTIDSet flag.
Invalid GTID format
Invalid GTIDs can occur when GTIDs are purged due to insufficient binlog retention, when connecting to a replica instead of the primary host, or when passing a GTID that has valid format but doesn't exist in the binlog history.
Resolution: Use a valid GTID from SHOW MASTER STATUS (MySQL < 8.0) or SHOW BINARY LOG STATUS (MySQL 8.0+) and ensure you're connecting to the primary host. If GTIDs are being purged, increase binlog retention.
Table/column names exceed 30 characters
Oracle LogMiner excludes tables and columns with names longer than 30 characters from redo logs.
Resolution: Rename tables and columns to 30 characters or fewer before migration.
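To find affected objects ahead of time, you can query the Oracle data dictionary for long identifiers. This is a sketch, assuming sqlplus access and a source schema named MIGRATION_SCHEMA; substitute your own connection details and schema owner:

sqlplus -S {username}/{password}@//{host}:{port}/{service_name} <<'SQL'
-- Tables whose names exceed 30 characters.
SELECT table_name FROM all_tables WHERE owner = 'MIGRATION_SCHEMA' AND LENGTH(table_name) > 30;
-- Columns whose names exceed 30 characters.
SELECT table_name, column_name FROM all_tab_columns WHERE owner = 'MIGRATION_SCHEMA' AND LENGTH(column_name) > 30;
SQL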
Unsupported data types
LogMiner and replication do not support:
- Long BLOB/CLOBs (4000+ characters)
- User-defined types (UDTs)
- Nested tables
- Varrays
- GEOGRAPHY and GEOMETRY
Resolution: Convert unsupported data types or exclude affected tables from replication.
LOB column UPDATE statements
UPDATE statements that only modify LOB columns are not supported by Oracle LogMiner.
Resolution: Avoid LOB-only updates during replication, or use Binary Reader for Oracle 12c.
JSONB null handling
SQL NULL and JSON null values are not distinguishable in JSON payloads during replication.
Resolution: Avoid using nullable JSONB columns where the distinction between SQL NULL and JSON null is important.
Missing redo logs or unavailable SCN
If the Oracle redo log files are too small or do not retain enough history, you may get errors indicating that required log files are missing for a given SCN range, or that a specific SCN is unavailable.
Resolution: Increase the number and size of online redo log files, and verify that archived log files are being generated and retained correctly in your Oracle environment.
Replicator lag
If the replicator process is lagging significantly behind the current Oracle SCN, you may see log messages like: replicator is catching up to the current SCN at 5000 from 1000…. This indicates that replication is progressing but is still behind the most recent changes on the source database.
Schema drift errors
Indicates source and target schemas are mismatched:
WARNING: schema drift detected in "database"."table" at payload object offset 0: unexpected columns: column_name
Resolution: Align schemas or use userscripts to transform data.
Apply flow failures
Apply flow failures occur when the target database encounters error conditions such as unique constraint violations, target database being unavailable, or incorrect data (missing or extraneous columns) during apply operations:
WARNING: warning during tryCommit: ERROR: duplicate key value violates unique constraint
ERROR: maximum number of retries (10) exceeded
Resolution: Check target database constraints and connection stability. MOLT Replicator will log warnings for each retry attempt. If you see warnings but no final error, the apply succeeded after retrying. If all retry attempts are exhausted, Replicator will surface a final error and restart the apply loop to continue processing.
Failback issues
If the changefeed shows connection errors in SHOW CHANGEFEED JOB:
Connection refused
transient error: Post "https://replicator-host:30004/molt/public": dial tcp [::1]:30004: connect: connection refused
This indicates that Replicator is down, the webhook URL is incorrect, or the port is misconfigured.
Resolution: Verify that MOLT Replicator is running on the port specified in the changefeed INTO configuration. Confirm the host and port are correct.
Incorrect schema path errors
This error occurs when the CockroachDB changefeed webhook URL path does not match the target database schema naming convention:
transient error: 400 Bad Request: unknown schema:
The webhook URL path is specified in the INTO clause when you create the changefeed. For example: webhook-https://replicator-host:30004/database/schema.
Resolution: Verify the webhook path format matches your target database type:
- PostgreSQL or CockroachDB targets: Use /database/schema format. For example, webhook-https://replicator-host:30004/migration_schema/public.
- MySQL targets: Use /database format (schema is implicit). For example, webhook-https://replicator-host:30004/migration_schema.
- Oracle targets: Use /DATABASE format in uppercase. For example, webhook-https://replicator-host:30004/MIGRATION_SCHEMA.
For details on configuring the webhook sink URI, refer to Webhook sink.
GC threshold error
batch timestamp * must be after replica GC threshold
This indicates starting from an invalid cursor that has been garbage collected.
Resolution: Double-check the cursor to ensure it represents a valid range that has not been garbage collected, or extend the GC TTL on the source CockroachDB cluster:
ALTER DATABASE defaultdb CONFIGURE ZONE USING gc.ttlseconds = {gc_ttl_in_seconds};
Duplicated data re-application
This occurs when resuming a changefeed from a cursor causes excessive data duplication.
Resolution: Clear the staging database to prevent duplication. This deletes all checkpoints and buffered data, so use with caution:
DROP DATABASE _replicator;
For more targeted cleanup, delete mutations from specific staging tables:
DELETE FROM _replicator.employees WHERE true;
Examples
For detailed examples of MOLT Replicator usage, refer to the migration workflow tutorials:
- Load and Replicate: Load data with MOLT Fetch and set up ongoing replication with MOLT Replicator.
- Resume Replication: Resume replication after an interruption.
- Migration failback: Replicate changes from CockroachDB back to the initial source database.