A Django package to sync tables across environments using PostgreSQL logical replication.
## Overview

This package enables PostgreSQL logical replication to sync specific tables across environments while supporting:
- Relationships between synced and non-synced tables
- Denormalization of non-synced tables triggered by synced tables
- Deletion propagation across environments
- Non-destructive migration of existing environments
Contents:

- Overview
- Key Concepts
- Setup
- Continuous Deployment
- Delete Models
- Denormalize Models
- Advanced Publication Options
- Caveats
- Tests
- Non-Destructive Manual Resync
- Common Issues & Solutions
## Key Concepts

| Term | Description |
|---|---|
| Master Environment | Source of truth (IS_MASTER_ENV = True) |
| Slave Environment | Receives synced data (IS_MASTER_ENV = False) |
| User Model | Non-synced model (default) |
| System Model | Model synced across environments via @system_model |
| Delete Model | System model with managed deletions via @delete_model |
| Denormalize Model | Delete model with denormalization via @denormalize_model |
## Setup

Install the package:

```shell
# requirements.txt
pip install git+https://github.com/selfdecode/django-logical-replication.git@v2.0.2
```

Add to installed apps:

```python
# settings.py
INSTALLED_APPS = [
    "...",
    "logical_replication",
]
```

Designate the master environment:

```python
# settings.py
IS_MASTER_ENV = os.environ["IS_MASTER_ENV"] == "true"
```

Register the models to sync. Either use a decorator:
```python
# models.py
@system_model
class Unit(models.Model):
    pass
```

Or register in settings:
```python
# settings.py
ADDITIONAL_SYSTEM_MODELS = ["dummy_app.Unit"]
ADDITIONAL_DELETE_MODELS = ["dummy_app.Marker"]
ADDITIONAL_DENORMALIZE_MODELS = ["dummy_app.Outcome_sub_outcomes"]  # many2many
```

Hint: you can run `update_publication --dry-run` to get a list of all user models. Make sure nothing you want to sync is included in there.
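The decorator registration above can be sketched in plain Python. This is a hypothetical illustration of how a registry decorator like `@system_model` can work, not the package's actual internals:

```python
# Hypothetical sketch: a registration decorator records the decorated class
# in a module-level registry and returns the class unchanged, so the model
# still behaves normally while the package knows it must be synced.
SYSTEM_MODELS = []

def system_model(cls):
    SYSTEM_MODELS.append(f"{cls.__module__}.{cls.__name__}")
    return cls

@system_model
class Unit:  # stand-in for a Django model
    pass

print(SYSTEM_MODELS)  # one entry, ending in ".Unit"
```

The settings-based `ADDITIONAL_SYSTEM_MODELS` list achieves the same end result: a known set of model paths the package can translate into publication tables.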
This router controls migrations on slave environments for synced Django contrib models. It should be placed first:

```python
# settings.py
DATABASE_ROUTERS = ["logical_replication.router.LogicalReplicationRouter"]
```

The following cron classes are required only if you're using delete models or denormalize models. They handle:
- Delete propagation across environments (for delete models)
- Denormalization updates (for denormalize models)
```python
# settings.py
CRON_CLASSES = [
    # ...
    "logical_replication.cron.ProcessDeleteQueue",       # Required for delete models
    "logical_replication.cron.ProcessDenormalizeQueue",  # Required for denormalize models
]
```

Skip this step if you are only using system models without delete/denormalize functionality.

Dependencies: this feature requires the django-cron package. If you're using these cron classes, you'll need to:

- Add `django-cron` to `INSTALLED_APPS`
- Periodically run `python manage.py runcrons` (see the django-cron docs for more details)
Steps:

1. Set up tables on the new slave environment:

   ```shell
   manage.py create_django_contrib_tables  # must run BEFORE migrate
   manage.py migrate
   ```

2. Create an empty publication on the master:

   ```shell
   manage.py create_publication  # if it doesn't exist
   ```

3. Create a subscription on the slave (connection details below):

   ```shell
   manage.py create_subscription
   ```

4. Update the publication on the master with the tables to sync:

   ```shell
   manage.py update_publication
   ```

5. Update the subscription on the slave:

   ```shell
   manage.py update_subscription  # This will trigger the initial copy of tables!
   ```

See dummy_app.tests.py for an example (steps 2-3 are already done in docker-compose).
The initial sync uses a simple `COPY FROM`, so it will fail if any constraints are violated (e.g., primary key or unique constraints).

### Migrating an Existing Environment

Caveats:

- The database will be non-operational for a while
- Although user data will not be lost, newly missing FKs will be handled as though `on_delete` was called
Steps:

1. Create an empty publication on the master:

   ```shell
   manage.py create_publication  # if it doesn't exist
   ```

2. Create a subscription on the slave (connection details below):

   ```shell
   manage.py create_subscription
   ```

3. Build the delete SQL:

   ```shell
   print_delete_sql > delete.sql
   ```

4. Run `delete.sql` on the slave environment:
   - This will delete all data in system tables on the slave
   - Warning: does NOT handle row filters

5. Update the publication on the master with the tables to sync:

   ```shell
   manage.py update_publication
   ```

6. Update the subscription on the slave:

   ```shell
   manage.py update_subscription  # This will trigger the initial copy of tables!
   ```

7. Resync missing FKs on the slave:

   ```shell
   manage.py resync_user_table_fks  # Note: may DELETE user data for hanging rows
   ```
### Subscription Connection String

To create a subscription, a connection string must be provided so the slave database can connect to the master. For example:

```
'dbname=reports host=master user=user password=password'
```

Important: subscriptions can only be managed by a superuser. Your Django app must have superuser access to use the `create_subscription` command.

Custom connection string:

```shell
manage.py create_subscription -c 'dbname=reports host=master user=user password=password'
```

If no string is provided, credentials will be fetched from AWS Secrets Manager using:

- `REPLICATION_CONNECTION_SECRET`
- `REPLICATION_CONNECTION_SECRET_REGION` (defaults to "us-east-1")

The secret should contain all the keys above (`dbname` will default to `PROJECT_SLUG`).

Note: boto3 must be installed to use this feature.
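As a rough sketch of how such a secret can be turned into a libpq connection string (the function name and exact assembly here are illustrative, not the package's actual code):

```python
import json

# Illustrative secret payload, as it might be stored in AWS Secrets Manager.
# In production the package would fetch it with boto3, e.g.:
#   boto3.client("secretsmanager", region_name=region).get_secret_value(SecretId=name)
secret_payload = json.dumps(
    {"host": "master", "user": "user", "password": "password"}
)

def build_connection_string(payload: str, default_dbname: str = "reports") -> str:
    """Build a 'key=value key=value' libpq connection string from a JSON secret."""
    creds = json.loads(payload)
    creds.setdefault("dbname", default_dbname)  # dbname falls back to a default (e.g. PROJECT_SLUG)
    return " ".join(f"{key}={value}" for key, value in sorted(creds.items()))

print(build_connection_string(secret_payload))
# dbname=reports host=master password=password user=user
```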
## Continuous Deployment

The following order is recommended:

1. Deploy code changes + migrate the db on the master
2. `manage update_publication` on the master
3. `manage populate_content_types` on the master (see Common Issues & Solutions below)
4. Deploy code changes + migrate the db on the slave
5. `manage update_subscription` on the slave

On error, there will be automatic retries, so a slight delay in deployment finish across master and slave should resolve itself.

`update_subscription` must be called after `update_publication` for any new tables to be synced.
## Delete Models

Only insert and update operations on delete models are synced automatically to slave environments. Delete operations are instead logged to the synced `DeleteQueue` table:

- When a delete model is deleted on the master environment, the operation is logged to the `DeleteQueue` table
- A cron job that runs only on slave environments processes the delete queue
- Each object is manually deleted through the `.delete()` method
- Any dependent models are handled according to Django's `on_delete` configuration
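The delete-queue flow above can be sketched in plain Python. `Marker` and the queue here are hypothetical stand-ins, not the package's API; the point is that the slave replays queued deletes through `.delete()` so Django's `on_delete` behavior still applies:

```python
# Conceptual sketch of the delete-queue mechanism: the master logs deleted
# primary keys to a synced queue; a cron on each slave replays them.
class Marker:
    instances = {}  # stand-in for rows in the slave's table

    def __init__(self, pk):
        self.pk = pk
        Marker.instances[pk] = self

    def delete(self):
        # In Django, calling .delete() would also cascade per on_delete rules.
        Marker.instances.pop(self.pk, None)

def process_delete_queue(queue):
    """Replay queued deletes on a slave, then clear the processed rows."""
    for pk in queue:
        obj = Marker.instances.get(pk)
        if obj is not None:
            obj.delete()
    queue.clear()

Marker(1), Marker(2)
delete_queue = [1]  # master deleted pk=1; the queue row was synced to the slave
process_delete_queue(delete_queue)
print(sorted(Marker.instances))  # [2] -- only pk=2 survives
```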
Important: Any system model upon which user models depend (via a FK or OneToOneField) must be marked as a delete model. This package will throw an error if it detects this situation.
Note: although the delete queue on slave environments is deleted after processing, by default the master queue is kept indefinitely. To expire master queue rows, add the following to settings.py:

```python
EXPIRE_DELETE_QUEUE_AFTER = 30  # days
```

## Denormalize Models

Denormalize models are designed to accommodate denormalization in user models based on data stored in system models. When a system model used for denormalization is changed, the user models must be updated (or marked as stale to be updated later).
- When a denormalize model is updated or inserted, the operation is logged to the synced `DenormalizeQueue` table
- A cron job that runs only on slave environments processes the denormalize queue
- The cron job will call `.save()`, thereby triggering the denormalize post-save signal on the slave environments

Denormalize models are also delete models, so denormalization due to `post_delete` will also be handled.
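A plain-Python sketch of the denormalize-queue flow, with hypothetical stand-ins (`Outcome` and `user_summary` are illustrative, not the package's API): the slave-side cron re-saves each queued object, and a post-save hook refreshes the denormalized user-model data.

```python
# Conceptual sketch: re-saving a queued system object on the slave triggers
# a post-save hook that refreshes denormalized data in a user model.
user_summary = {}  # stand-in for denormalized fields on a user model

class Outcome:
    instances = {}

    def __init__(self, pk, name):
        self.pk, self.name = pk, name
        Outcome.instances[pk] = self

    def save(self):
        # Stand-in for Django's post_save signal firing on .save().
        user_summary[self.pk] = self.name.upper()

def process_denormalize_queue(queue):
    """Re-save each queued object, then clear the processed rows."""
    for pk in queue:
        Outcome.instances[pk].save()  # triggers the denormalize hook
    queue.clear()

Outcome(1, "anemia")
denormalize_queue = [1]  # master changed pk=1; queue row synced to the slave
process_denormalize_queue(denormalize_queue)
print(user_summary)  # {1: 'ANEMIA'}
```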
Note: although the denormalize queue on slave environments is deleted after processing, by default the master queue is kept indefinitely. To expire master queue rows, add the following to settings.py:

```python
EXPIRE_DENORMALIZE_QUEUE_AFTER = 30  # days
```

## Advanced Publication Options

PostgreSQL supports syncing only a subset of columns or a subset of rows. See details in the PostgreSQL documentation.
You can specify filters via the `ADDITIONAL_PUBLICATION_SETTINGS` constant in settings.py. The format is `{model_str: sql}`. The SQL should be correctly encoded using the `sql` module from the psycopg2 package.

Example:

```python
from psycopg2 import sql

ADDITIONAL_PUBLICATION_SETTINGS = {
    "dummy_app.Outcome": sql.SQL("WHERE ({col_name} != 'test')").format(
        col_name=sql.Identifier("name")
    )
}
```

Important notes:

- You may need to run `ALTER TABLE table_name REPLICA IDENTITY FULL;` when using a row filter
- To skip validation that blocks system models from having FKs to user models:

  ```python
  SKIP_VALIDATION_MODELS = ["dummy_app.Outcome"]
  ```
## Caveats

Duplicate primary keys or violated unique constraints will throw an exception and block all subsequent syncing.

Solutions:

- If you must insert data, use UUIDs instead of auto-incrementing primary keys
- The package provides admin utilities to prevent inserting/editing on slave environments:

  ```python
  @admin_register
  class MyModelAdmin(admin.ModelAdmin):
      # This decorator automatically blocks model changes on slave environments
      pass
  ```
Certain Django contrib models must be synced:

- `django_content_types`
- All tables in the auth app

These are synced by default, as there is no reliable way to avoid this without risking Django content types becoming out of sync.
## Tests

You can disable logical replication signals during testing by adding the following to settings.py:

```python
DISABLE_LOGICAL_REPLICATION_SIGNALS = True
```

Important: delete and denormalize models require signals to be enabled for proper synchronization.
## Non-Destructive Manual Resync

1. Generate the `pg_dump` command:

   ```shell
   manage.py print_dump_system_tables
   ```

2. Drop existing system tables:

   ```shell
   manage.py print_delete_sql | psql your_database
   ```

3. Apply the dump to the target environment

4. Resync foreign keys:

   ```shell
   manage.py resync_user_table_fks
   ```

Alternatively:

1. Drop the subscriptions
2. Drop the system tables
3. Re-create the subscriptions (this triggers the initial data copy)
4. Resync foreign keys:

   ```shell
   manage.py resync_user_table_fks
   ```
## Common Issues & Solutions

### Duplicate Primary Keys

Problem: auto-incrementing IDs can cause sync conflicts in slave environments.

Solution: use UUIDs instead of auto-incrementing IDs when inserting data in slave environments.
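A minimal sketch of why UUID primary keys avoid this conflict: two environments that both auto-increment from the same point pick colliding ids, while independently generated UUIDs effectively never collide.

```python
import uuid

# Auto-increment: master and slave each insert a row and both pick id 1,
# so replicating either row into the other environment conflicts.
master_next_id = slave_next_id = 1
print(master_next_id == slave_next_id)  # True -> sync conflict

# UUIDs: each side generates its own key independently.
master_id, slave_id = uuid.uuid4(), uuid.uuid4()
print(master_id == slave_id)  # False -> no conflict on sync
```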
### Broken Foreign Keys

Problem: foreign key relationships may break during synchronization.

Solutions:

1. Run the resync command:

   ```shell
   manage.py resync_user_table_fks
   ```

2. Verify the system model configuration:
   - Check that all required system models are properly marked
   - Review foreign key relationships between system and user models
### Content Type Conflicts

Problem: Django content types are lazily created, meaning they're only created when first accessed. If a slave environment accesses a content type before the master environment does, it will create its own content type record, which leads to a unique constraint violation when the master's record is synced.

Solution: force content type creation on the master environment by running:

```shell
python manage.py populate_content_types
```

When to run: execute this command on the master environment:

- After running migrations
- After updating the publication
- Before any slave environments access the system

This ensures content types are created on the master first and properly synced to slave environments.