Database Migrations Without Downtime
Strategies for evolving schemas in production systems that can't afford to stop.
In enterprise systems processing thousands of transactions per hour, "schedule a maintenance window" isn't always an option. Database schema changes must happen while the system is running, serving users, and processing data.
The Expand-Contract Pattern
A widely used, low-risk approach to zero-downtime migrations is expand-contract, which splits every schema change into three phases:
Expand: Add new columns, tables, or indexes alongside existing ones. Never remove or rename in the first step.
Migrate: Backfill data, update application code to use new structures, deploy progressively.
Contract: Once all application instances use the new schema, remove the old structures.
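This pattern trades speed for safety. Each migration takes longer, but every step is small, stays compatible with the application code that is already running, and can be verified before the next step begins.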
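The three phases can be sketched in a few lines. This is a minimal illustration rather than a production migration tool: it uses Python's built-in sqlite3 module so it runs anywhere, and the tables (users, user_emails), the new email column, and the batch size are all hypothetical names chosen for the example.

```python
import sqlite3


def expand(conn):
    # Expand: add the new column next to the old table. Nothing is removed or renamed.
    conn.execute("ALTER TABLE users ADD COLUMN email TEXT")


def backfill(conn, batch_size=2):
    # Migrate: copy data in small batches so each transaction stays short.
    total = 0
    while True:
        with conn:  # commits after every batch, rolls back on error
            cur = conn.execute(
                "UPDATE users SET email = "
                "(SELECT email FROM user_emails WHERE user_id = users.id) "
                "WHERE id IN ("
                "  SELECT u.id FROM users u "
                "  JOIN user_emails e ON e.user_id = u.id "
                "  WHERE u.email IS NULL AND e.email IS NOT NULL LIMIT ?)",
                (batch_size,),
            )
        if cur.rowcount == 0:
            return total
        total += cur.rowcount


def contract(conn):
    # Contract: drop the old structure only after no deployed code reads or writes it.
    conn.execute("DROP TABLE user_emails")


def demo():
    # Hypothetical starting schema: emails live in a separate legacy table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE user_emails (user_id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace"), (3, "Edsger")],
    )
    conn.executemany(
        "INSERT INTO user_emails VALUES (?, ?)",
        [(1, "ada@example.com"), (2, "grace@example.com"), (3, "edsger@example.com")],
    )
    conn.commit()

    expand(conn)
    copied = backfill(conn)
    contract(conn)
    return conn, copied
```

In a real rollout the three functions would ship as separate deployments, with the application updated to read and write the new column between backfill and contract. They run back to back here only to keep the sketch short.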
The Dual-Write Trap
Many teams attempt dual-writing: updating both old and new structures simultaneously during migration. This is harder than it sounds. Keeping two representations consistent, handling failures mid-write, and managing the transition window all create subtle bugs that often surface only under load.
If you must dual-write, treat it as a distributed systems problem. Use transactions where both structures live in the same database, idempotent operations where they do not, and always have a reconciliation mechanism that compares the two representations and flags any drift.
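As a rough sketch of that advice, assume both representations live in one database, so a single transaction can cover both writes. The table names (contacts_old, contacts_new) and function names below are hypothetical, and sqlite3 stands in for whatever database is actually in use.

```python
import sqlite3


def setup():
    # Hypothetical old and new representations of the same data.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE contacts_old (user_id INTEGER PRIMARY KEY, email TEXT)")
    conn.execute(
        "CREATE TABLE contacts_new (user_id INTEGER PRIMARY KEY, email TEXT NOT NULL)"
    )
    return conn


def dual_write(conn, user_id, email):
    # One transaction covers both writes: either both rows change or neither does.
    # INSERT OR REPLACE keyed on user_id makes a retried call idempotent.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO contacts_old (user_id, email) VALUES (?, ?)",
            (user_id, email),
        )
        conn.execute(
            "INSERT OR REPLACE INTO contacts_new (user_id, email) VALUES (?, ?)",
            (user_id, email),
        )


def find_mismatches(conn):
    # Reconciliation: user_ids whose rows disagree or exist on only one side.
    rows = conn.execute(
        "SELECT o.user_id FROM contacts_old o "
        "LEFT JOIN contacts_new n ON n.user_id = o.user_id "
        "WHERE n.user_id IS NULL OR n.email IS NOT o.email "
        "UNION "
        "SELECT n.user_id FROM contacts_new n "
        "LEFT JOIN contacts_old o ON o.user_id = n.user_id "
        "WHERE o.user_id IS NULL "
        "ORDER BY 1"
    ).fetchall()
    return [row[0] for row in rows]
```

When the two structures sit in different systems, the single transaction is no longer available. Idempotent writes plus a scheduled run of something like find_mismatches become the main safety net.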
Index Migrations
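Adding an index to a large table can block writes to that table for minutes or hours. Use concurrent (online) index creation where your database supports it. For databases that don't, and where the replication setup lets a replica's schema differ from the primary's, build the index on a replica and then promote that replica.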
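What "concurrent index creation" looks like depends on the database. The article names none, so the helper below assumes PostgreSQL and MySQL (InnoDB) as two common examples. It only builds the statement text; the function name and the idx_ naming scheme are hypothetical.

```python
def concurrent_index_sql(dialect, table, column):
    # Reject anything that is not a plain identifier, since the names are
    # interpolated into SQL text rather than passed as parameters.
    for name in (table, column):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")

    index = f"idx_{table}_{column}"
    if dialect == "postgresql":
        # Must be executed outside a transaction block.
        return f"CREATE INDEX CONCURRENTLY IF NOT EXISTS {index} ON {table} ({column})"
    if dialect == "mysql":
        # Asks InnoDB for an in-place build that still permits reads and writes.
        return (
            f"ALTER TABLE {table} ADD INDEX {index} ({column}), "
            "ALGORITHM=INPLACE, LOCK=NONE"
        )
    raise ValueError(f"no online index syntax known for {dialect!r}")
```

On PostgreSQL, a concurrent build that fails leaves behind an invalid index, which has to be dropped before retrying. A migration script should check for that case instead of assuming success.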
Every migration is a risk. Minimize risk through small, reversible steps and comprehensive rollback plans.