Standard MySQL replication is asynchronous — the primary commits a transaction and immediately moves on; replicas catch up in their own time. That gap is usually milliseconds, but it's real, and if the primary crashes before a replica applies the latest transactions, those writes are gone. For most workloads this is an acceptable trade-off. For some, it isn't.

Galera Cluster takes a fundamentally different approach: every transaction is replicated to and certified by all nodes before the commit is acknowledged to the client. There is no designated primary, no window of lost writes on failure, and no single point of failure.

How Galera Works

Galera uses a protocol called Write-Set Replication (wsrep). When a transaction commits on any node, Galera:

  1. Packages the transaction's write-set (the actual row changes, not the SQL statements).
  2. Broadcasts it to all other nodes simultaneously.
  3. Runs certification on each node — checking whether the write-set conflicts with any concurrent transactions.
  4. Certification is deterministic, so every node independently reaches the same verdict. If the write-set passes, every node applies and commits it; if it conflicts, the transaction is rolled back on the originating node.

This means every node holds a complete, consistent copy of the data. A node can fail at any moment — the remaining nodes continue operating with no failover step. When the failed node comes back, it rejoins and receives the missed transactions via an Incremental State Transfer (IST), or a full State Snapshot Transfer (SST) if the missed write-sets have already aged out of the donor's gcache.
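The certification step above can be sketched as a deterministic conflict check: each node keeps a window of recently certified write-sets and rejects any incoming write-set that overlaps one certified after the incoming transaction's snapshot. A simplified illustration — the class, field names, and key format below are inventions of this sketch, not Galera's internals:

```python
from dataclasses import dataclass

@dataclass
class WriteSet:
    # Hypothetical simplification of a Galera write-set: the row keys it
    # modifies, the highest seqno applied when the transaction started,
    # and the global seqno assigned when it was broadcast to the group.
    keys: frozenset
    last_seen_seqno: int
    seqno: int

def certify(incoming: WriteSet, certified: list) -> bool:
    """Deterministic certification: fail if any already-certified write-set
    with a seqno after our snapshot touches the same keys."""
    for ws in certified:
        if ws.seqno > incoming.last_seen_seqno and ws.keys & incoming.keys:
            return False  # conflict -> originating node rolls back
    certified.append(incoming)
    return True  # every node commits

# Two concurrent transactions touching the same row: the second one fails.
window = []
t1 = WriteSet(frozenset({"users:42"}), last_seen_seqno=0, seqno=1)
t2 = WriteSet(frozenset({"users:42"}), last_seen_seqno=0, seqno=2)
print(certify(t1, window))  # True
print(certify(t2, window))  # False — t1 certified after t2's snapshot
```

Because the check depends only on the totally ordered stream of write-sets, every node runs it independently and reaches the same answer — no extra network round trip is needed for the verdict itself.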

Galera vs. Async Replication

The choice comes down to your tolerance for replication lag and your write pattern. Async replication gives you cheap, fast commits but a window of potential data loss on primary failure; Galera eliminates that window at the cost of slower writes.

Galera's synchronous commit adds a group round trip to every write. On a LAN this is typically 1–2ms. Across regions it becomes 50–150ms per transaction — often unacceptable.
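The latency cost bites hardest on hot rows: conflicting writes to the same row must serialize, so the per-commit round trip caps single-row throughput. A back-of-the-envelope sketch (the 0.5ms local-commit figure is an assumption for illustration):

```python
def max_serialized_commits_per_sec(rtt_ms: float, local_commit_ms: float = 0.5) -> float:
    # Each commit on a hot row must wait for replication + certification
    # (one group round trip) plus local commit work, one after another.
    return 1000.0 / (rtt_ms + local_commit_ms)

print(round(max_serialized_commits_per_sec(1.5)))  # LAN: 500
print(round(max_serialized_commits_per_sec(100)))  # cross-region: 10
```

Independent rows can still commit concurrently; the ceiling applies per contended row, which is why write pattern matters as much as raw write volume.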

Setting Up a 3-Node Cluster

Use Percona XtraDB Cluster (PXC) — it's the most production-ready Galera implementation and has excellent tooling. These commands assume the Percona package repository is already configured on each node:

# On all three nodes (Debian/Ubuntu)
apt install -y percona-xtradb-cluster

# On all three nodes (RHEL/AlmaLinux)
dnf install -y percona-xtradb-cluster

Configure /etc/mysql/mysql.conf.d/mysqld.cnf on each node (adjust IPs and node name):

[mysqld]
# Galera settings
wsrep_on                   = ON
wsrep_provider             = /usr/lib/galera4/libgalera_smm.so
wsrep_cluster_name         = "prod_cluster"
wsrep_cluster_address      = "gcomm://10.0.1.10,10.0.1.11,10.0.1.12"
wsrep_node_address         = "10.0.1.10"   # this node's IP
wsrep_node_name            = "node1"
wsrep_sst_method           = xtrabackup-v2

# Required settings
binlog_format              = ROW
default_storage_engine     = InnoDB
innodb_autoinc_lock_mode   = 2

Bootstrap the cluster from the first node only:

# On node1 only — starts a new cluster
systemctl start mysql@bootstrap.service

# On node2 and node3 — join the existing cluster
systemctl start mysql

Once all three nodes are running, stop the bootstrap service and restart normally on node1:

# On node1
systemctl stop mysql@bootstrap.service
systemctl start mysql

Monitoring Cluster Health

Connect to any node and check the wsrep status variables:

SHOW GLOBAL STATUS LIKE 'wsrep_%';

-- The ones to watch:
-- wsrep_cluster_size      — should equal the number of nodes you expect
-- wsrep_cluster_status    — must be "Primary"
-- wsrep_connected         — must be "ON"
-- wsrep_ready             — must be "ON"
-- wsrep_local_recv_queue  — if consistently > 0, this node is falling behind

A node that shows wsrep_cluster_status = Non-Primary has lost quorum — it's isolated from the rest of the cluster and will refuse writes. This is the safety mechanism that prevents split-brain.
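These checks are easy to script for alerting. A minimal sketch of a health probe over the variables above — the function name and thresholds are this sketch's own choices, and fetching the variables (e.g. running SHOW GLOBAL STATUS LIKE 'wsrep_%' through your MySQL driver) is left out:

```python
def cluster_healthy(status: dict, expected_size: int = 3) -> list:
    """Return a list of problems; an empty list means this node looks healthy.
    `status` maps wsrep variable names to their string values, as returned
    by SHOW GLOBAL STATUS LIKE 'wsrep_%'."""
    problems = []
    if int(status.get("wsrep_cluster_size", 0)) != expected_size:
        problems.append("unexpected cluster size")
    if status.get("wsrep_cluster_status") != "Primary":
        problems.append("node is not in the primary component")
    if status.get("wsrep_connected") != "ON":
        problems.append("not connected to the group")
    if status.get("wsrep_ready") != "ON":
        problems.append("node is refusing queries")
    if int(status.get("wsrep_local_recv_queue", 0)) > 0:
        problems.append("apply queue backing up — node falling behind")
    return problems

print(cluster_healthy({
    "wsrep_cluster_size": "3",
    "wsrep_cluster_status": "Primary",
    "wsrep_connected": "ON",
    "wsrep_ready": "ON",
    "wsrep_local_recv_queue": "0",
}))  # → []
```

Run it against each node, not just one: a node in a Non-Primary partition will happily answer status queries while rejecting writes.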

Limitations and Gotchas

No MyISAM support. Galera replicates only InnoDB tables; by default, writes to MyISAM tables stay local to the node that received them.

Tables need a primary key. Galera's row-based replication requires every table to have a primary key for certification to work correctly. Tables without one will cause errors or silent data inconsistencies.

Write conflicts and rollbacks. If two nodes concurrently modify the same row, one transaction will be rolled back. Your application must handle deadlock errors (error 1213) by retrying. This is rare in practice but needs to be accounted for.
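In application code the retry is a small wrapper. A sketch, assuming a DB-API-style driver whose deadlock exception exposes MySQL's error code — the exception class and attribute name below are placeholders to adapt to your driver:

```python
import time

class DeadlockError(Exception):
    """Stand-in for your driver's exception carrying MySQL errno 1213."""
    errno = 1213

def run_with_retry(txn, retries=3, backoff=0.05):
    """Run `txn` (a callable that executes and commits one transaction),
    retrying on certification-conflict rollbacks (errno 1213)."""
    for attempt in range(retries):
        try:
            return txn()
        except DeadlockError as e:
            # Re-raise anything that isn't a deadlock, or the final failure.
            if getattr(e, "errno", None) != 1213 or attempt == retries - 1:
                raise
            time.sleep(backoff * (attempt + 1))  # brief linear backoff

# Simulated transaction that conflicts twice, then succeeds:
attempts = {"n": 0}
def txn():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise DeadlockError("deadlock found when trying to get lock")
    return "committed"

print(run_with_retry(txn))  # → committed
```

The callable must re-run the whole transaction, reads included — retrying only the final statement can commit decisions made against stale data.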

SST and the donor. When a new or rejoining node performs a full State Snapshot Transfer using xtrabackup-v2, the donor keeps serving traffic for nearly the whole transfer, taking only a brief lock (a backup lock, or FLUSH TABLES WITH READ LOCK on older versions) near the end of the backup. With the mysqldump SST method, the donor is blocked for the full duration. Either way, designate a dedicated SST donor (wsrep_sst_donor) to keep the load and the lock off your primary writer.
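Pinning the donor is a one-line setting on the joining node's config — the node name here matches the earlier example, and picking node3 as the dedicated donor is this example's choice:

```ini
[mysqld]
# Prefer node3 as SST/IST donor; the trailing comma lets Galera
# fall back to any other node if node3 is unavailable.
wsrep_sst_donor = "node3,"
```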

Flow control. If one node falls behind, Galera pauses writes on all nodes to let it catch up. Monitor wsrep_flow_control_paused — sustained values above 0 indicate a node that's struggling.

Conclusion

Galera Cluster is the right tool when you need synchronous replication, multi-master writes, and automatic node failover without an external failover manager. The trade-offs — write latency tied to network round-trip, conflict rollbacks, SST complexity — are real but manageable. Pair it with ProxySQL for connection routing and you have a highly available MySQL setup that rides out node failures without manual intervention.


Running Galera in production and hitting a specific issue? Get in touch.