Event Sourcing · Software Development

Cross DataCenter replication with EventStore

Note: Replication across clusters and nodes is available with Linker 

After almost 4 month after I joined the EventStore team I’m going to present my first big egg: GeoReplica. This feature has been in the air for a while and it was referred as Multi Master replication.

The GeoReplica is distributed as a fully supported closed source plugin.

At the moment that I’m writing we are in the process to migrate the EventStore codebase to .Net Core and .Net Standard. The final production ready GeoReplica will be integrated with the new codebase.
While we are waiting for the completion of the migration, GeoReplica is available as a beta plugin using the legacy EventStore codebase. If you have a commercial license you can already ask support to get a beta version and use it in your dev or test environments.

The plot

As a brief description, GeoReplica allow to replica all the events or only a subset to a remote EventStore. Remote mean that it is a separate cluster or single node. The difference between having a Non Promotable Clone or use GeoReplica is that the NPC allow to keep a warm copy “within” a cluster whether with GeoReplica you keep a warm copy “between” clusters.

Before start describing the settings and the steps to configure let me introduce few of the main Scenarios where GeoReplica can be used.

  1. Aggregate the data from one or more clusters or nodes to a single EventStore for reporting or archiving

    Cross Datacenter Replication for reporting or archiving
    One standalone ES pull the data from separate clusters and/or stand alone nodes
  2. Distribute the data to other clusters or single nodes to scale reads (Active/Passive)

    Cross Datacenter Replication Active/Passive
    Cross Datacenter Replication for scaling reads. There is one EventStore used for writes and a separate EventStore used for reads
  3. Distribute the data to other clusters or single nodes to scale reads and writes (Active/Active)

    Cross Datacenter Replication Active/Active
    Cross Datacenter Replication for scaling reads. Writes and Reads can be done on both EventStores. They can Push to each other or you can configure Push and Pull from just one of them

To familiarise with GeoReplica I would like to introduce the different replica models and the terms that we will use when configure the feature.

As a general rule, it is possible to combine different replica models in one LocalInstance. As an example you can configure an Active/Active GeoReplica in one EventStore where there will be a destination and an origin pointing to the same remote EventStore. You can see here how this way can be helpful to avoid touching existing EventStore and just configure the feature on separate new nodes.

Push model

You send the events from your local instance to one or more configured destinations

Pull model

You pull the events from one or more origins and copy the events in your local instance.

The actors

LocalInstance

All the required information to identify the instance that you are configuring with GeoReplica

Destinations

When the Push replica model is configured, we can set the required configuration for one or more EventStore to send the events.

Origins

When the Pull replica model is configured we can set up one or more EventStores to pull the events from and save them in the LocalInstance.

Filters

There is a filtering feature that allow to specify include or exclude filters based on StreamId and StreamType. It allows to specify wildcards