Event Store Blog

Read-only replicas

Written by Hayley Campbell | Dec 16, 2020 2:10:59 PM

In version 20.6 of EventStoreDB, we provided the option to designate a node as a read-only replica.

A read-only replica is a node that cannot participate in elections or write to the cluster. It exists only to replicate data from the cluster and to serve reads to clients, so you can scale out reads without affecting cluster performance.

Why use read-only replicas?

Read-only replicas allow you to scale out reads on a cluster without affecting performance. Read-heavy operations like subscriptions to $all or reading through large streams can cause a heavy load on a node, particularly if it's an already-busy leader node.

We usually recommend performing reads on a follower node rather than on the leader, to reduce the load on the leader. This approach works, but a client can still end up subscribed to a follower that is later promoted to leader. In addition, a heavy read load on a follower can affect that node's performance and slow down cluster replication.

Since read-only replicas aren’t members of the cluster, they're guaranteed not to be promoted to a leader or to take part in the quorum.

Additionally, you can read from a read-only replica even when there is no leader available. This means that reads from a read-only replica won't be disrupted by node loss or elections.

Other possible uses for these nodes include using them as an additional data backup or as a reporting node.

Why not clones?

Read-only replicas replace clone nodes as our recommendation for scaling out reads. 

In fact, clone nodes are deprecated as of 20.6.0, and one must explicitly enable the clone node feature in the cluster to use clones.

Why is this? The main reason is that clone nodes are promotable. As a consequence, they can get promoted to a follower or leader node in an election.

If a network partition occurs in a cluster with a clone node, there is a high possibility of two quorums forming and resulting in a split-brain scenario with two leader nodes. Such a situation can cause issues and even data loss when the partition is restored.

Read-only replicas cannot be promoted and cannot trigger or participate in elections, and therefore do not have these same issues.

When a read-only replica comes online, it polls the gossip endpoints until it identifies a leader node, and then subscribes to it. This makes read-only replicas much more stable than clones and minimizes their impact on the cluster as a whole.
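The startup behaviour described above can be pictured as a simple polling loop. The following is only an illustrative sketch, not EventStoreDB's actual implementation: the `Member` record, `FindLeader`, and `WaitForLeaderAsync` are hypothetical names, the `"Leader"` state string is an assumption about how a node reports its role, and the gossip lookup is passed in as a delegate so that the sketch stays self-contained.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, simplified view of one entry in a gossip response.
public record Member(string Host, int Port, string State, bool IsAlive);

public static class LeaderDiscovery
{
    // Returns the first live member that reports itself as leader, or null if there is none.
    public static Member? FindLeader(IEnumerable<Member> members) =>
        members.FirstOrDefault(m => m.IsAlive && m.State == "Leader");

    // Keeps asking the supplied gossip source for members until a leader shows up.
    public static async Task<Member> WaitForLeaderAsync(
        Func<Task<IReadOnlyList<Member>>> readGossip,
        TimeSpan pollInterval,
        CancellationToken cancellationToken = default)
    {
        while (true)
        {
            var leader = FindLeader(await readGossip());
            if (leader is not null)
            {
                return leader; // the replica would now subscribe to this node
            }

            // No leader yet (for example, an election is in progress): wait and retry.
            await Task.Delay(pollInterval, cancellationToken);
        }
    }
}
```

Because the replica only ever looks for a leader and never offers itself as a candidate, a loop of this shape cannot influence the outcome of an election.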

Are there any downsides?

Before deciding whether read-only replicas are right for your use case, there are some things that you need to take into consideration.

As with any non-leader node, a read-only replica may lag behind the cluster. It may also lose its connection to the cluster and stop receiving updates. In these cases you'll still be able to read from the node, but the data may be stale.
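One way to notice staleness is to compare how far the leader and the replica have each written into their logs. The sketch below assumes that you can obtain a writer checkpoint (a position in the log) for both nodes, for example from the cluster's gossip information. The `ReplicaLag` class, its method names, and the idea of a tolerated threshold are all illustrative assumptions rather than part of EventStoreDB or its clients.

```csharp
using System;

public static class ReplicaLag
{
    // How far the replica's log position trails the leader's; never negative.
    public static long PositionsBehind(long leaderWriterCheckpoint, long replicaWriterCheckpoint) =>
        Math.Max(0, leaderWriterCheckpoint - replicaWriterCheckpoint);

    // True when the replica trails the leader by more than the caller is willing to tolerate.
    public static bool IsStale(long leaderWriterCheckpoint, long replicaWriterCheckpoint, long tolerated) =>
        PositionsBehind(leaderWriterCheckpoint, replicaWriterCheckpoint) > tolerated;
}
```

A monitoring job could call `IsStale` periodically and raise an alert, or route reads elsewhere, when the replica falls too far behind.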

The replica will also be subscribed to the leader node to replicate the data from the cluster. This will result in more network traffic on the leader, meaning that there is a limit to the number of replicas a cluster can support before causing performance issues.

It also means that if you only need to read a small amount of data from the cluster, it would be better to read directly from the leader or a follower node.

You also need to be aware that read-only replicas still need to be maintained like other nodes. You will still need to perform scavenges and index merges on the replicas, as you would with any other node.
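As a rough illustration of that maintenance, a scavenge can be started on an individual node through its HTTP API by sending a `POST` to `/admin/scavenge` with suitably privileged credentials. The helper below only builds such a request. The `ScavengeRequests` class and its `Build` method are hypothetical names, and the address and credentials you pass in depend on your own deployment.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;

public static class ScavengeRequests
{
    // Builds a POST request that asks the node at nodeBaseAddress to start a scavenge.
    public static HttpRequestMessage Build(Uri nodeBaseAddress, string user, string password)
    {
        var request = new HttpRequestMessage(
            HttpMethod.Post,
            new Uri(nodeBaseAddress, "/admin/scavenge"));

        // HTTP Basic authentication: base64 of "user:password".
        var token = Convert.ToBase64String(Encoding.UTF8.GetBytes($"{user}:{password}"));
        request.Headers.Authorization = new AuthenticationHeaderValue("Basic", token);
        return request;
    }
}
```

You would send the result with an `HttpClient`, pointing `nodeBaseAddress` at the replica itself rather than at the leader, since each node scavenges its own copy of the data.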

Configuring a read-only replica

Setting a node to act as a read-only replica is as simple as setting the `ReadOnlyReplica` option on the node to `true`.

As an example, you can add a read-only replica node to the Docker Compose file provided in the documentation by updating it as follows:

version: '3.5'

services:
  setup:
    image: eventstore/es-gencert-cli:1.0.2
    entrypoint: bash
    user: "1000:1000"
    command: >
      -c "mkdir -p ./certs && cd /certs
      && es-gencert-cli create-ca
      && es-gencert-cli create-node -out ./node1 --dns-names node1.eventstore
      && es-gencert-cli create-node -out ./node2 --dns-names node2.eventstore
      && es-gencert-cli create-node -out ./node3 --dns-names node3.eventstore
      && es-gencert-cli create-node -out ./node4 --dns-names node4.eventstore
      && find . -type f -print0 | xargs -0 chmod 666"
    container_name: setup
    volumes:
      - ./certs:/certs

  node1.eventstore: &template
    image: eventstore/eventstore:20.6.1-buster-slim
    container_name: node1.eventstore
    env_file:
      - vars.env
    environment:
      - EVENTSTORE_EXT_HOST_ADVERTISE_AS=node1.eventstore
      - EVENTSTORE_INT_HOST_ADVERTISE_AS=node1.eventstore
      - EVENTSTORE_GOSSIP_SEED=node2.eventstore:2113,node3.eventstore:2113
      - EVENTSTORE_CERTIFICATE_FILE=/certs/node1/node.crt
      - EVENTSTORE_CERTIFICATE_PRIVATE_KEY_FILE=/certs/node1/node.key
      - EVENTSTORE_ADVERTISE_HTTP_PORT_TO_CLIENT_AS=2111
      - EVENTSTORE_ADVERTISE_TCP_PORT_TO_CLIENT_AS=1111
    healthcheck:
      test:
        [
            'CMD-SHELL',
            'curl --fail --insecure https://node1.eventstore:2113/health/live || exit 1',
        ]
      interval: 5s
      timeout: 5s
      retries: 24
    ports:
      - 1111:1113
      - 2111:2113
    volumes:
      - ./certs:/certs
    depends_on:
      - setup
    restart: always

  node2.eventstore:
    <<: *template
    container_name: node2.eventstore
    environment:
      - EVENTSTORE_EXT_HOST_ADVERTISE_AS=node2.eventstore
      - EVENTSTORE_INT_HOST_ADVERTISE_AS=node2.eventstore
      - EVENTSTORE_GOSSIP_SEED=node1.eventstore:2113,node3.eventstore:2113
      - EVENTSTORE_CERTIFICATE_FILE=/certs/node2/node.crt
      - EVENTSTORE_CERTIFICATE_PRIVATE_KEY_FILE=/certs/node2/node.key
      - EVENTSTORE_ADVERTISE_HTTP_PORT_TO_CLIENT_AS=2112
      - EVENTSTORE_ADVERTISE_TCP_PORT_TO_CLIENT_AS=1112
    healthcheck:
      test:
        [
            'CMD-SHELL',
            'curl --fail --insecure https://node2.eventstore:2113/health/live || exit 1',
        ]
      interval: 5s
      timeout: 5s
      retries: 24
    ports:
      - 1112:1113
      - 2112:2113

  node3.eventstore:
    <<: *template
    container_name: node3.eventstore
    environment:
      - EVENTSTORE_EXT_HOST_ADVERTISE_AS=node3.eventstore
      - EVENTSTORE_INT_HOST_ADVERTISE_AS=node3.eventstore
      - EVENTSTORE_GOSSIP_SEED=node1.eventstore:2113,node2.eventstore:2113
      - EVENTSTORE_CERTIFICATE_FILE=/certs/node3/node.crt
      - EVENTSTORE_CERTIFICATE_PRIVATE_KEY_FILE=/certs/node3/node.key
      - EVENTSTORE_ADVERTISE_HTTP_PORT_TO_CLIENT_AS=2113
      - EVENTSTORE_ADVERTISE_TCP_PORT_TO_CLIENT_AS=1113
    healthcheck:
      test:
        [
            'CMD-SHELL',
            'curl --fail --insecure https://node3.eventstore:2113/health/live || exit 1',
        ]
      interval: 5s
      timeout: 5s
      retries: 24
    ports:
      - 1113:1113
      - 2113:2113

  node4.eventstore:
    <<: *template
    container_name: node4.eventstore
    environment:
      - EVENTSTORE_EXT_HOST_ADVERTISE_AS=node4.eventstore
      - EVENTSTORE_INT_HOST_ADVERTISE_AS=node4.eventstore
      - EVENTSTORE_GOSSIP_SEED=node1.eventstore:2113,node2.eventstore:2113,node3.eventstore:2113
      - EVENTSTORE_CERTIFICATE_FILE=/certs/node4/node.crt
      - EVENTSTORE_CERTIFICATE_PRIVATE_KEY_FILE=/certs/node4/node.key
      - EVENTSTORE_ADVERTISE_HTTP_PORT_TO_CLIENT_AS=2114
      - EVENTSTORE_ADVERTISE_TCP_PORT_TO_CLIENT_AS=1114
      - EVENTSTORE_READ_ONLY_REPLICA=true
    healthcheck:
      test:
        [
            'CMD-SHELL',
            'curl --fail --insecure https://node4.eventstore:2113/health/live || exit 1',
        ]
      interval: 5s
      timeout: 5s
      retries: 24
    ports:
      - 1114:1113
      - 2114:2113

Connecting to a read-only replica

Options to connect to a read-only replica are available to both the gRPC and .NET TCP clients.

Using the Docker example above, you can connect to the read-only replica using the .NET gRPC client with the following:

var connectionString =
    "esdb://admin:changeit@localhost:2111,localhost:2112,localhost:2113,localhost:2114?" +
    "tlsVerifyCert=false&" +
    "nodePreference=readonlyreplica";
var client = new EventStoreClient(EventStoreClientSettings.Create(connectionString));

await client.SubscribeToAllAsync((subscription, evnt, cancellationToken) =>
{
    Console.WriteLine(
        $"Received event: {evnt.OriginalEventNumber}@{evnt.OriginalStreamId} - {evnt.OriginalPosition}");
    return Task.CompletedTask;
});

And to connect to a read-only replica in the .NET TCP client:

var connectionSettings = ConnectionSettings.Create()
    .SetGossipSeedEndPoints(
        new IPEndPoint(IPAddress.Loopback, 2111),
        new IPEndPoint(IPAddress.Loopback, 2112),
        new IPEndPoint(IPAddress.Loopback, 2113),
        new IPEndPoint(IPAddress.Loopback, 2114))
    .DisableServerCertificateValidation()
    .UseCustomHttpMessageHandler(new HttpClientHandler {
        ServerCertificateCustomValidationCallback = delegate { return true; }
    })
    .SetDefaultUserCredentials(new UserCredentials("admin", "changeit"))
    .PreferReadOnlyReplica();

var connection = EventStoreConnection.Create(connectionSettings);
await connection.ConnectAsync();

await connection.SubscribeToAllAsync(false, (subscription, evnt) =>
{
    Console.WriteLine(
        $"Received event: {evnt.OriginalEventNumber}@{evnt.OriginalStreamId} - {evnt.OriginalPosition}");
    // The event handler must return a Task.
    return Task.CompletedTask;
});

Conclusion

Read-only replicas are a convenient and safe way to scale out reads on your cluster.

They have many uses, and we hope that you'll enjoy this new feature.