Event Sourcing · Software Architecture

Kafka vs Event Store on scaling processing in a messaging architecture

This post is a condensed excerpt that I use for a tech talk about scaling distributed systems.

Kafka is a message broker and Event Store is a database for streams of events.

The roles of these tools have been stretched beyond their original purpose, not always with good results. One is a good, simple ingestion tool, while the other is transactional storage for event-sourced data. Each can cover the other's role, but not always well. They can also be integrated to get the best out of each.

My order management component needs to persist data for orders as streams of events. Is it better to keep the events of all aggregates in a single stream/partition, or to spread them across one partition/stream per aggregate?

Kafka

Kafka was not built for persisting aggregate data. Special configuration is needed to make data durable, as messages are deleted after 7 days by default (the retention period). It works better for ingesting data and scaling its processing.

A topic is split into one or more partitions hosted on different brokers/machines, and each partition can be replicated across brokers. Each partition is backed by a folder created on the broker's mounted volume.

If you are streaming millions of orders for a multitude of customers, creating one partition per order would hurt your infrastructure and performance. It is better to group many orders per partition in some way, for example by customer.
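As a minimal sketch of this kind of setup, assuming the standard Java AdminClient and a local broker (the topic name, partition count and replication factor are made up for illustration), a topic can be created with a fixed number of partitions and retention disabled, so order events are not deleted after the default 7 days:

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class CreateOrdersTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker

        try (AdminClient admin = AdminClient.create(props)) {
            // 12 partitions spread over the brokers, replication factor 3 (illustrative numbers).
            NewTopic orders = new NewTopic("orders", 12, (short) 3);
            // Keep messages forever instead of the default 7 days.
            orders.configs(Map.of("retention.ms", "-1"));
            admin.createTopics(Collections.singletonList(orders)).all().get();
        }
    }
}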

Event Store

Event Store was built for that purpose from day one. It persists events on disk without any special configuration. It can host millions of streams, and a stream does not add any operational burden on your file system, so you can store the events of each order in a dedicated stream such as 'order-xxx'.
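As a rough sketch, assuming the gRPC Java client (com.eventstore.dbclient) against a local insecure node and a made-up payload (exact builder and method names can differ between client versions), appending an event to a per-order stream could look like this:

import com.eventstore.dbclient.EventData;
import com.eventstore.dbclient.EventStoreDBClient;
import com.eventstore.dbclient.EventStoreDBClientSettings;
import com.eventstore.dbclient.EventStoreDBConnectionString;

import java.util.Map;

public class AppendOrderEvent {
    public static void main(String[] args) throws Exception {
        // Assumed local, insecure single node; adjust the connection string for your setup.
        EventStoreDBClientSettings settings =
                EventStoreDBConnectionString.parseOrThrow("esdb://localhost:2113?tls=false");
        EventStoreDBClient client = EventStoreDBClient.create(settings);

        // Illustrative payload; in a real component this would be your domain event.
        Map<String, Object> payload = Map.of("orderId", "123", "total", 42.0);
        EventData event = EventData.builderAsJson("OrderPlaced", payload).build();

        // One stream per aggregate: the stream name carries the order id.
        client.appendToStream("order-123", event).get();
    }
}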

An Event Store database can be hosted on one machine as a single instance, or replicated across a cluster for high availability.

Can I scale the processing while clients are sending data?

Kafka

With Kafka, clients/publishers are responsible for publishing messages to one or more partitions, depending on the partitioning strategy. The DefaultPartitioner lets the client use the message key (for example customerId) so that related messages are assigned to the same partition, or spreads messages randomly over the available partitions when no key is given.
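A minimal producer sketch, assuming the standard Java client and the 'orders' topic from above (the customerId key is illustrative): keying each message by customer makes the default partitioner send all of that customer's orders to the same partition.

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class OrderProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String customerId = "customer-42";                        // key: groups a customer's orders
            String orderJson = "{\"orderId\":\"123\",\"total\":42.0}"; // value: the order event
            // The default partitioner hashes the key, so the same customer
            // always lands on the same partition (and stays ordered there).
            producer.send(new ProducerRecord<>("orders", customerId, orderJson));
            producer.flush();
        }
    }
}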

On the consuming side, there can be only one consumer per partition within a consumer group. Each consumer keeps its own offset, the position of the last batch of messages read from the topic/partition.

You scale by creating more partitions hosted on separate brokers, with one consumer per partition processing messages in parallel (and, within each partition, in order). This pattern is not Competing Consumers, since there is only one consumer per partition; there is no competition there.
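A minimal consumer sketch, again assuming the Java client: every instance started with the same group.id is assigned a subset of the partitions, so parallelism is capped by the number of partitions.

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class OrderConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "order-processors");   // same group => partitions are shared out
        props.put("enable.auto.commit", "false");    // we commit the offset ourselves
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d key=%s%n",
                            record.partition(), record.offset(), record.key());
                }
                consumer.commitSync(); // store the position (offset) for this group
            }
        }
    }
}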

In Kafka, processing can be scaled by adding more partitions per topic


Event Store

There are different subscription models available for processing. When the order of processing is a constraint that cannot be violated, the Catch-Up subscription works pretty much like a Kafka consumer: each subscriber needs to keep its own position. Events can come from all streams, from a single stream, or from a category of streams. This model does not scale out, though: there is no built-in way to spread a stream over multiple instances as there is with Kafka partitions.
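A rough catch-up subscription sketch, assuming the gRPC Java client (com.eventstore.dbclient; exact names may differ between client versions). The subscriber reads the stream from the start and is responsible for remembering its own position:

import com.eventstore.dbclient.EventStoreDBClient;
import com.eventstore.dbclient.EventStoreDBClientSettings;
import com.eventstore.dbclient.EventStoreDBConnectionString;
import com.eventstore.dbclient.ResolvedEvent;
import com.eventstore.dbclient.SubscribeToStreamOptions;
import com.eventstore.dbclient.Subscription;
import com.eventstore.dbclient.SubscriptionListener;

public class OrderCatchUpSubscriber {
    public static void main(String[] args) throws Exception {
        EventStoreDBClientSettings settings =
                EventStoreDBConnectionString.parseOrThrow("esdb://localhost:2113?tls=false");
        EventStoreDBClient client = EventStoreDBClient.create(settings);

        client.subscribeToStream(
                "order-123",
                new SubscriptionListener() {
                    @Override
                    public void onEvent(Subscription subscription, ResolvedEvent event) {
                        // Process in order, then persist your own checkpoint
                        // (for example the event's stream revision) somewhere durable.
                        System.out.println("got " + event.getOriginalEvent().getEventType());
                    }
                },
                SubscribeToStreamOptions.get().fromStart());

        Thread.currentThread().join(); // keep the subscription alive
    }
}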

The other subscription model is Persistent Subscriptions. It lets you create groups of consumers receiving events from a stream or a category of streams. The server delivers each event to the group at least once, and consumers process events in parallel. If a consumer does not acknowledge an event within a timeout period, the event is re-delivered to one of the other consumers. The position is maintained on the server per group, not per consumer as in Kafka. The server is a single logical endpoint; it is not partitioned over multiple brokers. You scale the processing by adding consumers to the group, and you can spread consumer microservices over different nodes. This allows events from the same stream or category of streams to be processed in parallel: the Competing Consumers pattern.

The default delivery strategy is round robin over the members of each group. Used this way, you can struggle with temporal coupling if you need to process events in order. There are techniques to deal with this: for example, a consumer can park a message in a special stream to be reprocessed later. If that is your case, you can also change the delivery strategy to Pinned. With this strategy, a built-in projection such as $by_category or $by_correlation_id is used to send related events to the same consumer, as long as it has resources available for processing.
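A rough competing-consumers sketch, assuming the persistent subscriptions API of the gRPC Java client; class names and listener signatures vary between client versions, and the '$ce-order' category stream and 'order-processors' group are made up for illustration (the group is assumed to be created beforehand and the $by_category projection enabled):

import com.eventstore.dbclient.EventStoreDBClientSettings;
import com.eventstore.dbclient.EventStoreDBConnectionString;
import com.eventstore.dbclient.EventStoreDBPersistentSubscriptionsClient;
import com.eventstore.dbclient.PersistentSubscription;
import com.eventstore.dbclient.PersistentSubscriptionListener;
import com.eventstore.dbclient.ResolvedEvent;

public class CompetingOrderConsumer {
    public static void main(String[] args) throws Exception {
        EventStoreDBClientSettings settings =
                EventStoreDBConnectionString.parseOrThrow("esdb://localhost:2113?tls=false");
        EventStoreDBPersistentSubscriptionsClient client =
                EventStoreDBPersistentSubscriptionsClient.create(settings);

        // Every running instance of this class joins the same group and
        // competes for events from the order category stream.
        client.subscribeToStream("$ce-order", "order-processors",
                new PersistentSubscriptionListener() {
                    @Override
                    public void onEvent(PersistentSubscription subscription, int retryCount, ResolvedEvent event) {
                        System.out.println("processing " + event.getOriginalEvent().getEventType());
                        subscription.ack(event); // ack, or the server re-delivers after the timeout
                    }
                }).get();

        Thread.currentThread().join();
    }
}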

In Event Store, processing can be scaled by adding more consumers to a group

Conclusions

Kafka is strong on scaling, as it offers a clear and solid model. It is also available as a service, and there are plenty of tools integrating with it. It can be used as a scalable message bus.

Event Store implements the Competing Consumers pattern. It allows competition/scaling on the processing side without the overhead of partitioning over multiple broker boxes. It does not provide a built-in partitioning/sharding feature, at least not yet. It is worth noting that a custom function can be used for sharding on the client side. I will write an article with a full example as soon as possible.
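To give a rough idea of what such a function could look like before that article (purely illustrative, not a built-in feature): the client can hash the aggregate id onto one of N nodes and open its connection there.

public class ClientSideSharding {
    // Hypothetical helper: picks one of N shards for a given aggregate id.
    static int shardFor(String aggregateId, int shardCount) {
        // Math.floorMod keeps the result positive even for negative hash codes.
        return Math.floorMod(aggregateId.hashCode(), shardCount);
    }

    public static void main(String[] args) {
        // Assumed node endpoints, one Event Store instance per shard.
        String[] nodes = {"esdb://node-a:2113", "esdb://node-b:2113", "esdb://node-c:2113"};
        String orderId = "order-123";
        String target = nodes[shardFor(orderId, nodes.length)];
        System.out.println(orderId + " -> " + target); // the client would connect to 'target'
    }
}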

Event Store is at its best as storage for event-sourced components. It is easy to build and rebuild read models by applying events from all streams, from categories of streams, or through projections.

It is probably time to build a richer set of integration tools around it and make it available as a cloud service.