Event Store and how to avoid Dirty Reads

Event Store is a database engine that stores NoSQL data as streams of events. The engine is based on a well-known distributed algorithm called Paxos.

In brief, when a write is sent to an Event Store cluster, the master writes the events and replicates them to the other nodes in the cluster. The master then waits for a quorum of nodes (cluster_size / 2 + 1, using integer division) to acknowledge the write before considering the write successful.
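
As a small worked example of the quorum rule above, here is a sketch in Python. The function names are my own and are not part of any Event Store API.

```python
def quorum_size(cluster_size: int) -> int:
    """Number of nodes (master included) that must hold a write."""
    if cluster_size < 1:
        raise ValueError("a cluster needs at least one node")
    # Integer division: 3 nodes -> 2, 5 nodes -> 3.
    return cluster_size // 2 + 1


def follower_acks_needed(cluster_size: int) -> int:
    """Acknowledgements the master still needs after its own write."""
    return quorum_size(cluster_size) - 1


if __name__ == "__main__":
    for size in (1, 3, 5):
        print(size, quorum_size(size), follower_acks_needed(size))
```

With 3 nodes the master needs one more acknowledgement; with 5 nodes it needs two.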

In other words, in a cluster of 3 nodes the quorum is 2 nodes. Before the master can send the acknowledgement back to the client, it needs to wait for the write confirmation from at least 1 slave node. For that reason it is possible, in theory, for a client to read from the remaining node before it has been updated with the latest write. This situation is known as a “Dirty Read”.

If you must ensure that a client will always read the latest version of your data then you can:

  • If you use the .NET client API: use the connection setting “PerformOnMasterOnly”
  • If your client doesn’t provide this method, use the connection string setting “require-master=true”

In conclusion, the possibility of getting a Dirty Read depends on the connection settings: whether reads can be served by any node or only by the master.
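
To make the scenario concrete, here is a toy in-memory model in Python. It is only a sketch of the behaviour described above, not Event Store code: `ToyCluster`, `Node` and the `require_master` flag are hypothetical names, and replication is reduced to "copy to followers until the quorum is reached".

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    log: list = field(default_factory=list)


class ToyCluster:
    """Hypothetical model: one master plus followers, quorum-acknowledged writes."""

    def __init__(self, size: int = 3):
        self.master = Node("master")
        self.followers = [Node(f"follower-{i}") for i in range(1, size)]
        self.quorum = size // 2 + 1

    def write(self, event) -> bool:
        # The master writes first and counts as one acknowledgement.
        self.master.log.append(event)
        acks = 1
        for follower in self.followers:
            if acks >= self.quorum:
                break  # the remaining followers have not caught up yet
            follower.log.append(event)
            acks += 1
        return acks >= self.quorum

    def read(self, require_master: bool, preferred: Node) -> list:
        # With require_master the read is always served by the master.
        node = self.master if require_master else preferred
        return list(node.log)


if __name__ == "__main__":
    cluster = ToyCluster(3)
    cluster.write("OrderPlaced")
    lagging = cluster.followers[-1]
    print(cluster.read(require_master=False, preferred=lagging))  # stale
    print(cluster.read(require_master=True, preferred=lagging))   # up to date
```

The write is acknowledged as soon as two of the three nodes have it, so a read served by the third node can miss it, while a read forced onto the master cannot.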

Hope this helps

GDPR, Event Sourcing and immutable events

Data privacy was being addressed long before the internet era. There has never been a lack of legislation. The problem is that the legislation has not been very effective at preventing data breaches or at enforcing the rights of individuals against companies or governments.

Looking at a bit of history, we can see how the US and Europe are diverging in their approach to data privacy.

Gdpr history

In the US, the legislation takes greater care not to stop companies from doing their job. This is probably the reason why the US legislation is made up of several different pieces of law. Data privacy is maintained by a sort of patchwork that often allows companies and government agencies to work around limits and take advantage of exceptions.

In Europe, more attention is paid to the rights of individuals, even when weighed against the need for data of businesses. The main law used to be the Data Protection Directive, which defined a set of principles. Now the GDPR is a directly applicable law that, from the 25th of May 2018, will be the main umbrella protecting individuals and their data.

The GDPR consolidates and better defines the existing protections, and it strengthens the ability of individuals to claim these rights in a legal battle. It is not a technical document. It doesn’t innovate much in terms of techniques to avoid data breaches. In fact, it treats data breaches as something that will happen. What makes the difference is whether or not your company is compliant with the law when the data breach happens.

GDPR Entities

How can a company be compliant with GDPR?

Review your Privacy Policy document and explicitly ask your customers for consent. Under this law, you can only hold personal data that is strictly required, and only for as long as it is required. During the policy review, try to build a map of all the personal data you hold, and remove or anonymize any data that is not strictly required. This work will also help you comply with the right to data portability, as an individual can ask you for a document that aggregates all of his or her data.

Appoint a Data Protection Officer who will be responsible for ensuring that the law is respected.

Define a strategy to handle the cases where an individual claims his or her rights.

The Right to Erasure (Art. 17)

It entitles the data subject to have the data controller erase his/her personal data, cease further dissemination of the data, and potentially have third parties cease processing of the data.

GDPR Right to be forgotten

The Right to Data Portability (Art. 20)

You have the right to receive your personal data from an organization in a commonly used format, so that you can easily share it with another organization.

GDPR Data Portability

The Right to Restriction of Processing (Art. 18)

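You can ask an organization to limit how it uses your personal data, for example while the accuracy of that data is being contested. A closely related right (Art. 22) says that, unless it is necessary by law or a contract, decisions affecting you cannot be made solely on the basis of automated processing.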

In my words, use this right if you want to avoid being tagged on Facebook 😉

GDPR restrict data processing

Your organizational GDPR strategies

  • Create or review your data privacy Policy and ask for Consent
  • If your company is a public authority, systematically monitors individuals on a large scale, or works with sensitive data on a large scale, appoint a Data Protection Officer who will be responsible for ensuring that the obligations under the GDPR are being met
  • Set a plan for each of the new rights that the GDPR enforces

Software Strategies to comply with GDPR

…in case you are working with immutable events

  • Encryption
  • One stream per entity
  • Set retention period when possible
  • Provide a single view of the data

Always try to use

  • $correlationId values
  • a map of where the personal data are stored

Encryption

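Encrypt the data related to a particular person with a symmetric key that belongs only to that person.
Delete the key when the person claims the Right to be Forgotten: the events stay in the stream, but the personal data in them can no longer be read.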
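
The idea above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: `KeyVault`, `append_event` and `read_personal_data` are hypothetical names, the event log is a plain list, and the cipher is a toy keystream built from the standard library so that the example runs anywhere. It is not production cryptography; in a real system use a vetted authenticated cipher from a maintained library.

```python
import hashlib
import hmac
import json
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (HMAC-SHA256 in counter mode). Illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


class KeyVault:
    """Hypothetical mutable key store, kept outside the immutable event log."""

    def __init__(self):
        self._keys = {}

    def key_for(self, person_id: str) -> bytes:
        # Create the person's key on first use, reuse it afterwards.
        if person_id not in self._keys:
            self._keys[person_id] = secrets.token_bytes(32)
        return self._keys[person_id]

    def get(self, person_id: str):
        return self._keys.get(person_id)

    def forget(self, person_id: str) -> None:
        # Right to be Forgotten: only the key is deleted, never the events.
        self._keys.pop(person_id, None)


def append_event(log: list, vault: KeyVault, person_id: str, event_type: str, personal_data: dict) -> None:
    payload = encrypt(vault.key_for(person_id), json.dumps(personal_data).encode())
    log.append({"type": event_type, "personId": person_id, "data": payload})


def read_personal_data(vault: KeyVault, event: dict):
    key = vault.get(event["personId"])
    if key is None:
        return None  # key was deleted, the payload is unreadable
    return json.loads(decrypt(key, event["data"]))
```

The design choice is that the only mutable piece is the key store. The event log itself is never rewritten, which is what makes this approach attractive with immutable events.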

GDPR streams encryption

One stream per entity

Keep the data in well-defined, separate streams. Delete the related streams when the person claims the Right to be Forgotten.
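
A minimal sketch of this strategy, assuming a `category-personId` naming convention for streams. The in-memory store, the naming convention and the function names are all hypothetical and stand in for whatever your event store client provides.

```python
from collections import defaultdict


class InMemoryEventStore:
    """Hypothetical stand-in for an event store client."""

    def __init__(self):
        self._streams = defaultdict(list)

    def append(self, stream: str, event: dict) -> None:
        self._streams[stream].append(event)

    def read(self, stream: str) -> list:
        return list(self._streams.get(stream, []))

    def delete_stream(self, stream: str) -> None:
        self._streams.pop(stream, None)


def personal_stream(category: str, person_id: str) -> str:
    # Assumed convention: one stream per person and category.
    return f"{category}-{person_id}"


def forget_person(store: InMemoryEventStore, person_id: str, categories: list) -> list:
    """Delete every stream that belongs to one person; return the deleted names."""
    deleted = []
    for category in categories:
        name = personal_stream(category, person_id)
        if store.read(name):
            store.delete_stream(name)
            deleted.append(name)
    return deleted
```

Because each person's data lives in its own streams, forgetting one person never touches the streams of anyone else.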

GDPR one stream per entity

Set a retention period on incoming streams

Items older than the $maxAge set on a stream will automatically be removed from the data collection streams.
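
The effect of a retention period can be sketched as a simple filter. This is only a model of the behaviour, assuming the maximum age is expressed in seconds; the event shape and the function name are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def apply_max_age(events: list, max_age_seconds: int, now: datetime) -> list:
    """Keep only the events that are not older than the retention period."""
    cutoff = now - timedelta(seconds=max_age_seconds)
    return [event for event in events if event["created"] >= cutoff]


if __name__ == "__main__":
    now = datetime(2018, 5, 25, tzinfo=timezone.utc)
    events = [
        {"id": 1, "created": now - timedelta(days=40)},
        {"id": 2, "created": now - timedelta(days=5)},
    ]
    thirty_days = 30 * 24 * 60 * 60
    print([e["id"] for e in apply_max_age(events, thirty_days, now)])
```

With a 30 day retention period, the 40 day old event is dropped and the 5 day old one is kept.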

GDPR retention period

Provide a single view of the data

GDPR single view of data
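
One possible way to build such a view is to combine the two habits mentioned earlier: a map of where the personal data are stored and a $correlationId on every event. The sketch below assumes that all events about one person carry the same $correlationId in their metadata. The map contents, the event shape and the function name are hypothetical.

```python
import json

# Hypothetical map of where personal data lives: event type -> personal fields.
PERSONAL_DATA_MAP = {
    "CustomerRegistered": ["name", "email"],
    "AddressChanged": ["address"],
    "OrderPlaced": [],
}


def single_view(events: list, correlation_id: str) -> dict:
    """Fold all events for one correlation id into a single document."""
    personal_data = {}
    source_events = []
    for event in events:
        if event["metadata"].get("$correlationId") != correlation_id:
            continue
        source_events.append(event["type"])
        for field_name in PERSONAL_DATA_MAP.get(event["type"], []):
            if field_name in event["data"]:
                # Later events overwrite earlier values for the same field.
                personal_data[field_name] = event["data"][field_name]
    return {"personalData": personal_data, "sourceEvents": source_events}


def export_as_json(events: list, correlation_id: str) -> str:
    """Serialize the view in a commonly used format for a portability request."""
    return json.dumps(single_view(events, correlation_id), sort_keys=True)
```

The same document can back both a data portability request and the review of what would have to be erased.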
