Tuesday, 22 November 2022

Azure Cosmos DB Consistency Levels - Strong, Bounded Staleness, Session, Consistent Prefix and Eventual

Relational databases follow the ACID properties (Atomicity, Consistency, Isolation, and Durability) to ensure data integrity when data is added, changed, or removed. For example, if you deposit money in a bank account and the transaction is committed, the relational database must ensure that the new balance is stored and visible to all subsequent transactions.

In Azure Cosmos DB, data can be replicated across multiple geographically distributed datacenters, so it is essential to understand the different consistency levels and how they affect reads and writes. Let’s explore the consistency levels in Azure Cosmos DB so that you can decide which one works best for your data.

Solution

As a distributed database, Azure Cosmos DB relies on replication for high availability, low latency, and high throughput. Applications can fetch data from the nearest replica, which keeps latency low and response times fast. A replica is a copy of the data stored in an Azure region for disaster recovery or geo-distribution, but there are differences in how data is replicated.

The Azure Cosmos DB consistency level defines how writes are committed to the replicas and how consistent reads are across those replicas. Suppose your Cosmos DB account accepts writes in the East US region while users read records from other regions, such as Europe and Asia. Whether those users see the latest data depends on the consistency level you choose, and the geographical distance between regions affects how long replication takes.

Default Consistency

Click on the Default consistency option in the Azure Cosmos DB account and it shows the available consistency levels. Session consistency is the default consistency level.

Consistency Models in Azure Cosmos DB

Azure Cosmos DB has the following consistency levels:

  • Strong
  • Bounded Staleness
  • Session
  • Consistent Prefix
  • Eventual

The following image shows how the consistency levels trade off against availability, latency, and throughput.

Consistency as a spectrum (source: Microsoft)
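
The spectrum can be sketched as an ordered enumeration. This is only an illustration: the class name, the numeric values, and the helper are assumptions made for this example and are not part of any Azure SDK.

```python
from enum import IntEnum


class ConsistencyLevel(IntEnum):
    """The five levels, numbered from strongest (0) to weakest (4)."""
    STRONG = 0
    BOUNDED_STALENESS = 1
    SESSION = 2
    CONSISTENT_PREFIX = 3
    EVENTUAL = 4


def is_weaker_or_equal(requested, reference):
    """True if `requested` gives the same or weaker guarantees than `reference`."""
    return requested >= reference


# Sorting puts the strongest level first.
ordered = sorted(ConsistencyLevel)
```

Moving down the list trades consistency guarantees for availability, latency, and throughput, which is the trade-off the image describes.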

Let’s understand these consistency levels and the difference between them.

Strong Consistency

"Strong consistency is the most predictable and intuitive programming model and guarantees linearizability."

Strong consistency guarantees that reads return the most recent committed version of an item. A client never sees an uncommitted or partial write from a replica.

Strong consistency is suitable for applications that cannot tolerate any data loss when a region goes down. Because every write must be replicated across regions before it is acknowledged, write latency increases substantially, and the greater the distance between regions, the longer each write takes. It is the strictest consistency level in Cosmos DB.

In summary, the strong consistency model has the following properties:

  • Highest Consistency
  • Lowest Performance
  • Lowest Availability

The following image illustrates data replicated to East US 2 and Australia East, with users in each region reading the latest data (#1).

Strong Consistency

Suppose a user updates the data to #2 in West US 2. The update is replicated to East US 2 but has not yet reached Australia East. In this case, all users keep reading the old data (#1) until replication to all replicas is complete.

Strong Consistency

Once the data commits across all replicas, users see the newly committed data (#2) in all regions.

Strong Consistency
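
The three steps above can be mimicked with a toy in-memory model. This is a deliberately simplified sketch of the behaviour the article describes (a version becomes readable only once every region holds it); the `StrongStore` class is hypothetical and does not reflect how Cosmos DB is implemented internally.

```python
class StrongStore:
    """Toy model: a new version is readable only after every region has it."""

    def __init__(self, regions):
        self.pending = {}                           # region -> version received, not yet committed
        self.committed = {r: 1 for r in regions}    # version that readers see (#1 initially)

    def replicate(self, region, version):
        """Deliver a new version to one region; commit once all regions hold it."""
        self.pending[region] = version
        if all(self.pending.get(r) == version for r in self.committed):
            for r in self.committed:
                self.committed[r] = version
            self.pending.clear()

    def read(self, region):
        """Readers only ever see the committed version."""
        return self.committed[region]


regions = ["West US 2", "East US 2", "Australia East"]
store = StrongStore(regions)
store.replicate("West US 2", 2)
store.replicate("East US 2", 2)
before = [store.read(r) for r in regions]   # Australia East is still missing #2
store.replicate("Australia East", 2)
after = [store.read(r) for r in regions]    # now every region serves #2
```

Here `before` is `[1, 1, 1]` and `after` is `[2, 2, 2]`: no reader ever observes a mix of old and new data.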

Bounded Staleness consistency

"Bounded Staleness consistency is used mainly by globally distributed applications that expect low write latencies with total global order guarantees."

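Bounded Staleness consistency gives up a little freshness in exchange for lower latency than strong consistency: reads may lag behind writes, but only by a bounded amount. The bound is configured as a maximum lag in time and a maximum lag in number of operations.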

Bounded Staleness consistency

Maximum lag (time):

  • Single region: between 5 seconds and 1 day.
  • Multiple regions: between 5 minutes and 1 day.

Maximum lag (operations):

  • Single region: between 10 and 1,000,000 operations.
  • Multiple regions: between 100,000 and 1,000,000 operations.
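
The ranges listed above can be captured in a small validation helper. This is a sketch only: the limits are copied from the lists above, and the function and constant names are made up for this example rather than taken from an Azure SDK.

```python
from datetime import timedelta

# (minimum, maximum) per account type, keyed by "is this a multi-region account?"
LAG_TIME_LIMITS = {
    False: (timedelta(seconds=5), timedelta(days=1)),
    True: (timedelta(minutes=5), timedelta(days=1)),
}
LAG_OPERATION_LIMITS = {
    False: (10, 1_000_000),
    True: (100_000, 1_000_000),
}


def validate_bounded_staleness(max_lag_time, max_lag_operations, multi_region):
    """Return a list of problems; an empty list means both settings are in range."""
    problems = []
    low, high = LAG_TIME_LIMITS[multi_region]
    if not low <= max_lag_time <= high:
        problems.append(f"maximum lag (time) must be between {low} and {high}")
    low, high = LAG_OPERATION_LIMITS[multi_region]
    if not low <= max_lag_operations <= high:
        problems.append(f"maximum lag (operations) must be between {low} and {high}")
    return problems
```

For example, `validate_bounded_staleness(timedelta(seconds=5), 10, multi_region=False)` returns an empty list, while the same values with `multi_region=True` produce two problems because both minimums are higher for multi-region accounts.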

The following image shows a user updating data in West US 2. The replicas in East US 2 and Australia East may lag behind the write region, but never by more than the configured maximum number of operations or maximum lag time. Until replication catches up, users reading from those regions may still get the old data (#1).

Bounded Staleness consistency
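
The idea of a bound on staleness can be expressed as a simple check. This is an illustrative sketch, assuming versions are plain increasing integers and that both limits must hold at once; `StalenessBound` is a hypothetical name, not a Cosmos DB type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StalenessBound:
    max_operations: int   # maximum lag (operations)
    max_seconds: float    # maximum lag (time)

    def allows(self, latest_version, replica_version, seconds_behind):
        """True if a read served by this replica is still within the bound."""
        versions_behind = latest_version - replica_version
        return (versions_behind <= self.max_operations
                and seconds_behind <= self.max_seconds)


bound = StalenessBound(max_operations=100_000, max_seconds=300)
ok = bound.allows(latest_version=150_000, replica_version=100_000, seconds_behind=10)
too_old = bound.allows(latest_version=250_000, replica_version=100_000, seconds_behind=10)
```

In this example `ok` is `True` (the replica is 50,000 operations behind) and `too_old` is `False` (150,000 operations behind exceeds the limit).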

The Bounded Staleness consistency model is suitable for globally distributed applications that require low write latencies with a total global order guarantee, for example stock tickers, group collaboration, publish-subscribe, and queuing.

Session Consistency

"Session consistency is the most widely used consistency level for both single-region and globally distributed applications."

It provides good performance and availability.

Session consistency guarantees the following within a single client session:

  • Consistent prefix
  • Monotonic reads and writes
  • Read-your-writes
  • Write-follows-reads

The following image illustrates session consistency in three steps:

  1. A user updates a record to #2 in West US 2. That user's session reads #2 immediately; all other sessions read the old data (#1) until replication completes.
  2. The same user then updates the record to #3 in West US 2 and reads #3 right away (panel 2 of the image). Users in East US 2 and Australia East still read #2 until the data sync finishes.
  3. After the data sync, all users read the latest data (#3) (panel 3 of the image).

For example, with the writer in West US 2, reads issued from the same client session see the same data whether they go to West US 2 or East US 2. A user in a different client session sees the new data only after the data sync completes, but always in the same order in which it was written.

Session Consistency
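
The read-your-writes behaviour described above can be sketched with a per-client token that records the newest version the client has written or read. This is a toy model loosely inspired by the session-token idea; `Replica` and `SessionClient` are hypothetical classes, and real clients handle tokens differently.

```python
class Replica:
    def __init__(self):
        self.version = 1   # highest version this replica has applied


class SessionClient:
    """Toy session: remembers the newest version this client wrote or read."""

    def __init__(self):
        self.session_token = 0

    def write(self, replica, version):
        replica.version = version
        self.session_token = max(self.session_token, version)

    def read(self, replicas):
        """Read from the first replica that is at least as new as the session token."""
        for replica in replicas:
            if replica.version >= self.session_token:
                self.session_token = replica.version
                return replica.version
        raise RuntimeError("no replica has caught up to this session yet")


west, east = Replica(), Replica()
writer, other = SessionClient(), SessionClient()

writer.write(west, 2)
writer_sees = writer.read([east, west])   # skips the stale replica, reads #2
other_sees = other.read([east, west])     # a different session may still read #1
east.version = 2                          # replication catches up
other_sees_later = other.read([east, west])
```

The writer's session reads #2 immediately, while the other session reads #1 first and #2 only after replication reaches East US 2.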

Session consistency is suitable for e-commerce applications, social media apps, and applications that require persistent user connections.

Note: Session is the default consistency level for Azure Cosmos DB accounts.

Consistent Prefix

"Consistent prefix level guarantees that reads never see out-of-order writes."

The consistent prefix model is like bounded staleness but without the operation-count or time-lag bound. It guarantees the order of the writes, but the data is not always current. For example, if data is written in the order A, B, C, a reader may see A, or A, B, or A, B, C. It never sees out-of-order writes such as A, C or A, C, B.

The following image shows a user performing updates in the sequence #2, #3, #4. The replicas receive the data in the same sequence (#2, #3, #4), so reads never see writes out of order.

Consistent Prefix
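
The A, B, C example can be turned into a small check. This is a minimal sketch with a hypothetical helper name; it only tests whether what a reader observed is a leading portion of the write sequence.

```python
def is_consistent_prefix(writes, observed):
    """True if `observed` equals the first len(observed) items of `writes`."""
    writes = list(writes)
    observed = list(observed)
    return observed == writes[:len(observed)]


writes = ["A", "B", "C"]
valid = [is_consistent_prefix(writes, seen)
         for seen in (["A"], ["A", "B"], ["A", "B", "C"])]
invalid = [is_consistent_prefix(writes, seen)
           for seen in (["A", "C"], ["A", "C", "B"])]
```

Every entry in `valid` is `True` and every entry in `invalid` is `False`, matching the allowed and disallowed reads listed above.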

Consistent prefix provides read consistency as of a specific point in time, along with good performance and high availability. It is suitable for workloads that can tolerate some lag but require high availability and low latency.

Eventual Consistency

"Eventual consistency is a weak form of consistency where, over time, the client might get older records than the ones it had seen before."

Eventual consistency does not guarantee the order of the data, nor does it bound how long replication can take. As its name suggests, reads become consistent eventually.

The eventual consistency model provides high availability with low latency and the highest throughput.

The following image illustrates eventual consistency, where no ordering is guaranteed. If a user updates records in the sequence #2, #3, #4, a replicated region might receive the data in a different sequence, such as #4, #2, #3 or #3, #2, #4.

Eventual Consistency
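
The effect on a reader can be sketched as a check for "going back in time". This is an illustration only, assuming versions are increasing integers; `first_regression` is a hypothetical helper that finds the first read that is older than something the reader already saw.

```python
def first_regression(reads):
    """Return the index of the first read older than an earlier read, or None."""
    newest = None
    for index, version in enumerate(reads):
        if newest is not None and version < newest:
            return index
        newest = version if newest is None else max(newest, version)
    return None


out_of_order = first_regression([4, 2, 3])   # reader saw #4, then the older #2
in_order = first_regression([2, 3, 4])       # no read is older than a previous one
```

With the sequence #4, #2, #3 the second read (index 1) is already older than the first, which cannot happen under the stronger levels described earlier.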

It is the weakest consistency model because a client might read values older than the ones it has read before. Therefore, it is suitable for applications that do not require guaranteed ordering, for example counts of retweets, non-threaded comments, and likes.

How to set consistency model for Azure Cosmos DB

In the Azure portal, go to your Azure Cosmos DB account and open the Default consistency option.

Choose the required consistency model and click Save.

How to set consistency model for Azure Cosmos DB
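
The same setting can also be scripted. The sketch below only builds an argument list for the Azure CLI's `az cosmosdb update` command and does not run it; the flags reflect my understanding of the CLI and are worth confirming with `az cosmosdb update --help`. The account and resource group names are placeholders, and `build_update_command` is a hypothetical helper.

```python
VALID_LEVELS = {"Strong", "BoundedStaleness", "Session", "ConsistentPrefix", "Eventual"}


def build_update_command(account, resource_group, level,
                         max_interval=None, max_staleness_prefix=None):
    """Build (but do not run) an `az cosmosdb update` command as an argument list."""
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown consistency level: {level}")
    command = [
        "az", "cosmosdb", "update",
        "--name", account,
        "--resource-group", resource_group,
        "--default-consistency-level", level,
    ]
    # The lag settings only apply to bounded staleness.
    if level == "BoundedStaleness":
        if max_interval is not None:
            command += ["--max-interval", str(max_interval)]            # seconds
        if max_staleness_prefix is not None:
            command += ["--max-staleness-prefix", str(max_staleness_prefix)]  # operations
    return command


# Placeholder names; replace with your own account and resource group.
session_cmd = build_update_command("my-cosmos-account", "my-resource-group", "Session")
bounded_cmd = build_update_command("my-cosmos-account", "my-resource-group",
                                   "BoundedStaleness", 300, 100_000)
```

The resulting list can be passed to `subprocess.run` or joined into a shell command once the names are filled in.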

