CAP Theorem

0
17

CAP Theorem (also known as Brewer’s Theorem) states that in a distributed system, it is impossible to simultaneously guarantee Consistency, Availability, and Partition Tolerance. You can only achieve two of these three properties at the same time.

Here is a breakdown of the three components:

  1. Consistency (C):
    Every read receives the most recent write. All nodes in the system return the same, up-to-date data.
  2. Availability (A):
    Every request (read or write) receives a response, even if some nodes are down.
  3. Partition Tolerance (P):
    The system continues to operate even if there is a network partition (communication failure between nodes).

Since network partitions (P) are unavoidable in distributed systems, the CAP theorem effectively forces a trade-off between Consistency (C) and Availability (A). Based on this, distributed databases are classified into two main types:

  • CP (Consistency + Partition Tolerance): Prioritizes consistency over availability.
  • AP (Availability + Partition Tolerance): Prioritizes availability over consistency.

Databases Supporting CAP Combinations

CAP PropertyDatabase TypeExamplesExplanation
CP (Consistency + Partition Tolerance)Strongly Consistent Databases– MongoDB (in strong consistency mode) – HBase – Redis (Single-node) – ZookeeperThese systems prioritize consistency over availability when a partition occurs. They ensure all nodes agree on the latest data, but might sacrifice availability.
AP (Availability + Partition Tolerance)Highly Available Databases– DynamoDB – Cassandra – CouchDB – RiakThese systems prioritize availability over consistency during network partitions. They ensure the system remains available, but the data may not be up-to-date (eventual consistency).
CA (Consistency + Availability)Not possible in partition-tolerant systems– Traditional Relational Databases (single node): PostgreSQL, MySQLCA systems are achievable only if there are no network partitions. This is generally possible on a single-node database, where partitions don’t occur.

Why is CA Not Possible in Distributed Systems?

In a distributed environment, network partitions (P) are given because systems can fail or experience delays. Therefore, the CA combination cannot exist in a distributed database because maintaining consistency and availability simultaneously during a partition is impossible.


Summary of CAP Trade-offs

  • CP: Strong consistency, but the system may become unavailable during partitions.
    • Example: HBase, Zookeeper, etc.
  • AP: High availability but eventual consistency.
    • Example: DynamoDB, Cassandra, CouchDB.
  • CA: Only achievable in non-distributed systems (no network partitions).
    • Example: Single-node PostgreSQL or MySQL.