Friday, 29 January 2016

CAP Theorem



The CAP theorem describes a trade-off between three properties:
  1. Consistency: You get the same response whichever node you ask.
  2. Availability: Every request gets an answer.
  3. Partition tolerance: The system continues to process data even if part of it is unreachable.

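A distributed system can only guarantee two of them at the same time. In practice, when a network partition occurs, the choice is between consistency and availability.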

A-P System. Always available, eventually consistent.
  • Even without a network failure, data replication between database nodes is not instantaneous.
  • At some point in the future, all nodes will see the updated data.
  • Users always receive a response, but it may contain old data.
  • Usually the best and easiest option to build and scale.
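
The A-P behaviour described above can be sketched in a few lines of Python. This is only an illustration, not how any particular database works: the names `APStore`, `Replica` and `replicate` are made up for this example, and the single global version counter is a simplifying assumption (real systems use timestamps, vector clocks or similar).

```python
class Replica:
    """One database node holding its own copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (version, value)

    def write(self, key, value, version):
        # Keep the newest version only (last write wins).
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


class APStore:
    """Hypothetical A-P store: always answers, replicates later."""

    def __init__(self, names):
        self.replicas = {n: Replica(n) for n in names}
        self.pending = []  # queued replication: (target, key, value, version)
        self.version = 0   # assumption: one global counter orders all writes

    def put(self, node, key, value):
        # The write is accepted by one node and queued for the others.
        self.version += 1
        self.replicas[node].write(key, value, self.version)
        for other in self.replicas:
            if other != node:
                self.pending.append((other, key, value, self.version))

    def get(self, node, key):
        # Always answers from the local copy, even if it is stale.
        return self.replicas[node].read(key)

    def replicate(self):
        # Stands in for asynchronous replication catching up.
        for target, key, value, version in self.pending:
            self.replicas[target].write(key, value, version)
        self.pending.clear()


store = APStore(["a", "b"])
store.put("a", "x", "v1")
store.replicate()
store.put("a", "x", "v2")
print(store.get("a", "x"))  # v2
print(store.get("b", "x"))  # v1 (stale, replication has not run yet)
store.replicate()
print(store.get("b", "x"))  # v2 (eventually consistent)
```

Node "b" never refuses a request. It just answers with old data until replication catches up.
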
C-P System. Always consistent, not always available.
  • Getting multi-node consistency right is very hard: use a data store or lock service that offers these characteristics. Don't build it yourself.
  • When database nodes can't talk to each other, we cannot ensure consistency, so we refuse to respond to the request (no availability).
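
A toy sketch of the C-P behaviour, for illustration only (as said above, don't build this yourself). `CPStore`, `Unavailable` and the `partitioned` flag are hypothetical names, and the rule "refuse everything unless all nodes can talk to each other" is the strictest possible simplification. Real C-P stores typically keep working as long as a majority of nodes can still reach each other.

```python
class Unavailable(Exception):
    """Raised when a request is refused to protect consistency."""


class CPStore:
    """Hypothetical C-P store: consistent answers or no answer at all."""

    def __init__(self, names):
        self.data = {n: {} for n in names}
        self.partitioned = False  # set to True to simulate a network partition

    def _check(self):
        # Assumption: any partition makes the whole store refuse requests.
        if self.partitioned:
            raise Unavailable("nodes cannot talk to each other")

    def put(self, key, value):
        self._check()
        # The write is applied to every node before it is acknowledged.
        for replica in self.data.values():
            replica[key] = value

    def get(self, node, key):
        self._check()
        return self.data[node].get(key)


store = CPStore(["a", "b"])
store.put("x", "v1")
print(store.get("a", "x"), store.get("b", "x"))  # v1 v1

store.partitioned = True
try:
    store.get("b", "x")
except Unavailable as err:
    print("refused:", err)  # refused: nodes cannot talk to each other
```

Every node gives the same answer while the network is healthy, and no node gives any answer during the partition.
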
C-A System.
  • CA systems don't exist in distributed systems. If your system has no partition tolerance, it's a single process running locally that can't be divided over a network.
You may have some AP capabilities and some CP capabilities mixed in the same system. This usually happens with systems built around micro-services.

