Distributed Computing


Fallacies of Distributed Computing:

  1. The network is reliable
  2. Latency is zero
  3. Bandwidth is infinite
  4. The network is secure
  5. Topology doesn’t change
  6. There is one administrator
  7. Transport cost is zero
  8. The network is homogeneous

CAP theorem: it is impossible for a distributed data store to simultaneously provide more than two of the following three guarantees:

  • Consistency
  • Availability
  • Partition Tolerance

ACID: a set of properties of database transactions intended to guarantee validity even in the event of errors, power failures, etc.

  • Atomicity
  • Consistency
  • Isolation
  • Durability
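
Atomicity can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The accounts table, the names alice and bob, and the transfer helper are hypothetical and exist only for this example. The second transfer fails halfway through, and the credit that was already applied is rolled back.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    # Used as a context manager, the connection commits if the block
    # succeeds and rolls back if an exception escapes the block.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

transfer(conn, "alice", "bob", 30)       # succeeds
try:
    transfer(conn, "alice", "bob", 500)  # debit violates the CHECK constraint
except sqlite3.IntegrityError:
    pass                                 # the earlier credit to bob is undone

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After both calls, only the first transfer is visible in balances. The CHECK constraint stands in for the consistency rule that a balance may not go negative.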

BASE: Basically Available, Soft state, Eventual consistency

HA = High Availability = Minimal downtime; the system stays operational for a very high share of the time
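
Availability targets are often quoted as a percentage of uptime. The sketch below converts such a percentage into a downtime budget; the function name and the one-year default period are assumptions made for illustration.

```python
def allowed_downtime_minutes(availability_pct, period_hours=365 * 24):
    """Minutes of downtime permitted per period at a given availability."""
    return period_hours * 60 * (1 - availability_pct / 100)

# Compare a few common targets over one (non-leap) year.
for pct in (99.0, 99.9, 99.99):
    print(f"{pct}% -> {allowed_downtime_minutes(pct):.1f} min/year")
```

For example, 99.9% availability leaves roughly 525.6 minutes (under 9 hours) of downtime per year.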

SPOF = Single Point of Failure = A component that, if it fails, stops the entire system from working

Defining failure

  • RPO = Recovery Point Objective = How much data we can afford to lose
  • RTO = Recovery Time Objective = How long it takes to recover
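
As a rough sketch of how these two objectives are used, the check below compares a periodic backup setup against an RPO and an RTO. The function name, the parameters, and the sample durations are hypothetical.

```python
from datetime import timedelta

def check_objectives(backup_interval, restore_duration, rpo, rto):
    """Return (rpo_met, rto_met) for a periodic-backup setup.

    Worst case, a failure just before the next backup loses one full
    backup interval of data, so that interval is compared against the RPO.
    """
    return backup_interval <= rpo, restore_duration <= rto

rpo_met, rto_met = check_objectives(
    backup_interval=timedelta(hours=1),      # hypothetical: hourly backups
    restore_duration=timedelta(minutes=30),  # hypothetical: measured restore time
    rpo=timedelta(minutes=15),
    rto=timedelta(hours=1),
)
```

Here the RTO is met, but hourly backups can lose up to an hour of data, which misses a 15-minute RPO.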

MTBF = Mean-Time-Between-Failures = Average operating time between one failure and the next
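
A minimal sketch of the arithmetic: MTBF is the total operating time divided by the number of failures. The availability formula also uses MTTR (mean time to repair), a related metric that these notes do not define; it is brought in here as an assumption, and the sample numbers are made up.

```python
def mtbf(uptimes_hours):
    """Mean time between failures: total operating time / number of failures."""
    return sum(uptimes_hours) / len(uptimes_hours)

def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Hypothetical hours of operation before each of three failures.
uptimes = [700.0, 650.0, 750.0]
m = mtbf(uptimes)
a = availability(m, mttr_hours=2.0)
```

With these numbers the MTBF is 700 hours, and a 2-hour repair time gives an availability of 700/702, or about 99.7%.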

Hystrix: Netflix's latency and fault tolerance library (implements the circuit breaker pattern)
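
Hystrix is a Java library, so the following is not its API. It is a simplified Python sketch of the circuit breaker pattern that such libraries implement; the class, the parameter names, and the defaults are all assumptions.

```python
import time

class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""

class CircuitBreaker:
    def __init__(self, max_failures=3, reset_timeout=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.clock = clock          # injectable so tests can control time
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Fail fast instead of waiting on a dependency known to be bad.
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()   # open (or re-open) the circuit
            raise
        # A success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller wraps each remote call as breaker.call(fetch, url), where fetch is whatever function talks to the dependency. After max_failures consecutive errors, further calls are rejected immediately until reset_timeout has passed.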

Data Deduplication: Technique for eliminating duplicate copies of repeating data
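
One common way to deduplicate is to split data into chunks and store each distinct chunk once, keyed by a hash of its content. The sketch below assumes fixed-size chunks and SHA-256; the DedupStore class and the tiny chunk size are illustrative only.

```python
import hashlib

class DedupStore:
    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes; each unique chunk stored once

    def put(self, data: bytes):
        """Store data; return the list of chunk digests needed to rebuild it."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # skip chunks already stored
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Rebuild the original bytes from a list of chunk digests."""
        return b"".join(self.chunks[d] for d in recipe)

store = DedupStore(chunk_size=4)
recipe = store.put(b"ABCDABCDABCDEFGH")
```

The 16-byte input is split into four chunks, but only two distinct chunks (ABCD and EFGH) are actually stored.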

Types of delivery semantics:

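  • at-most-once: a message is delivered zero or one times; it may be lost but is never duplicated
  • at-least-once: a message is delivered one or more times; it is never lost but may be duplicated
  • exactly-once: a message is delivered once and only once; it is neither lost nor duplicated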
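
In practice, exactly-once behavior is often approximated by combining at-least-once delivery with a consumer that ignores duplicates. The sketch below simulates this under simplifying assumptions: the sender retries until it sees an acknowledgement, and the consumer tracks message IDs in memory. All names are hypothetical.

```python
class IdempotentConsumer:
    def __init__(self):
        self.seen_ids = set()
        self.total = 0

    def handle(self, message_id, amount):
        """Apply the message once; return False for a duplicate delivery."""
        if message_id in self.seen_ids:
            return False
        self.seen_ids.add(message_id)
        self.total += amount
        return True

def deliver_at_least_once(consumer, message_id, amount, ack_lost_times):
    """Resend until acknowledged; the ack is 'lost' ack_lost_times times."""
    attempts = 0
    while True:
        attempts += 1
        consumer.handle(message_id, amount)
        if attempts > ack_lost_times:   # this attempt's ack finally arrives
            return attempts

consumer = IdempotentConsumer()
attempts = deliver_at_least_once(consumer, "m1", 10, ack_lost_times=2)
```

The message is delivered three times, but its effect on consumer.total is applied only once. A real system would need to persist the seen IDs so that deduplication survives restarts.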