FullstackCodingGuy edited this page Jul 29, 2024 · 6 revisions

Kafka is a distributed event streaming platform. It is more than a messaging system: it can serve as the central hub of an integration architecture.

Kafka

  • is an event ledger, keeping track of all the messages that come in
  • is distributed in nature
  • is a redundant system (records can be replicated across brokers)
  • uses messaging system semantics (it behaves much like a messaging system)
  • treats clustering as a core principle: multiple nodes share the load
  • provides durability and ordering guarantees

Use cases

Kafka is a good fit for use cases such as:

  • Asynchronous processing (where synchronization is hard)
  • Scaling ETL Jobs / Data Pipelines / Big Data Ingest
  • Error-prone processing (e.g., parsing logic that may throw exceptions on invalid payload data)
  • Event store (to go back and replay or retry certain operations)
  • Distributed Processing

Why Kafka?

Ordering

Messages often need to be delivered in sequential order. For example, "create order" must be processed before "update order" for the same order, never the other way around.
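Kafka preserves order within a partition, and records that share a key are normally written to the same partition. The sketch below illustrates that idea in plain Python, without a Kafka client. The names `NUM_PARTITIONS`, `partition_for`, and `route` are hypothetical, and CRC32 is only a stand-in for the hash a real Kafka producer uses.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count for this sketch


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition. CRC32 is a stand-in, not Kafka's real hash."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(events):
    """Append each (key, value) event to the partition chosen by its key."""
    partitions = defaultdict(list)
    for key, value in events:
        partitions[partition_for(key)].append((key, value))
    return partitions


# Events for two orders arrive interleaved.
events = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "updated"),
    ("order-2", "updated"),
]
partitions = route(events)

# All events for "order-1" sit in one partition, in arrival order.
order_1_events = [
    value
    for key, value in partitions[partition_for("order-1")]
    if key == "order-1"
]
print(order_1_events)  # ['created', 'updated']
```

Because the partition depends only on the key, "created" and "updated" for the same order can never end up in different partitions, so a reader of that partition sees them in the order they were written.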

Horizontal Scaling

Push Vs Pub/Sub

Operations

Kafka operates on records. Each record has these properties:

  • Key, Value, Timestamp
  • Immutable
  • Append Only
  • Persisted
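The record properties above can be sketched with an in-memory log. This is an illustration under simplifying assumptions, not how Kafka is implemented: `Record` and `AppendOnlyLog` are hypothetical names, and persistence to disk is left out.

```python
import time
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)  # frozen=True makes instances immutable
class Record:
    key: str
    value: str
    timestamp: float


class AppendOnlyLog:
    """In-memory stand-in for a log: records can be appended and read only."""

    def __init__(self):
        self._records = []

    def append(self, key: str, value: str) -> int:
        """Add a record at the end of the log and return its offset."""
        self._records.append(Record(key, value, time.time()))
        return len(self._records) - 1

    def read(self, offset: int) -> Record:
        """Return the record stored at the given offset."""
        return self._records[offset]


log = AppendOnlyLog()
first = log.append("order-1", "created")
second = log.append("order-1", "updated")

# Trying to change a stored record fails because Record is frozen.
try:
    log.read(first).value = "tampered"
    mutated = True
except FrozenInstanceError:
    mutated = False

print(first, second, mutated)  # 0 1 False
```

An update is expressed as a new record appended after the old one, which is why both "created" and "updated" remain in the log.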

3 Components

  • Broker: Node in the cluster
  • Producer: Writes records to a broker
  • Consumer: Reads records from a broker

Kafka does not push records to consumers. Instead, consumers connect to brokers and ask for records.
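A minimal sketch of this pull model, using plain Python objects rather than a real Kafka client. The `Broker` and `Consumer` classes and their `fetch` and `poll` methods are hypothetical names chosen for illustration. The point is that the consumer tracks its own position (offset) and decides when to ask for more.

```python
class Broker:
    """In-memory stand-in for a broker that stores records in arrival order."""

    def __init__(self):
        self._log = []

    def append(self, record):
        self._log.append(record)

    def fetch(self, offset, max_records):
        """Return up to max_records records starting at offset."""
        return self._log[offset:offset + max_records]


class Consumer:
    """Pulls records from the broker and remembers how far it has read."""

    def __init__(self, broker):
        self._broker = broker
        self.offset = 0

    def poll(self, max_records=2):
        records = self._broker.fetch(self.offset, max_records)
        self.offset += len(records)  # advance only past what was received
        return records


broker = Broker()
for value in ["a", "b", "c"]:
    broker.append(value)

consumer = Consumer(broker)
first_batch = consumer.poll()   # ['a', 'b']
second_batch = consumer.poll()  # ['c']
third_batch = consumer.poll()   # [] (nothing new yet)
print(first_batch, second_batch, third_batch)
```

The broker never calls the consumer. A slow consumer simply polls less often and resumes from its stored offset, which is one reason the pull model copes well with consumers of different speeds.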
