Why Kafka?
Multiple Producers: Kafka can handle multiple producers, whether they write to a single topic or to many topics.
Multiple Consumers: Multiple consumers can read from a topic, without interfering with each other. This contrasts with other queuing systems where once a message is consumed by a client, it is not available to other consumers.
Durable Disk Retention: Messages are committed to disk and retained according to configurable rules. If a consumer falls behind, whether because of slow processing or a burst in traffic, it can catch up when ready without the data being lost.
Scalable: Kafka can start with a single broker and grow to many brokers, with no impact on the availability of the system as a whole.
High Performance: Because the system scales out horizontally, it performs well under high load and keeps latency low for producers and consumers.
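To make the "Multiple Producers" and "Multiple Consumers" points concrete, here is a minimal sketch of the underlying idea: an append-only log in which every consumer tracks its own read position. It is a toy in-memory model, not Kafka's client API. The class `TopicLog`, its methods, and the producer and consumer names are all made up for illustration, and retention rules are not modelled.

```python
class TopicLog:
    """Toy in-memory stand-in for a single topic (hypothetical, not Kafka's API)."""

    def __init__(self):
        self._messages = []   # append-only list of messages
        self._offsets = {}    # next offset to read, keyed by consumer name

    def produce(self, message):
        """Append a message to the log and return its offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def poll(self, consumer, max_messages=10):
        """Return up to max_messages unread messages for the given consumer."""
        start = self._offsets.get(consumer, 0)
        batch = self._messages[start:start + max_messages]
        # Reading only advances this consumer's offset; the log is untouched.
        self._offsets[consumer] = start + len(batch)
        return batch


log = TopicLog()

# Two producers write to the same topic.
for producer in ("orders-service", "billing-service"):
    for i in range(3):
        log.produce(f"{producer}:{i}")

fast = log.poll("analytics")               # reads everything in one go
slow = log.poll("audit", max_messages=2)   # falls behind after two messages
caught_up = log.poll("audit")              # later catches up on the rest
```

Because reading only moves a per-consumer offset, the slow `audit` consumer still sees every message the `analytics` consumer saw, which is the property that distinguishes this design from a queue that deletes a message once it has been consumed.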
Questions still to be researched
Here are some questions that are left to be explored:
- How can a messaging system (like Kafka) be used for delivery across geographic regions?
Given that the messaging system is responsible for reliably delivering messages, it would be ideal to have a solution that works across regions, with producers in different regions. Master-master replication is not available in Kafka, and Kafka MirrorMaker can only mirror in one direction. Hence this question needs further research.
Would a solution like AWS Kinesis be the answer?