Introduction
Messaging enables data or commands to be sent across the network, using a “send and forget” approach. The caller can send the information and then carry on with other work, while the information is transmitted by the messaging system.
Optionally, the caller can later be notified of the result through a callback.
Asynchronous calls and callbacks can make a design more complex than a synchronous approach. However, an asynchronous call can be retried until it succeeds, which makes the communication much more reliable. Asynchronous calls also allow for throttling of requests and load balancing.
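The retry-and-callback behaviour described above can be sketched in a few lines. This is only an illustration using Python's standard library: FlakyTransport, send_with_retry and send_async are hypothetical names invented for the example, and a real messaging system would normally handle retries inside the broker or client library.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class FlakyTransport:
    """Hypothetical transport that fails a fixed number of times before succeeding."""

    def __init__(self, failures):
        self.failures = failures
        self.delivered = []

    def send(self, message):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("network unavailable")
        self.delivered.append(message)
        return "ack:" + message


def send_with_retry(send, message, max_attempts=5, delay=0.01):
    """Call send(message) until it succeeds or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send(message)
        except ConnectionError:
            if attempt == max_attempts:
                raise
            time.sleep(delay * attempt)  # simple linear back-off between attempts


def send_async(executor, transport, message, on_result):
    """Hand the send to a worker thread and return to the caller immediately."""
    future = executor.submit(send_with_retry, transport.send, message)
    # The callback runs once the send has finally succeeded.
    future.add_done_callback(lambda f: on_result(f.result()))
    return future


transport = FlakyTransport(failures=2)
results = []
with ThreadPoolExecutor(max_workers=1) as executor:
    send_async(executor, transport, "order-42", results.append)
    # The caller is free to carry on with other work here.
# Leaving the "with" block waits for the worker, so the callback has run by now.
```

The caller never blocks on the network: the two simulated failures are absorbed by the retry loop, and the result only reaches the caller through the callback.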
Why Use Messaging?
It is more immediate than File Transfer, better encapsulated (XML, JSON, etc.) than a Shared Database and more reliable than Remote Procedure Invocation.
It is Platform and Language Agnostic, allowing multiple systems to integrate using the lowest common denominator, such as a flat file.
It allows the sender and receiver systems to use Asynchronous Communication, unlike typical REST and RPC calls, where the caller waits for the response. The messaging service is responsible for transmitting the message reliably from the sender to the receiver. The sender and receiver can process messages at their own pace, so each can run at its maximum throughput. This makes messaging an ideal communication mechanism for distributed systems.
The messaging service can also provide Mediation services, Decoupling between the systems and even Transforming the message to make it suitable for the receiver if required.
It allows us to scale producers and consumers independently, and it provides a degree of fault tolerance against processing errors.
REST and RPC do not work well for streaming a large volume of data in a pub-sub communication system, where the desire is to abstract away the location of producers and consumers. Messaging systems solve many of these problems, including the need for service discovery and distribution.
Examples of message brokers include RabbitMQ, ActiveMQ, Azure Service Bus and Amazon Simple Queue Service. ZeroMQ is a brokerless messaging library rather than a broker. Kafka, on the other hand, is a distributed streaming platform.
Asynchronous Messaging
This is a messaging mechanism where message production by the producer is decoupled from its processing by a consumer.
It allows the producer and consumer to process messages at their own pace.
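A minimal in-process sketch of this decoupling, assuming a standard-library queue.Queue stands in for the messaging system and a None sentinel marks the end of the stream (both are choices made for this example, not features of any particular broker):

```python
import queue
import threading


def producer(q, items):
    """Enqueue messages without waiting for them to be processed."""
    for item in items:
        q.put(item)
    q.put(None)  # sentinel: tells the consumer there are no more messages


def consumer(q, processed):
    """Take messages off the queue whenever the consumer is ready for them."""
    while True:
        item = q.get()
        if item is None:
            break
        processed.append(item.upper())  # stand-in for real processing work


message_queue = queue.Queue()
processed = []
worker = threading.Thread(target=consumer, args=(message_queue, processed))
worker.start()
producer(message_queue, ["a", "b", "c"])  # returns as soon as everything is queued
worker.join()
```

The producer finishes as soon as its messages are queued, whether or not the consumer has caught up. The queue absorbs the difference in speed between the two sides.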
Challenges of Asynchronous Messaging
Complex Programming Model - It requires the software to use an event-driven programming model, in which separate threads handle incoming events or callbacks. This makes debugging more difficult. Producers and consumers must also maintain persistent connections to the messaging broker, which is very different from synchronous REST and RPC communication.
Sequence issues - If the communication is state sensitive, the software needs to re-sequence the incoming messages that it handles.
Vendor lock-in - Many messaging systems rely on proprietary protocols or client APIs, although open protocols such as AMQP and MQTT exist.
Additional Cost and Complexity - A messaging broker is a separate system that must be deployed and operated alongside the producer and consumer, which makes the overall architecture more complex.
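The re-sequencing mentioned under Sequence issues can be sketched as follows. This assumes every message carries a consecutive integer sequence number assigned by the producer. The Resequencer class is a hypothetical helper written for this example, not part of any messaging library.

```python
class Resequencer:
    """Buffers out-of-order messages and releases them in sequence order."""

    def __init__(self, first_seq=0):
        self.next_seq = first_seq  # sequence number expected next
        self.pending = {}          # early arrivals, keyed by sequence number

    def accept(self, seq, payload):
        """Take one message and return the payloads that are now deliverable."""
        if seq < self.next_seq:
            return []  # already delivered: treat as a duplicate and drop it
        self.pending[seq] = payload
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready


reseq = Resequencer()
delivered = []
# Messages arrive out of order: 1, 0, 3, 2
for seq, payload in [(1, "b"), (0, "a"), (3, "d"), (2, "c")]:
    delivered.extend(reseq.accept(seq, payload))
```

A message that arrives early is held back until the gap before it is filled. The cost is buffer memory and added latency whenever a message is delayed or lost.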
Publish/Subscribe Subscriptions
There are generally two types of subscriptions for messaging queues:
Ephemeral: The subscription is only active as long as the consumer is up and running. Once the consumer shuts down, its subscription and any yet-to-be-processed messages are lost.
Durable: The subscription is maintained as long as it is not explicitly deleted. When the consumer shuts down, the messaging platform maintains the subscription and message processing can be resumed later.
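The difference between the two subscription types can be illustrated with a toy in-memory broker. The Broker class and its subscribe, disconnect, publish and receive methods are hypothetical and exist only for this sketch. Real brokers expose the same idea through their own APIs.

```python
from collections import deque


class Broker:
    """Toy in-memory broker with ephemeral and durable subscriptions."""

    def __init__(self):
        self.subscriptions = {}  # subscription name -> state

    def subscribe(self, name, durable):
        sub = self.subscriptions.get(name)
        if sub is None:
            sub = {"durable": durable, "backlog": deque(), "connected": False}
            self.subscriptions[name] = sub
        sub["connected"] = True  # a durable subscriber resumes its old backlog

    def disconnect(self, name):
        sub = self.subscriptions[name]
        if sub["durable"]:
            sub["connected"] = False      # subscription and backlog are kept
        else:
            del self.subscriptions[name]  # subscription and backlog are lost

    def publish(self, message):
        for sub in self.subscriptions.values():
            sub["backlog"].append(message)

    def receive(self, name):
        sub = self.subscriptions.get(name)
        if sub is None or not sub["connected"]:
            return []
        messages = list(sub["backlog"])
        sub["backlog"].clear()
        return messages


broker = Broker()
broker.subscribe("audit", durable=True)
broker.subscribe("dashboard", durable=False)
broker.publish("m1")
broker.disconnect("audit")      # both consumers shut down before processing m1
broker.disconnect("dashboard")
broker.publish("m2")            # published while both consumers are down
broker.subscribe("audit", durable=True)
broker.subscribe("dashboard", durable=False)
audit_messages = broker.receive("audit")
dashboard_messages = broker.receive("dashboard")
```

After reconnecting, the durable subscriber receives both the unprocessed message and the one published while it was offline. The ephemeral subscriber starts again with nothing.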
Integration Challenges
Business applications generally focus on specific functional areas such as CRM, Billing and Finance. Hence, IT groups are usually organised around these functional areas.
With enterprise integration, application groups no longer control specific applications, because each application is part of an overall flow of integration applications and services.
Legacy systems are difficult to change if they are not already integrated into the enterprise bus, and they may use proprietary integration mechanisms.
Even if these systems can be changed, maintaining them can be a daunting task.
Takeaways
The system architect must weigh the costs and benefits of introducing asynchronous messaging, of maintaining persistent connections, and of decoupling producers and consumers.
Traditional enterprise messaging standards such as JMS were not intended for Big Data. Today, large-scale data processing is achieved using streaming platforms like Kafka and Amazon Kinesis.