Publish/Subscribe message brokers enable publishers and
subscribers to communicate with each other in a loosely coupled manner. Pub/Sub
brokers have found wide adoption through Java Messaging Service (JMS), and
later through WS-Eventing and WS-Notification specifications. Read the paper "Many faces of publish/subscribe" if you are new to the topic. It is a great introduction.
A trivial deployment would involve a standalone server.
However, such a server would be a single point of failure and have limited
scalability. One solution to all these problems is a distributed broker
hierarchy, like Narada Broker (www.naradabrokering.org)
or Wihidum (http://code.google.com/p/wihidum/
). Distributed brokers, however, are more or less a research topic yet. It is
still possible to achieve the same (almost the same) using a cluster fronted
with an intelligent load balancer. I am trying to explain few such deployment
architectures in this post.
Fault tolerant deployment and handling large number of events
The Job of a publish/subscribe broker is to receive events
and deliver them to consumers who have expressed interest on the event through
a subscription. In the above setup, load balancer sends subscription messages
to all brokers, and sends event publications to a one broker. Since all
subscriptions are sent to all nodes, each node knows about all subscriptions. The node that receives the subscription
can deliver the message to the subscribers.
It is worth noting that if reliability is needed, the broker
has to write the message to the persistent storage before acknowledging the
sender when it receives a publication message.
Therefore, load balancer has to do a blocking call to the broker. Also,
note that since subscription messages are less frequent than publication
messages, broadcast incurs much smaller overhead.
Handling Large number of subscriptions
Figure 2: Partition Subscriptions across the Nodes |
Sometime there are many subscriptions with moderate number
of messages, and in such cases, one node might not be capable of handling all
the subscriptions. Figure 2 presents a deployment for such cases.
Here the load balancer will send subscription messages specific
node, but will send publication messages to all nodes. Effectively,
subscriptions are partitioned across the nodes, and events are sent to all
nodes. Each node deliver events to subscribers assigned to that node.
However, it is worth noting this works for pub/sub case, but
not for distributed queues (e.g. JMS queue) case. In that case, consumers
connect to the broker and retrieve events in contrast to pub/sub model, and
when a consumer connection to any node, the cluster has to make sure that node
has upto-date information.
If there is any downside on this approach, that is it need a intelligent application level load balancer. For most workloads this should be OK. For example, WSO2 ESB can achieve 2500/TPS on 2-core basic machine, and 10,000+ on a high end machine, which should be sufficient for most usecases.
No comments:
Post a Comment