MQTT High Availability

Pro Edition for Eclipse Mosquitto provides MQTT High Availability (HA), which keeps your smart automation running and the broker service available even when an individual node fails.

How MQTT High Availability works

Clustering is the key to MQTT High Availability. You need a minimum of three nodes to create an MQTT broker cluster. All information the broker needs to operate is continuously synchronized across the cluster, so every node holds the same state. When one node fails, another automatically takes over all MQTT broker operations. This process is called MQTT failover.

MQTT broker normal operation

One or more load balancers funnel all traffic to the current leader MQTT broker node while the others stay in follower mode (“follower nodes”). Our MQTT Cluster Management (CM) continuously synchronizes the following data within the cluster: persistent sessions, retained messages, message queues, ACLs (Access Control Lists), and authentication information for all clients, as well as the overall cluster status.

MQTT broker failover operation

When the leader node fails, the MQTT broker cluster reorganizes and assigns the leader role to one of the followers. Because of the constant synchronization, the new leader is up to date on the communication status and client information, so it can take over seamlessly and keep the cluster operating without interruption. The load balancers then route all traffic to the new leader.

MQTT broker back to normal operation

Once the failed node is restarted, it rejoins the cluster, becomes available again, and takes on a follower role. The cluster is then back to normal operation.
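The three phases above (normal operation, failover, recovery) can be traced with a toy model. This is only a sketch under stated assumptions, not the product's implementation: the `Cluster` class, the node names, and the rule of promoting the first healthy follower are all hypothetical, since the text does not describe how the new leader is chosen.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Cluster:
    """Toy model of leader/follower roles; not the real cluster manager."""
    nodes: List[str]
    healthy: Set[str] = field(default_factory=set)
    leader: Optional[str] = None

    def __post_init__(self) -> None:
        self.healthy = set(self.nodes)
        self.leader = self.nodes[0]  # assumption: first node starts as leader

    def fail(self, node: str) -> None:
        """Mark a node as down; promote a follower if it was the leader."""
        self.healthy.discard(node)
        if node == self.leader:
            # Assumption: promote the first healthy node in configured order.
            self.leader = next((n for n in self.nodes if n in self.healthy), None)

    def recover(self, node: str) -> None:
        """A restarted node rejoins as a follower; the current leader stays."""
        self.healthy.add(node)
        if self.leader is None:
            self.leader = node

    def followers(self) -> List[str]:
        return [n for n in self.nodes if n in self.healthy and n != self.leader]


cluster = Cluster(["node-1", "node-2", "node-3"])
cluster.fail("node-1")     # failover: a follower takes the leader role
cluster.recover("node-1")  # recovery: the old leader rejoins as a follower
print(cluster.leader, cluster.followers())  # node-2 ['node-1', 'node-3']
```

Note that in this model the restarted node does not reclaim the leader role, which matches the recovery behavior described above.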

What is the role of a Load balancer in the MQTT High Availability setup?

A load balancer performs the crucial monitoring function for the Pro Mosquitto MQTT cluster: it checks which servers are available. Nodes that are not available have their ports closed. When the leader fails, the cluster reorganizes and determines a new leader. As soon as the new leader is known, the load balancer sends all clients there. We recommend using three load balancers with different IPs so that a single load balancer does not become a new single point of failure (SPOF). Usually, the load balancers’ IPs are grouped in your Domain Name System (DNS) under a common hostname. This way, clients can still be configured with a single access point URL even though the load balancing service is redundant.
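Because unavailable nodes keep their ports closed, an availability check can be as simple as attempting a TCP connection. The following is a minimal sketch of that idea using only the Python standard library. It assumes a plain TCP probe; the actual health check used by your load balancer may differ, and the function names `port_is_open` and `find_reachable_node` are hypothetical.

```python
import socket
from typing import List, Optional, Tuple

Node = Tuple[str, int]  # (host, port)


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def find_reachable_node(nodes: List[Node], timeout: float = 1.0) -> Optional[Node]:
    """Return the first node whose port accepts connections, else None."""
    for host, port in nodes:
        if port_is_open(host, port, timeout):
            return (host, port)
    return None
```

A load balancer following this approach would call `find_reachable_node` periodically with the addresses of the three broker nodes (for example on port 1883, the standard unencrypted MQTT port) and forward client traffic to the node it returns.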

Does my use case require MQTT High Availability?

MQTT High Availability ensures that your clients can always reach the broker. But why is enabling MQTT HA so important?

Ensure continuous communication of your clients

Essentially every industrial-grade solution that relies on MQTT as a central piece of its communication should employ a high availability setup. Otherwise, MQTT becomes the single point of failure and jeopardizes your whole solution.

For instance, suppose you use an MQTT broker to implement a smart factory solution that treats all events as published and received messages and performs specific actions based on them. When your broker is offline, it cannot deliver any messages. In other words, your smart solution stops functioning altogether.

Sometimes a broker node might not be reachable. Often, it is not the broker that causes it but the underlying server hardware, operating systems, network connectivity, etc. Therefore, we recommend using MQTT Broker with High Availability to ensure your solution doesn’t suffer from outages.

Avoid loss of data

Many typical MQTT clients run on constrained devices with limited resources and cannot reliably persist data. Even if a client can store data while connectivity is missing, client-side buffering is complex to implement because there are many aspects to consider when designing it for your solution.

For example, suppose a sensor collects data and publishes it to different topics through an MQTT broker. If the server goes down and the broker becomes unreachable, the readings the sensor gathers during the downtime are never published and never reach the subscribing clients. As a result, your system loses all data from the moment the connection was lost until it is restored.

If your system used MQTT High Availability, it would continue to operate seamlessly, since switching from one node to another takes only several seconds. As a result, the information gathered between those two points in time would still be available.
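The effect of outage length on a constrained client can be illustrated with a small simulation. This is a sketch with hypothetical numbers: it assumes a sensor that takes one reading per second and a client buffer that holds at most 60 readings, dropping the oldest when full. The function name `readings_lost` is made up for this example.

```python
from collections import deque


def readings_lost(outage_s: int, capacity: int, interval_s: int = 1) -> int:
    """Count readings dropped by a bounded client buffer during an outage.

    The sensor takes one reading every interval_s seconds; the buffer keeps
    only the newest `capacity` readings and silently drops older ones.
    """
    buffer = deque(maxlen=capacity)
    produced = 0
    for t in range(0, outage_s, interval_s):
        buffer.append(t)  # a full deque discards its oldest entry
        produced += 1
    return produced - len(buffer)


print(readings_lost(outage_s=5, capacity=60))     # failover of a few seconds: 0
print(readings_lost(outage_s=3600, capacity=60))  # hour-long outage: 3540
```

With a failover that lasts only seconds, even a small buffer covers the gap. With an hour-long single-node outage, almost everything the sensor measured is gone.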

How does MQTT High Availability differ from single-node systems?

Clients see an MQTT High Availability system as a single-node broker. A client has no idea whether it is connected to node 1 or node 2; it only knows that it is connected to a broker.

For performance reasons, single-node brokers write their current status and queues to disk only a few times per hour. If a single-node broker fails, any changes or queue updates made after the last disk write and before the next one are lost.

The Pro Edition for Eclipse Mosquitto MQTT broker with high availability copes with such situations and avoids data loss through constant synchronization. Note that data synchronization in the MQTT High Availability setup strictly conforms to the OASIS MQTT Version 5.0 specification.
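To get a feel for the single-node loss window described above, a rough upper bound is the message rate multiplied by the time between two disk writes. The sketch below uses hypothetical figures: it reads "a few times per hour" as three writes per hour and assumes 50 state-changing messages per second. Neither number comes from the product documentation.

```python
def worst_case_loss(messages_per_second: float, writes_per_hour: int) -> float:
    """Upper bound on updates lost if the broker fails just before a disk write."""
    seconds_between_writes = 3600 / writes_per_hour
    return messages_per_second * seconds_between_writes


# Hypothetical workload: 50 messages/s, state written to disk 3 times per hour.
print(worst_case_loss(messages_per_second=50, writes_per_hour=3))  # 60000.0
```

In a synchronized cluster, this window shrinks from the disk write interval to whatever has not yet been replicated at the moment of failure.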

Single-node systems also depend heavily on the stability of the underlying server, hardware, and network connectivity. As soon as one of these components fails, the single-node system is no longer reachable. Communication between clients is then interrupted until the failure is fixed. Depending on the type of malfunction, the fix can take anywhere from seconds to, in extreme cases, days while replacement parts arrive.

In contrast, the nodes of an MQTT High Availability system typically run on different servers. They can even be placed in physically different geographical regions. Hence, if the network, a server, the MQTT broker node itself, or any other component fails, operation automatically switches to one of the follower nodes within seconds. This is how Pro Mosquitto MQTT High Availability works, and it is how the MQTT High Availability setup increases overall system availability while keeping your IoT infrastructure running smoothly.

MQTT High Availability extra questions

If you have any further questions, feel free to contact us.

Although you can run an MQTT cluster with only one load balancer on board, this setup potentially creates a single point of failure (SPOF). Instead, the default deployment of the Pro Edition for Eclipse Mosquitto MQTT High Availability is configured to operate using three load balancers available under three different IP addresses. 

Our team can also help you set up your system with a floating IP configuration. In this case, external clients contact only a single IP address. The floating IP service, in turn, internally re-routes the requests to the three load balancer IPs. To find out whether we can enable this configuration in your environment, we first have to check your setup and which company acts as your hosting provider.

Yes, you can. However, typically, we need to double-check if such a configuration would work. In this case, we look together with you at the overall system configuration and determine the feasibility. If necessary, our team is ready to support you during the system specification, implementation, and testing stages and provide other professional services on request. 
