So what is MQTT?

This article is for those of you who have already passed the “why MQTT” station, and have a preexisting commitment to MQTT, or have a set of requirements in hand that will best be satisfied by the protocol that mqtt.org brazenly headlines as: “The Standard for IoT Messaging”.

IoT—Internet of Things—things:

typically have small microcontrollers with limited processing power and memory;
often are behind firewalls and NAT routers hiding them from the internet;
may move around and have spotty network connectivity; and
are many, and then some.

What MQTT allows these IoT things to do is:

to have a very lightweight implementation of the binary wire protocol,
that publishes messages to an MQTT broker (e.g., FlashMQ),
which then passes the messages on to subscribers, and
simultaneously brokers for thousands more devices (millions, in the case of FlashMQ).

Lightweight

MQTT is a straightforward binary protocol that doesn't need a lot of lines of code (LoC) to implement. FlashMQ's MqttPacket class (as of FlashMQ version 1.5.0) is implemented in 1949 lines of C++ code. That includes a whole bunch of dedicated test and debug code. And the same MqttPacket is used for both parsing and generating MQTT packets.

But, as a server-side implementation of the MQTT protocol, FlashMQ has heftier requirements for MQTT packet handling than the typical MQTT client app. Therefore, MQTT client libraries can be more lightweight. The portable Eclipse Paho C client library for MQTT exemplifies the latter.

Naturally, with MQTT being “The Standard for IoT Messaging”, client libraries don't just exist in C. For example, the popular MQTT.js provides a JavaScript implementation, suitable both for in-browser use and for running in Node.js.
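To give a feel for how little code the wire format demands, here is a minimal Python sketch (not taken from FlashMQ, Paho or MQTT.js) that builds an MQTT 3.1.1 PUBLISH packet with QoS 0. The function names are made up for this illustration, and MQTT 5 would additionally need a properties field after the topic name.

```python
def encode_remaining_length(n: int) -> bytes:
    """Encode n as an MQTT variable byte integer (7 data bits per byte)."""
    if not 0 <= n <= 268_435_455:
        raise ValueError("remaining length out of range")
    out = bytearray()
    while True:
        n, digit = divmod(n, 128)
        if n > 0:
            digit |= 0x80  # continuation bit: more length bytes follow
        out.append(digit)
        if n == 0:
            return bytes(out)


def encode_utf8_string(s: str) -> bytes:
    """Encode a string with the 2-byte big-endian length prefix MQTT uses."""
    data = s.encode("utf-8")
    if len(data) > 0xFFFF:
        raise ValueError("string too long for MQTT")
    return len(data).to_bytes(2, "big") + data


def encode_publish_qos0(topic: str, payload: bytes) -> bytes:
    """Build an MQTT 3.1.1 PUBLISH packet: QoS 0, no DUP, no RETAIN."""
    body = encode_utf8_string(topic) + payload  # QoS 0 has no packet identifier
    fixed_header = bytes([0x30]) + encode_remaining_length(len(body))
    return fixed_header + body


packet = encode_publish_qos0("my/thing", b"on")
print(packet.hex(" "))  # 30 0c 00 08 6d 79 2f 74 68 69 6e 67 6f 6e
```

The whole message costs 14 bytes on the wire, of which only 4 are protocol overhead beyond the topic name and the payload.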

Brokered

Most IoT devices are behind firewalls, and, if they connect to the Internet, do so through a NAT router, meaning: these devices can easily make outgoing connections, but the firewall / NAT router will not permit incoming connections to said devices. Even if, in a hypothetical fully-IPv6 future, every IoT device were publicly accessible through its own globally unique IPv6 address, we would still need every such IoT device to always be up-to-date (and updatable), and we would need to replace our NAT routers with firewalls that would somehow magically be as easy to set up as they were secure.

Meanwhile, in the real world, IoT devices are thankfully not normally directly addressable via the internet. But, what use is an Internet of Things thing if you can't communicate with your thing from … the Internet? Your phone is not always near your device, for example, but wouldn't it be great if you could see the current status of your device whenever and wherever your phone is connected to the internet?

Enter brokering. MQTT-speaking things connect to an MQTT broker. The broker is positioned somewhere more accessible to other MQTT clients (such as your phone).

               firewall
                   |
                   |
+---------------+  | MQTT   +-------------+     MQTT    +---------------+
|     thing     |<--------->| MQTT broker |<----------->|     device    |
| (MQTT client) |  |        +-------------+             | (MQTT client) |
+---------------+  |                                    +---------------+
                   |
                   |

Brokering is not only good for passing messages through a firewall. Brokering also eliminates the need for any server software on IoT things, which in turn reduces their storage, memory and computational needs.

And because MQTT is a popular standard—I mean, the standard—there are plenty of MQTT client applications that can subscribe to MQTT topics on a broker and store the received messages in, for example, a time-series database; or aggregate the messages on a dashboard.

Publish-subscribe

There are two basic roles that an MQTT client can play, and, importantly, the same client can play both roles:

An MQTT publisher publishes messages to the broker.
An MQTT subscriber receives messages from the broker.

Messages are always published to a specific topic. A topic name looks a lot like a Unix filesystem path. my/thing would be a valid topic name. (Yeah, the full thing is called a “name” by the spec, though it looks like a path consisting of names in the eyes of Unix nerds like myself; the individual parts are called “topic levels”.)

Subscribers subscribe to specific paths, using filters like my/thing, or to a bunch of topics using a single-level (+) or multi-level (#) wildcard. The filter my/# would match my and every topic level below my, including my/something/something/deep. The filter my/+ would match my/thing and my/other-thing, but not my/level/deeper/thing.
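The matching rules above can be sketched in a few lines of Python. This is an illustrative sketch, not the algorithm of FlashMQ or any particular broker; the function name topic_matches is made up, and the filter is assumed to be well-formed (a # only as the last level).

```python
def topic_matches(topic_filter: str, topic_name: str) -> bool:
    """Return True if topic_name matches topic_filter (MQTT wildcard rules)."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic_name.split("/")

    # Topic names starting with "$" (such as $SYS/...) are not matched by
    # filters whose first level is a wildcard.
    if topic_name.startswith("$") and filter_levels[0] in ("+", "#"):
        return False

    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: matches the parent level and everything below.
            return True
        if i >= len(topic_levels):
            return False  # the filter is deeper than the topic name
        if level != "+" and level != topic_levels[i]:
            return False  # a literal level that differs ("+" matches any one level)
    # Without "#", both must have exactly the same number of levels.
    return len(filter_levels) == len(topic_levels)


print(topic_matches("my/#", "my/something/something/deep"))  # True
print(topic_matches("my/+", "my/level/deeper/thing"))        # False
```

Real brokers typically keep subscriptions in a tree keyed by topic level instead of testing every filter against every message, but the matching semantics are the same.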

The number of publishers and subscribers doesn't matter to the protocol. You can have a million publishers publishing to the same topic with just a handful of subscribers, or the other way around; MQTT itself has no opinion on this. Subscribers and publishers need no awareness of each other. They just talk to the broker, sending messages to it or receiving messages from it.

                          MQTT client
   MQTT publisher ---        |    -----MQTT client
                     \       |  /
MQTT client -------- MQTT broker---- MQTT publisher
                     /       |
      MQTT client ---        |
                        MQTT client
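This decoupling can be illustrated with a toy, in-memory stand-in for a broker. The TinyBroker class below is hypothetical and deliberately naive: it has no network, no wildcards and no QoS, and it only shows that publishers and subscribers share nothing but a topic name.

```python
from collections import defaultdict


class TinyBroker:
    """Hypothetical in-memory stand-in for an MQTT broker (exact topic names only)."""

    def __init__(self):
        # topic name -> list of callbacks registered by subscribers
        self._subscriptions = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscriptions[topic].append(callback)

    def publish(self, topic, payload):
        """Pass the message to every subscriber of topic; return how many got it."""
        callbacks = self._subscriptions.get(topic, [])
        for callback in callbacks:
            callback(topic, payload)
        return len(callbacks)


broker = TinyBroker()
dashboard, database = [], []

# Two independent subscribers that know nothing about each other.
broker.subscribe("my/thing", lambda topic, payload: dashboard.append(payload))
broker.subscribe("my/thing", lambda topic, payload: database.append(payload))

# The publisher only knows the broker and the topic name.
reached = broker.publish("my/thing", b"21.5")
unheard = broker.publish("my/other-thing", b"hello")  # nobody is subscribed

print(reached, unheard)  # 2 0
```

The publisher gets no error when nobody is listening; in real MQTT it does not even get the subscriber count that this toy returns. Neither side ever learns who, if anyone, is on the other end.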

Herein also lies an often overlooked problem, which I will address in a later article.

Network fault-tolerant

IoT devices must be assumed not to have 24/7 Internet connectivity. (In fact, defensive engineers don't ever assume 100% connectivity for anything, but I digress.) MQTT, as a protocol, is very much okay with individual messages not being delivered. This laxity comes in gradations, called “Quality of Service (QoS) levels”:

QoS 0 messages are delivered at most once. The receiver (which can be either the broker or the subscriber) does not acknowledge the receipt of QoS 0 messages. Therefore there is no guarantee that QoS 0 messages are even delivered once.
QoS 1 guarantees that the message is received at least once by each client subscribed to the topic that the message has been published to.
QoS 2 guarantees that the message is received exactly once by each client subscribed to the topic that the message has been published to.
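The difference between “at most once” and “at least once” can be made concrete with a small simulation. This is a sketch under simplifying assumptions, not real MQTT code: the deliver function and its scripted link are invented for illustration, retransmission is modelled as an immediate retry (in practice it typically happens when a session is resumed after a reconnect), and the QoS 2 handshake is left out.

```python
def deliver(message, qos, link_script):
    """Return the copies of message that the receiver hands to the application.

    link_script scripts the fate of each transmission attempt:
      "lost"     - the PUBLISH never arrives
      "ack_lost" - the PUBLISH arrives, but the acknowledgement is lost
      "ok"       - the PUBLISH arrives and (for QoS 1) the PUBACK comes back
    """
    if qos not in (0, 1):
        raise ValueError("this sketch only models QoS 0 and QoS 1")
    received = []
    for outcome in link_script:
        if outcome != "lost":
            received.append(message)  # the receiver got a copy
        if qos == 0:
            break  # fire and forget: one attempt, nothing to wait for
        if outcome == "ok":
            break  # PUBACK received: the sender stops retrying
    return received


print(deliver("t=21", 0, ["lost"]))                    # []
print(deliver("t=21", 1, ["lost", "ack_lost", "ok"]))  # ['t=21', 't=21']
```

With QoS 0 the message is simply gone. With QoS 1 the lost acknowledgement makes the sender retry, so the receiver sees the message twice and has to cope with duplicates itself.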

It is important to realize that there are plenty of scenarios in which QoS 1 or QoS 2 are not worth the overhead or may even be harmful (when there's no need to have a full backlog of real-time metrics in a time-series database, for example).

It is also important to realize that you have to take these QoS levels with a grain of salt, as described in the article Shortcomings of MQTT: QoS.

Payload-agnostic

MQTT makes no prescriptions about the payload of MQTT messages. There are some standards built on top of MQTT that add specific semantics: proposals such as SemSub, Sparkplug B, and OPC UA. But the MQTT spec leaves developers free to implement whichever payload they please.

For example, shortly after the release of the first public FlashMQ version, Wiebe streamed video over MQTT to demonstrate the relative speed of FlashMQ.
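At the more everyday end of the spectrum, the same sensor reading can travel as human-readable JSON or as a compact binary value; to MQTT both are just bytes. The sketch below uses a made-up reading and field names purely for illustration.

```python
import json
import struct

# Hypothetical sensor reading; the field names are invented for this example.
reading = {"sensor": "kitchen", "celsius": 21.5}

# Option 1: a self-describing JSON payload.
json_payload = json.dumps(reading, separators=(",", ":")).encode("utf-8")

# Option 2: just the temperature as a little-endian 32-bit float.
binary_payload = struct.pack("<f", reading["celsius"])

print(json_payload)         # b'{"sensor":"kitchen","celsius":21.5}'
print(len(binary_payload))  # 4

# The subscriber has to know which encoding the publisher chose.
decoded_celsius = struct.unpack("<f", binary_payload)[0]
```

Because the broker never inspects the payload, publisher and subscriber must agree on the encoding out of band, for instance through a topic naming convention or one of the higher-level standards mentioned above.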