Motivation

Some time ago, I participated to a W3C workshop about Web Payments, in which I was a panelist on the tough subject of Identity, Privacy and Security.

In my talk, I suggested that the IoT would be a great opportunity to secure payments, or even more. But before writing something specific on that subject, I felt that I had to get more into the IoT protocols and standards details.

So here are my thoughts about one widely available “ecosystem” of the IoT: MQTT.

Why MQTT

My experience with IoT protocols is limited, but I have a lifetime interest in embedded devices talking to each other. It all started in the 1980’s with personal computers crawling at a one to eight MHz of CPU power, featuring a few kilobytes of memory, and connected to each other using unreliable modems, able to deliver data at a blazing speed of 2400 bauds (or bits per second).

We find ourselves in the same situation now, except that wearables and sensors have replaced computers, and analog modems have been replaced by radio networks.

From the IoT protocols, that have all their advantages and drawbacks, I chose to study MQTT first, as it is the simplest one of the popular contenders in this field. Also it seems to me that it shines in addressing a few specific topics, which are message transportation and delivery. It comes in two “flavors”, which are MQTT and MQTT-SN, designed for non-IP networks.

For the purpose of this study, I will not address differences between MQTT and MQTT-SN, and use “MQTT” as a generic term for both.

The good

First things that are ultimately important are the publish/subscribe model, and the client/server model (more on this one later).

To properly address the variety of devices that will belong to the IoT, it seems to me that you cannot use one-size-fits-all architectures and requirements.

You cannot ask an Arduino board’s 8-bit micro-controller to use its scarce capabilities to handle a full stack of first-class security and authentication services. MQTT is a client/server architecture, where the server is asked to address most of the issues that might occur, and it’s a perfectly sane design decision.

Also, even if the current specification doesn’t explicitly address the topic of bridges (chains of MQTT brokers arranged as trees), one of the reference implementations of a MQTT broker, namely Mosquitto, provides this feature. This seems to be the right way to go: make good use of your scarce resources at the leafs of the tree, and enforce availability and security at the right level, for each step where they become an issue.

Apart for the implied architecture, here are some things that, in my opinion, really shine in MQTT design:

  • Failure is addressed from the start, with the “Will” flags, and the “Clean Session” flag, the first ones stating what should happen in the case of a network failure since the start, the second one stating what should be done when the reconnection happens.
  • The publish/subscribe model address more use cases than a peer-to-peer model.
  • The QoS levels, from “Fire and Forget” to “Unique, confirmed delivery”.

The bad

Not much in this field. The Topics naming and filtering rules are a bit confusing, but it may come from me.

The ugly

Security. Again. Authentication enforced by the use of Usernames and Passwords ? No, just no. Worse, the MQTT 3.1 specification states that “It is recommended that passwords are kept to 12 characters or fewer, but it is not required”. It has been removed in the 3.1.1 version that is currently being written, but it still leaves a bad taste in my mouth that such things could have seen the light of the day.

Fortunately, I’m not the only one that see an issue here:

So, here is my piece of advice about addressing security in MQTT: don’t.

MQTT is a specification for transportation and delivery of messages. Just keep it this way. Of course, security must be addressed somehow, but please, please, make it a subject on its own.

Let’s take an example of some well-known protocols: SMTP, POP3, HTTP and FTP. Two of them are ruling the Internet still now (for the good and the bad): HTTP and SMTP, whereas the two other ones, FTP and POP3, have been superseded, and any sensible IT manager would just refuse to support them anymore (unfortunately, IT guys are not always listened to).

Guess which protocols tried to address security, and failed? To be fair, they were perfectly acceptable at the time of their writing, but things are not meant to last forever, and when it comes to subjects such as identity, privacy and security, any small issue quickly becomes unacceptable.

Fortunately, it seems that it goes in the right direction: the security-related topics are placed into a dedicated, Non-normative chapter. Phew!

An opportunity for decentralization

Why it does matter

Security has to be addressed, of course. But beyond protecting data with cryptography, this is the subject of Identity that is the main concern, because it is all about trust.

It has been written and said so much about censorship, privacy and surveillance these last months, that I won’t bother trying to explain why relying on one or many central organizations is an issue.

The IoT is yet to be built, and we have the opportunity to define a new set of standards. That’s not something quite common, so why not take our chance at the maximum, and try to address topics such as decentralization?

The Web is a decentralized platform by nature. However, recent issues have shown that there are in fact too few companies and technologies on which it relies on.

Why not take a step forward, and enforce decentralization at the standards level?

This may sound weird, especially due to the client/server nature of MQTT, but I think it can (and should) be done.

The fate of the trusted provider

MQTT relies on providers. Of course, you’re perfectly free to choose one, which however you’ll have to trust and hope it doesn’t go out of business.

You may know the WebID standard. While perfectly suitable for the addressed topic, it somewhat fails, as there hasn’t been a lot of companies/individuals that provided and maintained providers.

Mozilla’s Persona tries to address related issues, also relies on “identity providers”, and fails as well.

A distributed future ?

“The future has already arrived. It’s just not evenly distributed yet.” - William Gibson

I think most people agree with the idea of decentralization is a good thing. Nobody has the desire of being put under control of an organization. However, it feels better than a lack of service. And building an efficient distributed system where the components are not under control is an incredibly hard task.

There have been many attempts, and some of them worked. I’m thinking about Bittorrent and cryptocurrency schemes such as Bitcoin.

I swear I’ve been looking for other ways to express why these distributed networks actually work, i.e. why people actually use them at a large scale, but I couldn’t find a better word than greed. Yes, greed. What makes the incentive-based systems actually work.

With Bittorrent, your reward is a better download speed when you contribute to upload as well, with Bitcoin, verifying other people’s transactions to ensure the blockchain’s integrity rewards you with some coins.

It may come a time when solely protecting other people’s privacy or anonymity will be a reward by itself, but meanwhile, my bet remains on greed as well. :)

EDIT: This article by Jon Evans is a quite close match of what I’m trying to explain (I wish I had found it earlier).

MQTT + DHT, paving the way for the #DIoT

“Talk is cheap. Show me the code.” - Linus Torvalds

Alright, then. I’ve started this fork of Mosquitto that uses some work of Juliusz Chroboczek on Bittorrent. It is intended to be a proof-of-concept and not much has been done yet, as I’ve been more into thinking on how to address the different issues.

  • The first issue is having a first network (and associated protocols) for peers to know the existence of others. Kademlia seems perfectly fine for this subject. Hence the DHT stuff.
  • The second issue is to provide an alternative protocol to Bittorrent, as we’re not dealing with the same kind of objects: Bittorrent deals mostly with static files, whereas we’ll be dealing with short-lived messages. There is also the issue of topic searching and matching that should be efficient. I’m thinking about different hash functions to optimize it.

Apart from this work, I’m considering another way to handle this: add a MQTT broker to the Twister project, that addresses many common subjects… After all, MQTT could be seen as some kind of microblogging protocol. :)

Stay tuned!



blog comments powered by Disqus

Published

08 May 2014