The term Internet of Things (IoT) is used in a variety of contexts where it is often misunderstood, either because it could be replaced by other terms that describe the subject much better, or because the definitions in use are not consistent with each other. Let me remind you how this term came to life.
The phrase “Internet of Things” started its life as the title of a presentation made in 1999 that aimed to explain a new idea: radio frequency identification (RFID) in the context of supply chain performance. Of course, this doesn’t give anyone the right to control how others use the phrase, but my point is that a precise definition of the term is important for working together on common rules, architecture, solutions, requirements, capabilities, limitations, etc. In practice, a common definition makes it possible to check the capabilities of a selected technology, solution or product against the requirements of an application entitled to use this term.
The main goal of this article is to contribute to the community work aimed at distinguishing the features of the IoT application domain. The main challenge is to narrow the definition enough to make it unambiguous and meaningful.
In most publications I know, the term IoT can simply be replaced with one of the following well-known terms:
- SCADA – Supervisory Control and Data Acquisition
- DCS – Distributed Control Systems
and the text will still be perfectly OK. In this context the term “sensor” plays the role of the “thing” and the network of nodes is a synonym for the Internet. Usually no one cares whether we are talking about a temperature sensor in a bedroom or in a boiler drum in a power plant, except that in the case of a power plant the prefix Industrial is sometimes added. To make our life easier let’s forget about the “I” (Industrial) prefix altogether, because it doesn’t change the most important features of the application domain.
To illustrate the further discussion let me provide examples that can be recognized as SCADA/DCS and IoT applications respectively.
Let’s assume that an OPC UA Server exposes 123456 values representing the crude oil refining process. Using SCADA on top of this server we can monitor and manually control the process. Using DCS it is possible to implement a supervisory control algorithm to provide macro optimization.
For this scenario we can apply the following workflow:
- The server instantiates an OPC UA Information Model for the crude oil refining process.
- All plant floor devices equipped with sensors fetch data representing the current process state (for example the flow meter #A-4321, which supports the Modbus RTU communication protocol) and wait for data requests coming from the server communication engine.
- The communication engine embedded in the server polls all plant floor devices, including the flow meter #A-4321, to retrieve the current process state.
- Finally, the OPC UA Server exposes the data (updates the value attributes of the relevant variable nodes, e.g. the #A-4321 object representing the virtual flow meter #A-4321) in its Address Space Management component (i.e. in the Address Space instantiated according to the Information Model of the crude oil refining process).
- OPC UA Clients connected to this server are updated in a standardized way.
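The workflow above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: `FlowMeter` and `PollingServer` are hypothetical names, the device is simulated in memory instead of being reached over Modbus RTU, and a plain dictionary stands in for the OPC UA Address Space.

```python
from dataclasses import dataclass, field


@dataclass
class FlowMeter:
    """Hypothetical stand-in for a plant floor device (e.g. a Modbus RTU slave)."""
    address: str
    reading: float

    def read(self) -> float:
        # A real device would answer a request over the field bus here.
        return self.reading


@dataclass
class PollingServer:
    """Toy server: polls known devices and exposes values in an address space."""
    devices: list[FlowMeter]
    address_space: dict[str, float] = field(default_factory=dict)

    def poll_once(self) -> None:
        # The communication engine asks each device in turn for its state.
        for device in self.devices:
            self.address_space[device.address] = device.read()


meter = FlowMeter(address="A-4321", reading=12.5)
server = PollingServer(devices=[meter])
server.poll_once()
print(server.address_space)
```

Note that the server must be given the complete list of devices before it can poll anything, which is exactly the property discussed later in this article.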
Note that in this scenario the OPC UA Client and OPC UA Server can establish a connection over the Internet using an existing transport protocol, e.g. HTTP, HTTPS, TCP, UDP, AMQP. The choice of transport protocol between the client and server is negotiated and limited by the OPC UA specification. To learn more, read OPC Unified Architecture – Main Technological Features.
You can use the server I exposed many years ago to test this scenario. Instructions on how to connect are included in the following document:
Do you think a box (or even a pack) of cigarettes could be the “thing”? It has a bar-code, so it is a source of data. Is it the sensor? NO, because the bar-code reader (industrial scanner) is the sensor in this case. Can we recognize the bar-code reader as the “thing”? Again the answer is NO if the goal is to provide a GLOBAL cigarette tracking system. The same applies to drugs, for example. Is it an IoT solution? My answer is YES, no doubt. Is it SCADA/DCS? The answer is NO, because the server (the underlying communication engine) cannot poll all the places spread over the world where the box could appear. There are two reasons:
- It is impractical or even impossible to manage such a huge set of addresses.
- The server doesn’t know when to poll because the relevant data appear as an event instead of a process state value.
In addition, assuming that the server is interested in, or even only allowed to collect, the product data of one vendor, not all codes fetched by a bar-code reader are relevant to the server.
Is the “thing” smart? I don’t think we can call a bar-code something smart. Is it controllable? NO.
The most interesting observation is that we can recognize this use case as an IoT application although we have not mentioned OPC, AMQP, MQTT, SOA, Internet, WI-FI, wireless, Modbus, etc. at all; we have only said that we have important mobile data and that the solution is globally scoped. This is good, because we can check the capabilities of the available technologies against the requirements of this application. As I have said, the selected communication technology is not the goal, but we must know how it scales to applications like this.
Now, let’s replace the word GLOBAL with LOCAL (for example a farm of cash desks in a shop) and the same application is no longer an IoT deployment, is it? This is true even if the cash desks are interconnected using the IP protocol (e.g. an Intranet)!
Sometimes it is stressed that any IoT application must guarantee a high level of robustness, but the requirement for sensor and data robustness applies to many kinds of applications, e.g. controlling an airplane engine during flight. The same engine could be monitored and tracked after landing at any airport by using the local WI-FI to upload archival data to a central advanced analytics system (like the bar-code on the cigarette box). Is it IoT? It isn’t during the flight, although the solution is life-critical. After landing it is IoT, but the reliability of the data and of the data transfer is not so important, is it?
My proposal of the Internet of Things definition is as follows:
Internet of Things is all about:
- mobile data fetching – how to gather the data from mobile devices (things);
- mobile data subscription – how to transfer the data over the Internet to a place where it could be processed;
- mobile data processing – how to integrate the data into a selected application to improve process behavioral performance.
Data fetching is related to a variety of last mile communication technologies, for example RFID, WI-FI, VHF, Bluetooth, etc. Subscription could be supported using messaging systems, e.g. AMQP, MQTT, etc. A good candidate for leveraging data consumption is for example OPC Unified Architecture.
Referring to the previous examples, the data fetching process looks very similar in both cases: we have a data source and a sensor coupled together at some point in time, with the sensor responsible for sampling the data. We cannot compare the examples from the application functionality point of view, because no requirements are defined at all; only very general descriptions are provided. It looks as if application capabilities are not relevant to the definition of the term. This is the reason why the terms SCADA, DCS and IoT are used interchangeably, neglecting the fundamental differences between the following data acquisition patterns:
- Data polling – continuous checking of the sensors to see what state they are in, usually in multipoint or multidrop communication (a communication engine with multiple devices attached that share the same line) by sending a message to each device, one at a time, asking each to respond and send new data.
- Data subscription – senders of messages containing the process data fetched by the sensor, called publishers, do not prepare the messages to be sent directly to specific receivers, called subscribers, but instead they categorize published messages into topics without knowledge of which subscribers, if any, may receive the message. Similarly, subscribers express interest in one or more topics and only receive messages that are of interest, without knowledge of which publishers, if any, there are.
It is worth stressing that in both cases reusability of the fetched data is assured. In the data polling scenario, many clients may connect at the same time to the server coupled with the communication engine. In the data subscription scenario the publisher is responsible for multicasting the data to all attached subscribers, either directly or indirectly using a broker.
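To make the data subscription pattern more tangible, here is a minimal in-process sketch of topic-based routing. The `Broker` class and the topic names are hypothetical and only illustrate the idea; a real deployment would rely on a messaging system such as AMQP or MQTT.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[str, str], None]


class Broker:
    """Toy in-process broker: routes messages by topic only."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: str) -> None:
        # The publisher names a topic, never a receiver.
        for handler in list(self._subscribers.get(topic, [])):
            handler(topic, payload)


received: list[tuple[str, str]] = []
broker = Broker()
broker.subscribe("barcode/vendor-x", lambda t, p: received.append((t, p)))

broker.publish("barcode/vendor-x", "code-0001")  # delivered to the subscriber
broker.publish("barcode/vendor-y", "code-0002")  # nobody is interested, dropped
```

The publisher and the subscriber never reference each other; the topic is the only thing they share.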
To deploy the IoT scenario:
- the mobile data must be sent over the Internet (or Intranet) using messages;
- the payload of these messages is consumed asynchronously by a server (e.g. OPC UA Server) responsible for exposing it in an address space;
- the applications (e.g. OPC UA client) process the exposed data to reach selected key performance indicators (KPI).
It is required that the message payload is formatted in a standardized way, so that it can be produced at the fetching site and meaningfully consumed by the analytic application (e.g. OPC UA client).
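As an illustration of what an agreed payload format gives us, the sketch below encodes a reading on the fetching side and decodes it on the consuming side. The `Reading` type and its field names are my assumptions made for this example only, and JSON is used here purely for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Reading:
    """Hypothetical message payload; the field names are illustrative only."""
    source_id: str
    timestamp: str  # ISO 8601, UTC
    value: float
    unit: str


def encode(reading: Reading) -> bytes:
    # Sorted keys give a stable byte representation on the fetching side.
    return json.dumps(asdict(reading), sort_keys=True).encode("utf-8")


def decode(message: bytes) -> Reading:
    # Raises TypeError if agreed fields are missing or unknown ones appear.
    return Reading(**json.loads(message.decode("utf-8")))


sent = Reading("A-4321", "2020-01-01T00:00:00Z", 12.5, "m3/h")
restored = decode(encode(sent))
print(restored)
```

Because both sides share the same structure, the consumer can reject a malformed message instead of silently misinterpreting it.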
Data Acquisition patterns
In the above discussion the application functionality has been excluded as a factor that can be used to recognize IoT applications. Now let’s analyze the impact of the data acquisition pattern on the application behavioral model.
Using data polling we must deal with the synchronous data acquisition pattern. In this case the application must follow the interactive behavioral model, because it actively polls the data source for more information by pulling data from a sequence that represents the process state over time. Such behavior is represented by an iterator, which is used to iterate through a data stream. The application is active in the data retrieval process: it controls the pace of the retrieval by sending request messages at its own convenience. This enumeration pattern is synchronous, which means that the application might be blocked while polling the data source. Such a polling pattern is similar to visiting a bookshop and checking out a book. After you are done with the book, you pay another visit to check out another one. If the book is not available you must wait, but you may read what you selected.
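The interactive model maps naturally onto a Python iterator. In the sketch below `poll_source` is a hypothetical data source fed from a fixed list of samples; in a real system each step would be a request/response round trip to a device.

```python
from typing import Iterator


def poll_source(samples: list[float]) -> Iterator[float]:
    """Hypothetical data source exposed as an iterator over process state samples."""
    for sample in samples:
        # In a real system this is a request/response round trip,
        # so the caller may block here until the device answers.
        yield sample


stream = poll_source([20.1, 20.4, 20.9])

first = next(stream)      # the application decides when to ask
second = next(stream)     # ...and when to ask again
remaining = list(stream)  # or it can drain whatever is left
```

Nothing happens until the application calls `next`, so the pace of the retrieval is entirely in its hands.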
On the other hand, in the reactive behavioral model, the application is offered more information by subscribing to a data stream, and any update is handed to it by the source. The application is passive in the data retrieval process: apart from subscribing to the source data stream, it does not actively poll the source, but merely reacts to the data being pushed to it. In this case, the application will not be blocked while waiting for the source to update. This is the push pattern employed by IoT. It is similar to joining a book club in which you register your interest in a particular genre, and books that match your interest are automatically sent to you as they are published. You do not need to wait, but you must read what you get. Employing a push pattern is helpful in many scenarios, especially if data is available asynchronously as events.
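The reactive model can be sketched as a source that calls back into the application. `EventSource` is a hypothetical class written for this example; no broker or topics are involved here, only the inversion of control.

```python
from typing import Callable


class EventSource:
    """Hypothetical source that pushes each new value to registered handlers."""

    def __init__(self) -> None:
        self._handlers: list[Callable[[float], None]] = []

    def subscribe(self, handler: Callable[[float], None]) -> None:
        self._handlers.append(handler)

    def emit(self, value: float) -> None:
        # The source, not the application, decides when data arrives.
        for handler in self._handlers:
            handler(value)


seen: list[float] = []
source = EventSource()
source.subscribe(seen.append)  # the only active step taken by the application

source.emit(20.1)
source.emit(20.9)
```

After subscribing, the application never asks for data; it only handles whatever the source pushes to it.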
The push model implemented by IoT requires additional resources responsible for multicasting the pushed data to all subscribers. This functionality may be provided by middleware fulfilling the broker role or by the network infrastructure, e.g. IP multicast.
The fundamental differences between the interactive and reactive behavioral models must obviously impact the final application functionality, for example:
- process controllability;
- data destination discoverability.
Using data polling, the request message may also contain data to control the state of a selected actuator. In this case the response message usually contains a positive or negative acknowledgement that can be used by the application as a condition for selecting further activity. For example, in case of a communication disruption the request message may be resent. In case of an actuator failure an alarm may be raised. In the data pushing scenario it is hard to implement remote control functionality in a similar way, because the communication path is like a one-way route.
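The request/acknowledgement logic described above might look as follows. This is a sketch under my own assumptions: the `Transport` callable, the `ActuatorAlarm` exception, the `flaky_link` helper and the address `V-100` are all hypothetical, and the link disruption is simulated with `TimeoutError`.

```python
from typing import Callable

# Hypothetical transport: returns True (ACK) or False (NAK),
# and raises TimeoutError when the link is disrupted.
Transport = Callable[[str, float], bool]


class ActuatorAlarm(Exception):
    """Raised when the device answers with a negative acknowledgement."""


def write_setpoint(send: Transport, address: str, value: float, retries: int = 3) -> int:
    """Send a control request; return the number of attempts used."""
    for attempt in range(1, retries + 1):
        try:
            acknowledged = send(address, value)
        except TimeoutError:
            continue  # communication disruption: resend the request
        if not acknowledged:
            raise ActuatorAlarm(f"actuator {address} rejected value {value}")
        return attempt
    raise TimeoutError(f"no response from {address} after {retries} attempts")


def flaky_link() -> Transport:
    """Build a fake transport that times out once and then acknowledges."""
    calls = {"n": 0}

    def send(address: str, value: float) -> bool:
        calls["n"] += 1
        if calls["n"] == 1:
            raise TimeoutError
        return True

    return send


attempts = write_setpoint(flaky_link(), "V-100", 42.0)
print(attempts)
```

The response is what makes the retry and the alarm possible; a one-way push channel offers nothing equivalent to branch on.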
In the interactive behavioral model the communication engine must have all information, including addressing, in advance to properly prepare request messages. The messages must be self-contained to be used by the network routing mechanism. In the reactive behavioral model the application doesn’t know the source of data in advance. Therefore it is the sensor’s responsibility to format the message and push it to an appropriate distribution channel. In this case the messages are not self-contained, because the information carried by them is only indirectly used by the routing mechanism.
In the data polling scenario, the configuration may need to be modified after a sensor is replaced if the old and new data sources are not isomorphic from the point of view of the data acquisition process. In the data pushing scenario, on the other hand, replacing or modifying a data source must not require any modification of the application configuration.
From the above discussion we can derive the following dependency models:
Interactive applications (e.g. SCADA, DCS):
- The application engine depends on the synchronous data acquisition engine – data is observed as a stream of entities.
- The acquisition engine depends on the process devices (sensors or actuators), which it must know in advance.
Reactive applications (e.g. IoT):
- The application engine depends on the asynchronous data acquisition engine – data is observed as a stream of events.
- The data source depends on the data distribution channel.
- The asynchronous acquisition engine depends on the distribution channel.
- The data source and acquisition engine are associated by a mechanism proprietary to the data distribution channel.
There is a well-suited open source project for starting to prototype this scenario.