This article is part of IoT Architecture Series – https://navveenbalani.dev/index.php/articles/internet-of-things-architecture-components-and-stack-view/
Large manufacturers have long used automation and smart technology to streamline and optimize their processes and improve operational and production efficiency. However, as manufacturers move towards the next industrial revolution (Industry 4.0 or the Industrial Internet of Things (IIoT)), and with technologies now available that can analyze the massive volume, variety, and velocity of data generated by machines and sensors, there is an opportunity to use this information to further improve the manufacturing process and, most importantly, to start designing and developing connected products that enhance customer satisfaction and services and open up avenues for new business models.
Note – The terms Industry 4.0 and Industrial Internet of Things are often used interchangeably, but they have different contexts and origins. Industry 4.0 is a term coined by the German government; it marks the fourth industrial revolution and can be described as the digitalization of the industrial sector, especially manufacturing. The Industrial Internet of Things is about enabling and applying IoT across industries. Also check out the Industrial Internet Consortium (http://www.industrialinternetconsortium.org/), a non-profit organization founded by AT&T, Cisco, GE, IBM and Intel to collaborate and set the architectural framework and direction for the Industrial Internet of Things.
Let’s take the example of a leading elevator manufacturer that supplies elevators across the globe. The elevators already have some instrumentation built in, such as a door sensor and a weight sensor that triggers an alert (like a beep) in case of overload, but the company has no visibility into how its elevators are being used around the world. This raises the following important questions:
Are these elevators working as expected and utilized as per the specification?
Is there a failure condition?
What kind of failure has occurred?
How are failures to be handled?
What is the typical acceptable downtime?
Which agency is handling the failure condition?
How effective is the after-sales service in that region?
Is competent expertise available to handle a given failure condition?
Are the spare parts available to quickly start the restoration process?
A well-designed connected IoT solution can address the above questions by capturing and analyzing product usage, operational and failure data, ultimately improving customer satisfaction and services.
IoT can transform not only the end products but also the entire manufacturing process, right from where the elevators are manufactured. The supply chain process and logistics can also be streamlined to enhance operational efficiency and productivity and deliver better financial gains.
IoT is an incremental journey; it’s an evolution, and any manufacturing IoT realization can be broken down into the following five phases:
- Monitoring & Utilization
- Condition based maintenance
- Predictive Maintenance
- Optimization
- Connecting ‘connected solutions.’
Monitoring & Utilization
Monitoring and utilization form the first step of an IoT journey. This is an umbrella phase that itself consists of several requirements.
For a large-scale manufacturer, enabling seamless monitoring and utilization of its systems usually comprises the following steps:
- Asset Management
  - Identifying assets that need to be monitored
- Instrumentation
  - Leveraging existing instrumentation investments (if any)
  - Adding new hardware capability (new sensors/actuators/microcontrollers) based on the design and requirements of the connected solution
- Handle Connectivity
  - Adding connectivity to the devices identified and instrumented in the steps above. We will talk about various patterns: a device connected directly to the core platform, intercommunication between devices, or a device gateway connected to the core platform that communicates with existing devices using low-level or existing proprietary protocols.
- Perform Monitoring
Asset Management
To start with, you need to identify the set of physical assets that need to be monitored. For example, for an elevator manufacturing company, an elevator is an asset that contains various sub-assets like doors, input control buttons (open, close, call, alarm, etc.), the elevator telephone, etc. Similarly, for a connected car manufacturer, the car is an asset that contains sub-assets like the engine, brakes, tires, etc., and for any manufacturing plant, machinery, conveyor systems, etc. are examples of assets that need to be monitored. An asset carries a set of metadata; for example, a car engine can have a manufacturer’s name, capacity, year of manufacture, etc. Asset management works through this asset metadata and each asset’s dependencies on other assets.
Manufacturers typically have a software platform or an application to manage the lifecycle of their assets. When moving towards IoT, the existing asset management design or application may not be sufficient for building next-generation connected solutions. Everything from requirements and design to simulation, creation of connected products and their lifecycle management will require a new approach and a set of next-generation software products. We envision a set of new emerging software products to tackle the requirements of designing connected solutions. For instance, understanding the dependencies between a car engine, engine oil, LED indicators and brakes through the system’s metadata, and using an analytics platform to analyze the actual sensor data in a connected car solution, could make it easier to derive correlations and suggest measures to tackle failure conditions. The design of connected products is a separate topic in itself and outside the scope of this book.
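As a rough illustration of the idea of assets, metadata and sub-assets, the sketch below models an elevator and its parts as a small tree. The class name, field names, identifiers and metadata values are all hypothetical and chosen only for this example; a real asset management product would have its own, much richer model.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A physical asset with descriptive metadata and nested sub-assets."""
    asset_id: str
    asset_type: str
    metadata: dict = field(default_factory=dict)
    sub_assets: list = field(default_factory=list)

    def add(self, child: "Asset") -> "Asset":
        """Attach a sub-asset and return it so calls can be chained."""
        self.sub_assets.append(child)
        return child

    def walk(self):
        """Yield this asset and every nested sub-asset, depth first."""
        yield self
        for child in self.sub_assets:
            yield from child.walk()


# Hypothetical elevator with the sub-assets mentioned in the text.
elevator = Asset("elevator-001", "elevator", {"manufacturer": "ExampleCo", "year": 2015})
elevator.add(Asset("door-1", "door"))
elevator.add(Asset("panel-1", "control_panel", {"buttons": ["open", "close", "call", "alarm"]}))
elevator.add(Asset("phone-1", "telephone"))

# The list of asset ids that would need to be monitored.
monitored_ids = [asset.asset_id for asset in elevator.walk()]
```

Keeping the dependency structure explicit like this is what later allows sensor data from one sub-asset to be correlated with data from another.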
Instrumentation
In the manufacturing world, some kind of instrumentation is usually already employed, as in the elevator use case we talked about earlier. The elevators already have built-in sensors, but these sensors are not connected to any platform that would enable transfer and analysis of the data (the platform here maps to the core platform in our architecture diagram – refer Chapter 1). Moreover, the protocols and connectivity (which map to the communication layer in our architecture diagram – refer Chapter 1) used by the various hardware components (or devices) in the elevator, and their interactions, would typically be proprietary.
Based on the requirements of the connected product, new hardware components (devices, microcontrollers, sensors, etc.) might also be required. For instance, in a connected elevator design, the elevators now have a new requirement to maintain an optimum temperature for smooth functioning, taking into account surrounding external factors (which may vary across regions). The new design could also break the operating ambient temperature down into multiple levels of degradation, monitor them remotely or via notifications, and use this information to schedule services. Take the following example, where X is the temperature that needs to be kept at its optimum: if X rises above a threshold value, the degradation process starts, and if no action is taken within Y days of the start of degradation, a critical failure alert is sent to the elevator company.
X being the temperature to be kept at its optimum,
X > Threshold Value -> Needs attention within 5 days. The elevator is still functional but with limited load: the load limit is cut from 300 kg to 150 kg.
At this stage, the system also makes available details about the suggested spare part changes, the location of the spare parts and the suggested service vendor nearest to the elevator. It is easy for the system to detect the GPS coordinates of the connected elevator, look up inventory and service vendors in that region and schedule maintenance services. The elevator remains operational, but with reduced load and controlled movement of people using it.
X > Threshold Value for more than Y days after the date degradation started -> Critical Failure alert. This is the final alert to repair the defective part, and it comes with a suggested time for the repair, based on people movement during that week and projections, to ensure minimum downtime and the least impact on passengers.
The above is only one such example. A manufacturer could have many such requirements, which would require design changes ranging from microcontrollers to adding new hardware components. Again, this is an incremental effort; one can take gradual steps by identifying and adding new hardware components and connecting them to the core platform for data transmission along the way. The data is then correlated and analyzed at the core platform layer to understand failure conditions and patterns.
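The temperature degradation example above can be sketched as a small function. This is only an illustration under assumptions: the function name, the status strings and the way the degradation start date is tracked are hypothetical, and the threshold and the number of days (Y) are passed in as parameters because the text does not fix them.

```python
from datetime import date
from typing import Optional

NORMAL_LOAD_KG = 300    # normal load limit from the example
DEGRADED_LOAD_KG = 150  # reduced load limit once degradation starts


def evaluate_temperature(temp_c: float, threshold_c: float,
                         degradation_start: Optional[date],
                         today: date, y_days: int):
    """Return (status, load_limit_kg) for the current temperature reading."""
    if temp_c <= threshold_c:
        return "normal", NORMAL_LOAD_KG
    # Degradation is in progress; if it was not tracked before, it starts today.
    start = degradation_start or today
    if (today - start).days > y_days:
        # No action was taken within Y days of the start of degradation.
        return "critical_failure_alert", DEGRADED_LOAD_KG
    return "needs_attention", DEGRADED_LOAD_KG


# Example: threshold of 40 degrees and Y = 5 days (both assumed values).
status, load_limit = evaluate_temperature(42.0, 40.0, date(2024, 1, 1), date(2024, 1, 10), 5)
```

In a real design, the status would also drive the spare part lookup and service scheduling described above.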
Handle Connectivity
There are three general connectivity patterns that allow devices to communicate with the core platform:
- Connecting devices directly to the core platform
- Connecting devices to an intelligent system and/or device gateway
- Intercommunication between devices
The connectivity option differs based on the use case. If there is a requirement to process the data locally and take action, and/or a requirement to map different proprietary protocols to a standardized protocol, a device gateway is generally used to translate the incoming protocol instructions to those of the target platform. The choice also depends on the power consumption capacity of the device, and it may not make sense for all devices to connect directly to the core platform.
For the elevator manufacturing use case, the devices (doors, motor temperature, shaft alignment, etc.) are already instrumented and connected to a central device (microcontroller). The central device can be IoT-enabled, or a new device gateway can be installed that talks to the central device. This is done by installing the required platform libraries and code that connects to the core platform, maps the data from the controller into a payload object (like JSON) and submits the payload to the core platform.
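The mapping step described above might look like the following sketch. The controller register names (such as MT_TMP), the payload field names and the device id are invented for illustration; actually submitting the payload is left to whichever client library the core platform supports.

```python
import json
from datetime import datetime, timezone

# Hypothetical mapping from proprietary controller names to payload fields.
FIELD_MAP = {
    "DR_ST": "door_state",
    "MT_TMP": "motor_temperature_c",
    "SH_ALN": "shaft_alignment_mm",
    "LD_KG": "load_kg",
}


def build_payload(device_id, raw_readings, timestamp=None):
    """Map raw controller readings to a JSON payload string."""
    ts = timestamp or datetime.now(timezone.utc)
    payload = {"device_id": device_id, "timestamp": ts.isoformat()}
    for raw_name, value in raw_readings.items():
        # Readings without a known mapping are dropped rather than guessed.
        if raw_name in FIELD_MAP:
            payload[FIELD_MAP[raw_name]] = value
    return json.dumps(payload, sort_keys=True)


message = build_payload("elevator-001", {"MT_TMP": 41.5, "LD_KG": 120})
```

The resulting string is what the gateway would hand to its messaging client for delivery to the core platform.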
Libraries are available that support making a device IoT-enabled, like the Eclipse Paho library (http://www.eclipse.org/paho/), an open-source client implementation of MQTT that can be installed on devices supporting the C, Java, Android, Python, C++, JavaScript and .NET programming models. This of course assumes that the core platform supports the MQTT protocol.
The choice of library depends on the device being IoT-enabled, the programming language supported by the device (C, C++, JavaScript, etc.), the protocols supported by the core platform (MQTT, AMQP, REST, etc.) and the client libraries available for the device. One can also use REST-style invocations to connect to the core platform. The core platform can provide SDKs for various devices, with APIs that convert the device data into the payload format the platform requires. For example, open source projects like Connect-The-Dots (https://github.com/Azure/connectthedots) allow devices to connect to Microsoft IoT services.
Not all data from the IoT-enabled device needs to be transferred to the core platform. The IoT-enabled device gateway can use local storage to filter the data (like start and stop activity on each floor in the case of elevators) and transfer only relevant data to the platform. We don’t want to clog the network and the platform with irrelevant data, but at the same time we must make sure enough data is transmitted to analyze important indicators and the operational activity of various sensors, identify failures and use historical events and data for future predictions about the machines. Identifying the critical aspects of the data and prioritizing them should be a key decision factor when building IoT applications.
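A minimal sketch of such gateway-side filtering is shown below. The event type names and the event structure are assumptions made for this example; what counts as relevant is a design decision for each solution.

```python
# Event types treated as routine and kept only in local storage (assumed names).
LOCAL_ONLY_TYPES = {"floor_start", "floor_stop"}


def split_events(events):
    """Split raw events into (to_platform, kept_locally)."""
    to_platform, kept_locally = [], []
    for event in events:
        if event["type"] in LOCAL_ONLY_TYPES:
            kept_locally.append(event)   # routine activity stays on the gateway
        else:
            to_platform.append(event)    # relevant data goes to the core platform
    return to_platform, kept_locally


events = [
    {"type": "floor_stop", "floor": 3},
    {"type": "motor_temperature", "value_c": 41.5},
    {"type": "floor_start", "floor": 3},
    {"type": "door_fault", "floor": 7},
]
to_platform, kept_locally = split_events(events)
```

A gateway could later forward an aggregate of the locally kept events (for example, stops per hour) if that turns out to be useful for analysis.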
Edge gateways, located geographically closer to the devices or device gateways, can also be used to normalize the data before moving it to the core platform. For instance, for a global connected car manufacturer, it would make sense to have edge gateways at the respective locations to streamline data movement to the core platform. We will see many such patterns evolve in the future to enable scalability and connectivity for billions of devices.
As new production-ready devices are manufactured for IoT, we envision that the required firmware and connectivity code will be part of the device design and shipped with support for some standardized protocol. In an ideal world, we would converge on one standardized protocol for IoT (like the AllJoyn protocol, which is gaining momentum) to make connectivity seamless, but in reality many such standardized protocols will exist, and an integration approach will be required to make them work together seamlessly.
Another example is a water and wastewater plant that uses a SCADA network to gather, monitor and process data. The plant already employs sensors and proprietary protocols to monitor temperature, relative humidity, pH, barometric pressure and various other environmental parameters. To be agile and scalable, traditional manufacturing systems need to adopt technologies that store and aggregate volumes of sensor data, monitor systems in real time, analyze the data to deliver insights that were not possible earlier, and eventually create predictive models to predict equipment failure or other possible outcomes.
This is especially true for manufacturing companies, which might already employ a wide variety of protocols. The ideal pattern is to install an intelligent system of gateways that converts these protocols and lets the devices communicate securely with the core platform. Manufacturers can incrementally move their legacy devices into the IoT ecosystem by connecting them to the outside world through intelligent gateways. For instance, BACnet is a widely used protocol for smart buildings, and products like the Microsoft AllJoyn Device System Bridge allow existing devices that use BACnet to connect to an AllJoyn network, thereby enabling them to connect with the IoT core platform and with new AllJoyn devices.
In the future, we will see connected product design become a key requirement of the manufacturing process.
Perform Monitoring
Once the devices are connected and their data is available to the core platform, the monitoring part kicks in. The device data is usually stored in a database (possibly a time series database) for further analysis and predictions, and at the same time it can be acted upon by the system for real-time analysis. The monitoring phase typically involves providing a dashboard to track the devices remotely across the globe and to see whether each device is being utilized as per the specification. The specifications are available as part of the metadata we talked about earlier in the Asset Management section.
For instance, in the elevator use case, the motor temperature should not exceed 40 degrees Celsius, or the air-conditioned temperature inside the elevator should be at least 18 degrees Celsius at peak load.
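Checking readings against the specification can be sketched as below. The two limits come from the example in the text; the field names and the structure of the specification dictionary are assumptions made for illustration.

```python
# Specification limits from the example, keyed by hypothetical payload field names.
SPEC = {
    "motor_temperature_c": {"max": 40.0},
    "cabin_temperature_c": {"min": 18.0},
}


def check_against_spec(reading, spec=SPEC):
    """Return a list of human-readable violations for one device reading."""
    violations = []
    for field_name, limits in spec.items():
        if field_name not in reading:
            continue  # this reading does not carry the field
        value = reading[field_name]
        if "max" in limits and value > limits["max"]:
            violations.append(f"{field_name}={value} is above max {limits['max']}")
        if "min" in limits and value < limits["min"]:
            violations.append(f"{field_name}={value} is below min {limits['min']}")
    return violations


issues = check_against_spec({"motor_temperature_c": 45.0, "cabin_temperature_c": 20.0})
```

A dashboard could run a check like this for every incoming reading and highlight the devices that report violations.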
Monitoring can also be used to detect whether the elevators are installed and functioning as per the specification. For instance, every manufacturer provides a checklist for regular maintenance activity that can be tracked through remote monitoring. The following is a sample checklist, which is provided by the City of Chicago – Department of Buildings for compliance purposes. As you can see, most of the test requirements can be handled by adding sensors and monitoring them remotely.
In the future, environmental requirements like energy efficiency, passenger safety and control compliance can be met through remote monitoring, which can eventually also be used for auditing and inspection.
As manufacturers start embracing IoT with the concept of connected products in mind, we will see a new class of products that changes the dynamics of the manufacturing process. Imagine a self-test on the elevator that automatically evaluates the compliance parameters and publishes a report as part of the audit and quality procedures in a connected environment. (In short, an elevator would be compliant and secure 24x7.)
Once the systems and devices are being monitored, the next step is to use the information to provide timely maintenance of the assets based on their specification and operating condition. We refer to this as condition-based maintenance.
Condition based maintenance
Condition-based Maintenance (CBM) is about using the actual data gathered from the devices to decide what maintenance activity needs to be performed on the physical assets being monitored.
The connected device provides a set of continuous measurements (temperature, vibration, air pressure, heat, etc.) for the physical asset. This data, along with the required operating specification of the physical asset, can be used to create rules for maintenance activities and corrective actions.
For the elevator use case, we talked about the operating temperature requirement earlier as part of the instrumentation design. With the device data available, a maintenance service can be scheduled as soon as degradation of the asset starts.
For example,
X being the temperature to be kept at its optimum,
X > Threshold Value -> Alert the service professional. The service professional can inspect the elevator remotely and approve the spare part suggested by the system. The elevator can continue to function under limited load, and the load sensor rule now triggers at 150 kg instead of 300 kg. This ensures that, at any given point, the load does not increase beyond the expected value while the asset is degraded.
Take another example: a scheduled maintenance service for your automobile. The service schedule is usually specified in the manufacturer’s operation manual based on average operating conditions rather than the actual usage and condition of the automobile. With condition-based maintenance, a service or maintenance activity, like an oil change, is triggered when it is actually needed, based on actual usage and condition rather than a predetermined schedule.
There are two approaches to arrive at condition-based maintenance:
The first approach is to create predetermined rules based on the actual values provided by the devices and execute the required action. For example, if the temperature of an elevator is > 40 degrees Celsius and the load is > 150 kg, execute the load alert/beep rule and start the elevator only when the load falls below 150 kg.
A rule can be simple or a combination of rules. Rules can be written in a programming language or modeled visually in a tool supported by the core platform. They are created using the parameters or fields of the device payload. In the above example, the temperature and the elevator load are fields defined as part of the payload.
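As a small sketch of the rule-based approach, the example rule can be expressed as a condition over payload fields plus an action name. The field names (temperature_c, load_kg), the rule name and the action string are hypothetical; an IoT platform would normally offer its own rule language or visual tool.

```python
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    """A named condition over the device payload and the action it triggers."""
    name: str
    condition: Callable[[dict], bool]
    action: str


RULES = [
    # Combined rule from the example: hot elevator carrying more than 150 kg.
    Rule(
        name="overload_when_hot",
        condition=lambda p: p["temperature_c"] > 40 and p["load_kg"] > 150,
        action="load_alert_and_hold",
    ),
]


def evaluate_rules(payload, rules=RULES):
    """Return the actions of all rules whose condition matches the payload."""
    return [rule.action for rule in rules if rule.condition(payload)]


actions = evaluate_rules({"temperature_c": 42, "load_kg": 200})
```

Because each rule is just data plus a condition, hundreds of them can be stored, versioned and evaluated in the same way.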
The second approach is to monitor the values and detect anomalies. Anomaly detection is about identifying data and events that do not conform to the expected pattern when compared to other items in the data set. For example, assume you haven’t defined any rules for the optimum temperature and data from the devices is being collected every second, say 15, 15, 17, 18, and on the third day you see the pattern 29, 30, 30, 29, and so on. Clearly, the values read on the third day are nearly double those read on the first day. This signifies an anomaly in the system, which can trigger an alert for someone to inspect it. Another example is a fire, which the system might detect as an anomaly because of the dramatic rise in temperature. There could also be fire sensors that track fire events directly. These two signals could be combined to derive a correlation, enabling a more precise observation.
Not all anomalies are necessarily real problems, but anomaly detection should be a key requirement to ensure that any suspicious exceptions are caught by the system.
Tip – Anomalies can be detected using unsupervised machine learning algorithms like K-means. Libraries such as Spark MLlib provide first class support for many machine learning algorithms.
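To make the idea concrete without a machine learning library, the sketch below uses a much simpler statistical check in place of K-means: it flags readings that lie far from the mean of a baseline window, measured in standard deviations. The function name and the cut-off of 3 standard deviations are assumptions for illustration; the numbers are the ones from the temperature example.

```python
from statistics import mean, stdev


def find_anomalies(baseline, new_readings, z_threshold=3.0):
    """Return the new readings that deviate strongly from the baseline."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation of the baseline
    if sigma == 0:
        # A perfectly flat baseline: any different value is unexpected.
        return [x for x in new_readings if x != mu]
    return [x for x in new_readings if abs(x - mu) / sigma > z_threshold]


first_day = [15, 15, 17, 18]
third_day = [29, 30, 30, 29]
anomalies = find_anomalies(first_day, third_day)
```

A clustering approach such as K-means serves the same purpose on multi-dimensional data, where a simple per-field threshold like this one is no longer enough.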
In the future, we could see pre-built templates for industry verticals that provide the domain model, rules, process flows, machine learning models and anomaly detectors. The job of the system integrator would then be to map the device data into the domain model, extend the data model and customize the flow based on client requirements.
There may be hundreds or thousands of such rules in a complex manufacturing system, so it becomes imperative to capture these requirements as part of your connected design. The connected design phase is yet to catch on, and most of the noise is around IoT platforms and implementations. Future software products will provide end-to-end IoT implementations from a connected design perspective, along with large-scale simulations of the design and the end product.
Predictive Maintenance
Predictive maintenance is the ability of the system to predict a machine failure. The predictive maintenance phase comprises two parts: first, predicting when a machine or asset failure will happen, and second, performing the maintenance activity before the malfunction happens. Predictive maintenance is one of the most widely discussed topics in the IoT ecosystem.
The first two phases of manufacturing IoT involved monitoring and condition-based maintenance. These phases can provide enough historical data and learnings, including correlations in the data, types of failures and corrective actions taken, to make it possible to predict failures and the actions that need to be performed on the asset concerned.
In many places, you will read that predictive maintenance is the same as, or a part of, condition-based maintenance. We chose to call it out separately because the scope and implementations are quite different. Both deal with ensuring that maintenance is carried out before failure. Condition-based maintenance primarily uses monitoring, rules and anomaly detection techniques, while predictive maintenance goes a step further and analyzes volumes of historical or trend data, correlations and machine specifications to predict an outcome. Predicting an outcome is a complex and ongoing task that needs to be handled separately.
A simple use case is to combine information about the assets and their lifecycle with the actual ‘wear and tear’ data of the parts provided by the connected devices, in order to predict the remaining life of an asset and when maintenance will be required. Imagine a dashboard that lists the assets and their metadata, like manufacturing date, installation date, type, etc., along with their actual usage and the maintenance activity carried out during the condition-based maintenance phase. It also shows external factors, a prediction of the remaining life of each asset and a maintenance date. This information can be used to plan for minimum maintenance downtime, schedule spare parts delivery and ensure maintenance is executed with the least impact.
Secondly, every manufacturer typically has historical maintenance records and usage data for its systems in some form. Once converted into the required format, these can be a valuable input for predicting maintenance activity.
Going back to the elevator use case, take the example of the elevator lift cables. Can a system predict when the lift cables need to be changed? Manufacturing innovations are happening in elevator cables, like super-light carbon fiber ropes that increase the lifespan of the cables, but changing the lift cables is still a costly maintenance activity, and a cable failure can cause considerable downtime. Availability of new lift cables, availability of specialized technicians and compliance checks are all factors that can impact business operations considerably.
In order to carry out predictive maintenance for elevator lift cables, the manufacturer needs to look at which data points are required to predict the failure. As part of its connected product design, the manufacturer would probably have installed a sensor to track the running time or distance served by the cable, a sensor to detect whether the elevator is descending faster than its designated speed, and a means to monitor the start and stop instances of the elevator. This sensor input, together with the cable’s specified life expectancy, can be used to predict when the lift cables need to be replaced. In an actual scenario, many more such data sets would be needed to predict outcomes.
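A deliberately naive sketch of how these data points could be combined is shown below. It is not a machine learning model: it simply subtracts an "effective wear" from the cable's rated life and divides by the average daily usage. The function name, the wear penalties for overspeed events and start/stop cycles, and all numbers are made-up assumptions for illustration.

```python
def estimate_remaining_days(rated_life_km, distance_served_km, avg_km_per_day,
                            overspeed_events=0, start_stop_cycles=0,
                            overspeed_penalty_km=0.5, cycle_penalty_km=0.001):
    """Estimate the days left before the cable reaches its rated life."""
    if avg_km_per_day <= 0:
        raise ValueError("avg_km_per_day must be positive")
    # Treat overspeed events and start/stop cycles as extra wear (assumed weights).
    effective_wear_km = (distance_served_km
                         + overspeed_events * overspeed_penalty_km
                         + start_stop_cycles * cycle_penalty_km)
    remaining_km = max(rated_life_km - effective_wear_km, 0.0)
    return remaining_km / avg_km_per_day


days_left = estimate_remaining_days(rated_life_km=1000, distance_served_km=400,
                                    avg_km_per_day=2, overspeed_events=100)
```

A real predictive model would learn such weights from historical failure data instead of fixing them by hand.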
Predictive maintenance involves building machine learning models based on volumes of data. Developing machine learning models requires considerable time and effort. It is virtually impossible to devise a predictive model that is always 100% accurate (not even humans operate with that level of accuracy :)), but the model should be good enough to suggest the cause of a possible failure with reasonable accuracy.
Open source, scalable machine learning libraries like Spark MLlib, or commercial offerings like SPSS from IBM or Azure ML from Microsoft, can aid in building predictive models. The real challenge is building feature sets (attributes) and using algorithms like Support Vector Machines, Logistic Regression and Decision Trees, or an ensemble model combining multiple machine learning algorithms, to predict an outcome.
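To give a feel for what such a model does, here is a tiny logistic regression trained with plain gradient descent in pure Python. It is a teaching sketch only: the two features (fraction of rated life used and a normalized overspeed rate), the eight training rows and their labels are entirely made up, and a real project would use a library such as those named above with far more data.

```python
import math


def sigmoid(z):
    """Map a real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    m = len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias


def failure_probability(weights, bias, x):
    """Predicted probability that the asset fails, given its features."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Made-up training data: (life_used_fraction, overspeed_rate) -> failed (1) or not (0).
rows = [(0.1, 0.0), (0.2, 0.1), (0.3, 0.0), (0.4, 0.2),
        (0.7, 0.6), (0.8, 0.5), (0.9, 0.9), (0.95, 0.7)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
weights, bias = train_logistic_regression(rows, labels)
```

The hard part in practice is not this training loop but choosing and preparing the features that go into it.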
Once developed, the model can be integrated into your IoT platform (as part of the Analytics Platform layer – refer Chapter 1) to predict outcomes in real time. We will talk about this in detail in the next chapter as part of the services offered by various IoT platforms.
In the future, we should see specialized, pre-shipped predictive maintenance services targeted at various industries and verticals like connected cars, elevator maintenance, wind turbines, etc. These services would provide a generalized machine learning model developed using the factors we talked about earlier. System integrators would play a key role in building new machine learning models or using existing ones and integrating them with the IoT platform. For instance, take a connected car: using the OBD device (actual diagnostic data at runtime) and GPS location, along with asset metadata (like the type and make of the car, and the manufacturing date and specifications of its various parts), a generalized machine learning model can be developed that helps predict maintenance activities and failures for any car type. This assumes that you are able to look up the metadata and specifications for the car; for instance, the type, model and maintenance service requirements of an Audi would differ from those of a BMW or of an Audi of a different model. The generalized data model used by the machine learning model (a connected car has different input/output parameters than a connected elevator) would also be a key component in building predictive models effectively.
Many manufacturers are taking steps in this direction, but building predictive models with good accuracy is not an easy task, and this space will see a lot of competition, partnerships and innovation from manufacturers, software platform providers and system integrators.
Optimization
The optimization phase is all about identifying new insights from the existing data that can help further refine the manufacturing process. The large volume of data generated by the devices, together with events generated by the system and insights from predictive and condition-based maintenance, opens the door to identifying and realizing new requirements, which further enrich the connected solution design and lead to better outcomes.
Optimization can happen during every phase, namely monitoring, condition-based maintenance and predictive maintenance. We call it out as a separate phase because it is important to track how applying IoT optimizes the current process and the connected products. For instance, using the outcome of predictive maintenance, one can understand failure patterns better, find better ways to schedule services across the globe and order spare parts effectively, and in turn optimize the supply chain process.
Going back to the elevator use case, suppose the elevator is fully occupied and stops at multiple floors because passengers there want to enter, only for them to find that there is no room. This can annoy passengers both inside and outside the elevator. Patterns of this kind (which are not failure conditions) can be detected as part of the monitoring phase and optimized by creating a rule not to stop at floors when the load is full, other than the floors selected by the passengers inside the elevator, and by notifying passengers waiting for the elevator of the appropriate status. To check whether a waiting user has already taken another elevator, sensors can track the movement and presence of people on each floor and share the status at runtime, so that the incoming elevator picks it up and does not stop at the corresponding floor.
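The "do not stop when full" rule can be sketched as a small decision function. The function name, the returned status strings and the way calls are represented (sets of floor numbers) are assumptions for this example.

```python
def decide_stop(floor, load_kg, capacity_kg, cabin_calls, hall_calls):
    """Return 'stop', 'skip_and_notify' or 'pass' for the given floor.

    cabin_calls: floors selected by passengers inside the elevator.
    hall_calls: floors where someone is waiting outside.
    """
    if floor in cabin_calls:
        return "stop"  # a passenger inside wants to get out here
    if floor in hall_calls:
        if load_kg >= capacity_kg:
            # No room: keep going and tell the waiting passengers why.
            return "skip_and_notify"
        return "stop"
    return "pass"


decision = decide_stop(floor=5, load_kg=300, capacity_kg=300,
                       cabin_calls={9}, hall_calls={5})
```

The presence sensors mentioned above would simply remove a floor from hall_calls once nobody is waiting there any more.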
Take another example: 100-storey buildings (in the future, tall skyscrapers will be quite common). How would a system optimize the elevators to ensure maximum passenger satisfaction, the least waiting time, fewer stops per trip and an organized traffic flow that prevents crowding? These are the cases where optimization and innovation can play an important part, and that means looking at the elevator IoT solution holistically rather than relying only on data provided by the elevators. It means connecting dots like passenger movements and crowd density on each floor, or even devising smarter algorithms that use the available data to suggest optimized steps or routes to the elevator system.
As we move into a connected world, we will see many such use cases that focus primarily on customer satisfaction and employ new innovations to solve existing problems using connected information.
Connecting ‘connected solutions’
In a connected world, the real innovation will come from how data from one connected system is used by other connected systems, leading to new business models that we haven’t thought of so far.
For example, let’s assume the elevator manufacturing company relies on a third-party vendor for logistics and shipment of machinery and spare parts. Real-time visibility into parts moving across the globe, along with external factors, could help plan contingencies better. For instance, suppose it normally takes X additional time to get spare parts from location Z compared to location Y, but the integrated real-time weather system reports extreme weather conditions at location Y for the next three days; it is then better to order the spare parts from location Z so that they arrive on time. The route from location Z can be further optimized based on real-time notifications from traffic systems, which can suggest an alternate route to the manufacturing plant. Here, insights from the logistics aggregation company are offered as value-added services to manufacturing systems.
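The sourcing decision described above boils down to comparing expected arrival times once external delays are added in. The sketch below assumes that the weather and traffic systems can express their impact as delay hours; the class name, field names and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SourceOption:
    """A location that can supply the spare part, with known delays in hours."""
    location: str
    transit_hours: float
    weather_delay_hours: float = 0.0
    traffic_delay_hours: float = 0.0

    @property
    def expected_hours(self) -> float:
        return self.transit_hours + self.weather_delay_hours + self.traffic_delay_hours


def choose_source(options):
    """Pick the option with the shortest expected arrival time."""
    return min(options, key=lambda option: option.expected_hours)


# Location Y is normally faster, but extreme weather adds three days of delay.
options = [
    SourceOption("Y", transit_hours=24, weather_delay_hours=72),
    SourceOption("Z", transit_hours=36),
]
best = choose_source(options)
```

The value of the connected systems lies in supplying the delay figures in real time; the comparison itself is trivial.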
Take another example: passengers waiting for an elevator. What is the best way to keep them engaged and satisfied rather than grumbling about the delay? A customer who has checked in to a smart connected hotel, waited a few minutes for the elevator and then sat through too many stops on the way to the 90th floor could be engaged by offering complimentary vouchers for the delay on their smartphone (through beacons and the hotel’s smart mobile app), through their hotel room card (which is digitized and provides various information) or through a call as soon as they reach the room. That way, the customer gets a sense of being instantly connected and feels that the hotel acknowledged the delay and cared about it.
Take another example: data from the connected car solution can be used to derive real value in areas like traffic management, public safety, fleet management and after-sales service, and by industries like insurance, which could tap into the data and devise a ‘pay per use’ model based on actual usage of the car and the driving behaviour of the driver. The insurance underwriting process would change to take these connected parameters into account when quoting the insurance premium. Insurance companies might also provide value-added services like tying up with service vendors for after-sales service or providing just-in-time insurance for a second person driving the car. Privacy and security can pose a challenge, but they can be effectively handled through service level agreements between the car broker or owner and the insurance companies.
The current generation does not hesitate to share information on social media. Sharing information can sometimes be tricky, but oftentimes you will want to do so to improve your experience of the connected world, as every smart business will then be able to provide personalized service based on your preferences or characteristics. This will bring positive outcomes and benefits at large and will be appreciated by the same people, thereby creating a framework of connected people and businesses.
In the next article, we will talk about a couple of use cases that we call start-up use cases, where start-ups and small organizations are tapping into IoT to create innovative new products from scratch. We will cover two such use cases: connected car and connected home.