In one of my fav examples of how the IoT can actually save lives, sensors on only eight preemies’ incubators at Toronto’s Hospital for Sick Children yield an eye-popping 90 million data points a day! If all 90 million data points were relayed on to the “data pool,” the docs would be drowning in data, not saving sick preemies.
Enter “small data.”
Writing in Forbes, Mike Kavis has a worthwhile reminder that the essence of much of the Internet of Things isn’t big data, but small. By that, he means:
“a dataset that contains very specific attributes. Small data is used to determine current states and conditions or may be generated by analyzing larger data sets.
“When we talk about smart devices being deployed on wind turbines, small packages, on valves and pipes, or attached to drones, we are talking about collecting small datasets. Small data tell us about location, temperature, wetness, pressure, vibration, or even whether an item has been opened or not. Sensors give us small datasets in real time that we ingest into big data sets which provide a historical view.”
Usually, instead of aggregating ALL of the data from all of the sensors (think about what that would mean for GE’s Durathon battery plant, where 10,000 sensors dot the assembly line!), the data is first analyzed at “the edge,” i.e., at or near the point where it’s collected. Then only the data that deviates from the norm (i.e., is significant) is passed on to the centralized databases and processing. That’s why I’m so excited about Egburt and its “fog computing” sensors.
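To make that concrete, here’s a minimal sketch of edge-side filtering in Python. The sensor names, the norm, and the tolerance are my own placeholders (this isn’t Egburt’s actual API), but the shape of the idea is right: every reading gets checked locally, and only the deviations go upstream.

```python
# A minimal sketch of edge-side filtering: the gateway sees every reading,
# but only forwards values that deviate from an expected norm.
# Sensor names, norm, and tolerance below are illustrative placeholders.

EXPECTED_TEMP_C = 36.8   # hypothetical norm for an incubator sensor
TOLERANCE_C = 0.4        # deviation beyond this counts as "significant"

def forward_to_datacenter(reading: dict) -> None:
    """Stand-in for the uplink to centralized storage/processing."""
    print(f"forwarding anomaly: {reading}")

def on_sensor_reading(sensor_id: str, temp_c: float) -> None:
    """Runs at the edge, near the sensor; most readings die here."""
    if abs(temp_c - EXPECTED_TEMP_C) > TOLERANCE_C:
        forward_to_datacenter({"sensor": sensor_id, "temp_c": temp_c})
    # otherwise: a normal reading, and nothing leaves the edge

on_sensor_reading("incubator-3", 36.9)   # dropped at the edge
on_sensor_reading("incubator-3", 37.6)   # forwarded: outside tolerance
```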
As with sooo many aspects of the IoT, it’s the real-time nature of small data that makes it so valuable, and so different from past practice, when much of this data was never collected at all or, if it was, was only analyzed and acted upon after the fact. Hence the “Collective Blindness” I’ve written about before, which limited our decision-making in the past. Again, Kavis:
“Small data can trigger events based on what is happening now. Those events can be merged with behavioral or trending information derived from machine learning algorithms run against big data datasets.”
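Here’s what that merge might look like in miniature. This is a hedged sketch with invented numbers: the trigger fires on the live reading (small data), but the baseline it’s compared against was learned offline from the historical pool (big data).

```python
# The trigger fires on the current reading (small data), but the baseline
# comes from a batch job over historical data (big data).
# All numbers and names here are made up for illustration.

import statistics

# Pretend this baseline came from a machine-learning/batch job over the data lake.
historical_vibration = [0.9, 1.1, 1.0, 1.2, 0.8, 1.0, 1.1]
baseline_mean = statistics.mean(historical_vibration)
baseline_stdev = statistics.stdev(historical_vibration)

def check_vibration(reading: float, n_sigma: float = 3.0) -> bool:
    """True if the live reading is anomalous against the learned baseline."""
    return abs(reading - baseline_mean) > n_sigma * baseline_stdev

print(check_vibration(1.05))  # False: within the historical envelope
print(check_vibration(2.40))  # True: trigger a maintenance event now
```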
As examples of the interplay of small and big data, he cites:
- real-time data from wind turbines that is used immediately to adjust the blades for maximum efficiency. The relevant data is then passed along to the data lake, “…where machine-learning algorithms begin to understand patterns. These patterns can reveal performance of certain mechanisms based on their historical maintenance record, like how wind and weather conditions affect wear and tear on various components, and what the life expectancy is of a particular part.”
- medicine containers with smart labels. “Small data can be used to determine where the medicine is located, its remaining shelf life, if the seal of the bottle has been broken, and the current temperature conditions in an effort to prevent spoilage. Big data can be used to look at this information over time to examine root cause analysis of why drugs are expiring or spoiling. Is it due to a certain shipping company or a certain retailer? Are there recurring patterns that can point to problems in the supply chain that can help determine how to minimize these events?” (I’ve sketched this interplay in code just below.)
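To make the medicine-label example concrete, here’s a toy version. The field names and the 8°C cold-chain limit are my own placeholders, but it shows the dual role of the same record: small data triggers an action right now, and the record also lands in the historical store for later root-cause analysis.

```python
# The same small-data record both triggers an immediate action and is
# appended to a history that batch analysis can mine later.
# Field names and thresholds are invented for illustration.

from datetime import datetime, timezone

history: list[dict] = []   # stand-in for the data lake

def on_label_scan(bottle_id: str, temp_c: float, seal_intact: bool) -> None:
    record = {
        "bottle": bottle_id,
        "temp_c": temp_c,
        "seal_intact": seal_intact,
        "ts": datetime.now(timezone.utc).isoformat(),
    }
    # Small data: act on the current state right now.
    if not seal_intact or temp_c > 8.0:      # hypothetical cold-chain limit
        print(f"ALERT: pull bottle {bottle_id} from the shelf")
    # Big data: keep the record so root-cause analysis can run over time.
    history.append(record)

on_label_scan("lot-42-0001", 6.5, True)    # fine; logged for trend analysis
on_label_scan("lot-42-0002", 11.2, True)   # alert: temperature excursion
```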
Often, big data is simply irrelevant to an IoT system’s functioning: all that’s needed is real-time small data to trigger an action:
“In many instances, knowing the current state of a handful of attributes is all that is required to trigger a desired event. Are the patient’s blood sugar levels too high? Are the containers in the refrigerated truck at the optimal temperature? Does the soil have the right mixture of nutrients? Is the valve leaking?”
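A few of those checks, sketched in Python. The thresholds and field names are placeholders, not from any real device, but they show how little is needed: just the current state of a handful of attributes.

```python
# Minimal current-state checks in the spirit of the questions above;
# thresholds and field names are placeholders, not from any real device.

def needs_action(state: dict) -> list[str]:
    """Return the events to trigger given only the current state."""
    events = []
    if state["blood_sugar_mg_dl"] > 180:
        events.append("notify_clinician")
    if not (2.0 <= state["truck_temp_c"] <= 8.0):
        events.append("adjust_refrigeration")
    if state["valve_pressure_drop_kpa"] > 5.0:
        events.append("flag_leaking_valve")
    return events

print(needs_action({
    "blood_sugar_mg_dl": 210,
    "truck_temp_c": 5.5,
    "valve_pressure_drop_kpa": 7.2,
}))   # ['notify_clinician', 'flag_leaking_valve']
```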
In a future post, I’ll address the growing role of data scientists in the IoT — and the need to educate workers at all levels on how to deal effectively with data. For now, just remember that E.F. Schumacher was right: “small is beautiful.”