“Edge AI” is something of a misnomer. Most smart devices, IoT, and other edge implementations don’t actually process data at the edge. Edge devices aren’t like smartphones or tablets, which come equipped with processors, storage, and software and can handle compute tasks on their own.
What most would call edge AI is cloud-based AI processing of data collected at the edge. The results are then sent back to the device and often back to the cloud for further processing, aggregation, and centralization.
Edge devices are largely “dumb” in that sense. Your smart home-automation gadget isn’t terribly smart. It captures your command, whether spoken aloud, tapped into its companion app, or scheduled in a setting you configured previously. Perhaps it does some minor preprocessing. It then sends the command over its internet connection to a physical server in a data center somewhere, and it waits to be told whether to turn the light on or off.
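To make that concrete, here is a rough sketch of what that round trip looks like from the device’s side. The endpoint, payload fields, and response format are all hypothetical, invented for illustration rather than taken from any vendor’s actual API.

```python
import json
import time
import urllib.request

# Hypothetical cloud endpoint -- purely illustrative, not a real service.
CLOUD_ENDPOINT = "https://smart-home.example.invalid/v1/commands"

def send_command_to_cloud(device_id: str, command: str) -> dict:
    """Package a captured command, ship it to the cloud, and wait for the verdict.

    The device does no interpretation of its own; it blocks until the cloud
    tells it what to do (e.g. {"action": "light_on"}).
    """
    payload = json.dumps({"device_id": device_id, "command": command}).encode()
    request = urllib.request.Request(
        CLOUD_ENDPOINT,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=5) as response:
        decision = json.loads(response.read())
    decision["round_trip_s"] = time.perf_counter() - start  # the wait we care about
    return decision
```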
This edge-to-cloud workflow, obviously, takes time. Fortunately, edge-to-cloud works well for many applications.
Unfortunately, AI is not one of them.
AI at the edge — true AI at the edge, meaning running neural networks on the smart device itself — is a thorny problem, or rather a set of problems: limited processing resources, small storage capacities, insufficient memory, security concerns, electrical power requirements, and limited physical space on the device. Another major obstacle is cost. Few consumers could afford to upgrade to smart light bulbs if each one cost the equivalent of an iPhone.
But design them we must, because there is an enormous need for AI at the edge. Devices need to learn fast and make decisions in real time.
Consider a security camera that captures an image of an unattended package at the airport. The camera must decide whether the package is a threat — and quickly. Or consider an autonomous vehicle image sensor that sees an object in the road, must decide if it’s a plastic bag or a rock, and then must decide to swerve or not.
These may be extreme examples, but even in less life-or-death situations, latency and distance issues plague edge-to-cloud AI. For those of us in the industry, this is known as “the speed of light over distance at 3-µs/km latency problem.” The speed of light is 299,792,458 meters per second in a vacuum. Each additional meter of distance adds roughly 3 ns of one-way latency, which works out to about 3 µs per kilometer.
By human standards, 3 µs per kilometer probably doesn’t sound like much. In AI, it adds up fast. Remember, in edge-to-cloud AI, we’re talking about hundreds or thousands of kilometers, in each direction, for every decision.
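As a back-of-the-envelope check, here is that arithmetic in a few lines of Python. It counts only propagation delay in a vacuum; real networks route through fiber, switches, and queues, so actual numbers are worse.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # kilometers per second, in a vacuum

def one_way_delay_ms(distance_km: float) -> float:
    """Best-case one-way propagation delay over a given distance, in milliseconds."""
    return distance_km / SPEED_OF_LIGHT_KM_PER_S * 1_000

for km in (100, 500, 1_000):
    one_way = one_way_delay_ms(km)
    print(f"{km:>5} km: {one_way:.2f} ms one way, {2 * one_way:.2f} ms round trip")

# A 1,000 km hop costs roughly 3.3 ms each way (about 6.7 ms round trip)
# before a single network hop or a single cycle of inference is added.
```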
So system designers are faced with a few workarounds. To accelerate edge-to-cloud and solve the speed of light over distance at 3-µs/km latency problem, we can reduce the distance. Moving the edge closer to the cloud or data center isn’t really feasible: by definition, edge devices are out in the field, doing their jobs wherever they’re needed.
That leaves us with moving the processing closer to where the data originates: the edge device.
One way to do that is to deploy multiple data centers and data closets, also known as lights-out data centers. Simply build data centers and closets close to the edge devices (in other words, everywhere), thereby minimizing distance and improving speed. This doesn’t sound too hard! It only requires real estate, construction, and lots of hardware. (In a managed-services setting, a data closet is sometimes marketed as “the edge”; to be clear, that means “the edge of the cloud,” not the edge in our sense.) Factor in the environmental and power costs of proliferating data centers and data closets around the world, and this turns out to be an expensive and unsustainable solution.

Or we can dumb down the application and settle for less effective AI in exchange for reduced processing requirements.
This may (or may not) deliver improved speeds, but then the device can’t fulfill its intended purpose, which is real-time decision-making. Ultimately, this “solution” is worthless.
Or we can bite the bullet and build edge devices capable of running AI applications right there and then, on the device itself. Remember when I said this was a thorny problem? It’s thorny, but not impossible.
It’s true that edge devices are often quite small. Picture a wearable medical device. It needs to be sized appropriately for the patient’s comfort and allow them to carry out their daily activities. Into this device you need to pack sensors, CPUs, GPUs, memory, storage, networking and connectivity, batteries and power management, and perhaps a cooling fan or other thermal management. These hardware requirements are fairly substantial, given the limited space we have to work with.
Then there’s the expense. They can’t be cost-prohibitive to design — which would be likely if we had to modify and resize all those internal components to fit — and they can’t be cost-prohibitive to buy, especially in large quantities for industrial or commercial use.
Among these complicated tasks, the most challenging might be working within an extremely low power budget.
Running neural networks consumes power, and at the edge, power is precious. Traditional convolutional neural networks (CNNs) are notoriously compute-intensive, which translates to power-intensive. Compounding the inefficiency, CNNs ingest data through a repetitive, sequential training process. Any time there is new data to learn from, they have to start anew and go through a massive retraining operation.
Edge AI is better suited to a neuromorphic architecture, which processes only events, whether it is running converted CNNs or spiking neural networks (SNNs). This architecture is far more power-efficient, because only relevant information, flagged as a notable event or “spike,” gets processed. That airport security camera shouldn’t bother churning through hours and hours of footage in which nothing in its field of vision is moving; it should cut to the chase. SNNs can also learn on the fly as they receive new information, without the taxing retraining CNNs require. If a device is expected to adapt constantly to changes in its environment, networks running on a neuromorphic architecture, whether converted from CNNs or native SNNs, have a clear advantage. (You might say that SNNs have the edge.)
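The difference is easier to see in a toy sketch. The threshold and the placeholder classifier below are invented for illustration; this is not how any particular neuromorphic chip or SNN framework works, but it shows why a camera watching a motionless scene can get away with almost no compute at all.

```python
import numpy as np

SPIKE_THRESHOLD = 0.1  # invented value: how much a pixel must change to count as an event

def frame_to_events(previous: np.ndarray, current: np.ndarray) -> list:
    """Reduce a pair of frames to sparse events: only the pixels that changed."""
    delta = current - previous
    ys, xs = np.nonzero(np.abs(delta) > SPIKE_THRESHOLD)
    return list(zip(ys.tolist(), xs.tolist(), delta[ys, xs].tolist()))

def classify_events(events: list) -> None:
    # Stand-in for the spiking network. A real SNN updates neuron state per
    # incoming event instead of re-running a full CNN pass over every frame.
    print(f"processing {len(events)} events")

def watch(frames: list) -> None:
    """Run the expensive step only when something actually happens."""
    previous = frames[0]
    for current in frames[1:]:
        events = frame_to_events(previous, current)
        if events:               # motion detected: spend the power budget
            classify_events(events)
        previous = current       # idle frames cost little more than a subtraction
```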
Another way to maximize efficiency is offloading tasks from CPU resources onto neuromorphic processors based on cores called neural processing units (NPUs). Each NPU incorporates its own compute engine and memory and sends events to other NPUs over a mesh network, without host CPU intervention. Small in size and high in density, NPU-based processors consume ultra-low power — in micro- or milliwatts — and also alleviate some of the device space constraints.
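To picture that data flow, here is a toy model of the layout. The class, the mesh wiring, and the numbers are all invented for illustration and don’t correspond to any real NPU’s programming interface.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class ToyNPU:
    """A toy neural processing unit: its own compute loop and its own local memory."""
    core_id: int
    local_memory: dict = field(default_factory=dict)  # weights/state live next to the compute
    inbox: deque = field(default_factory=deque)

    def fire(self, mesh: dict, target_id: int, event: str) -> None:
        # Events travel core-to-core over the mesh; the host CPU never touches them.
        mesh[target_id].inbox.append((self.core_id, event))

    def step(self) -> None:
        while self.inbox:
            source_id, event = self.inbox.popleft()
            # Placeholder for the core's local work on the incoming spike.
            self.local_memory[source_id] = event

# Four cores wired into a mesh, modeled here as a simple dict keyed by core id.
mesh = {i: ToyNPU(core_id=i) for i in range(4)}
mesh[0].fire(mesh, target_id=2, event="spike")
mesh[2].step()
```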
Compared with edge-to-cloud approaches, AI-capable edge devices won’t need reliable internet connectivity, which is in short supply in the field and in transit. They also have potential cybersecurity advantages because they aren’t sending data offsite.
So there you have it: difficult, but not impossible. Thorny, but becoming more practical all the time as AI technologies and device hardware evolve.
If we keep costs from spiraling out of control and design around power efficiency, edge AI just might become a real thing.