This is very interesting work undertaken by researchers at NAIST, supported by MegaChips Corporation. I have extracted the salient points, but it proves the suitability of AKIDA for use in robotics: running their Robust Iterative Value Conversion (RIVC), AKIDA OUTPERFORMED an Edge CPU (the quad-core ARM Cortex-A72), consuming 15 TIMES LESS POWER while INCREASING CALCULATION SPEED BY 5 TIMES. This paper was published on 23 August, 2024.
(The link: https://arxiv.org/pdf/2408.13018)
Robust Iterative Value Conversion: Deep Reinforcement Learning for Neurochip-driven Edge Robots
Yuki Kadokawa*, Tomohito Kodera, Yoshihisa Tsurumine, Shinya Nishimura, Takamitsu Matsubara. Nara Institute of Science and Technology, 630-0192, Nara, Japan; MegaChips Corporation, 532-0003, Osaka, Japan
Abstract
A neurochip is a device that reproduces the signal processing mechanisms of brain neurons and calculates Spiking Neural Networks (SNNs) with low power consumption and at high speed. Thus, neurochips are attracting attention for edge robot applications, which suffer from limited battery capacity. This paper aims to achieve deep reinforcement learning (DRL) that acquires SNN policies suitable for neurochip implementation. Since DRL requires complex function approximation, we focus on conversion techniques from Floating Point NNs (FPNNs) to SNNs, as this is one of the most feasible SNN techniques. However, DRL requires a conversion to an SNN at every policy update to collect the learning samples for the DRL learning cycle, which updates the FPNN policy and collects samples with the SNN policy. Accumulated conversion errors can significantly degrade the performance of the SNN policies. We propose Robust Iterative Value Conversion (RIVC), a DRL framework that incorporates both conversion-error reduction and robustness to conversion errors. To reduce the errors, the FPNN is optimized with the same number of quantization bits as the SNN, so that the FPNN output is not significantly changed by quantization. To be robust to the remaining conversion errors, the quantized FPNN policy is updated to increase the gap between the probability of selecting the optimal action and that of the other actions. This step prevents unexpected replacements of the policy's optimal actions. We verified RIVC's effectiveness on a neurochip-driven robot. The results showed that RIVC consumed 1/15 the power and increased the calculation speed five times compared with an edge CPU (quad-core ARM Cortex-A72). A previous framework with no countermeasures against conversion errors failed to train the policies. Videos from our experiments are available: https://youtu.be/Q5Z0-BvK1Tc.
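To make that gap-increasing idea concrete, here is a minimal sketch of my own (not the authors' code; the action values, the perturbation, and the error bound `eps` are all assumed for illustration): if the margin between the best action's value and the runner-up exceeds twice the worst-case conversion error, the converted SNN policy keeps the same greedy action as the FPNN policy.

```python
import numpy as np

def action_gap(q):
    """Margin between the best action's value and the runner-up."""
    top2 = np.sort(q)[-2:]          # two largest values, ascending
    return top2[1] - top2[0]

# Hypothetical action values from an FPNN policy.
q_fpnn = np.array([0.10, 0.55, 0.20])
eps = 0.04                          # assumed worst-case conversion error
# Conversion to an SNN perturbs each value by at most eps.
q_snn = q_fpnn + np.array([0.04, -0.04, 0.03])

# A gap larger than 2*eps guarantees the greedy action cannot flip.
assert action_gap(q_fpnn) > 2 * eps
assert np.argmax(q_snn) == np.argmax(q_fpnn)
```

The guarantee is simple worst-case reasoning: the top value can drop by at most `eps` and any rival can rise by at most `eps`, so a margin above `2*eps` survives conversion.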
Keywords: neurochip, robot learning, deep reinforcement learning
5.1. Construction of Learning System for Experiments
5.1.1. Entire Experiment Settings
This section describes the construction of the proposed framework shown in Fig. 2. We utilized a desktop PC equipped with a GPU (Nvidia RTX3090) for updating the policies and an Akida Neural Processor SoC as a neurochip [9, 12]. The robot was controlled by the policies implemented in the neurochip. SNNs were implemented on the neurochip by a conversion executed with Akida's MetaTF software [9, 12]. Samples were collected by the SNN policies in both the simulation tasks and the real-robot tasks, since the target task is neurochip-driven robot control. For learning, the GPU updates the policies based on the samples collected in the real-robot environment. Concerning the SNN structure, the quantization of the weights described in Eq. (16) and the calculation accuracy of the activation functions described in Eq. (17) are verified in a range from 2 to 8 bits; these are the implementation constraints of the neurochip [9].
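As a rough illustration of what n-bit weight quantization means in this setting, here is a minimal sketch of my own (not MetaTF code; the `quantize` helper and the symmetric uniform grid are assumptions): weights are snapped to a signed n-bit grid, and the worst-case error per weight is half a grid step.

```python
import numpy as np

def quantize(w, bits=4):
    """Uniformly quantize weights to a signed n-bit grid (fake quantization).

    Illustrative only: snaps each weight to the nearest of 2*(2**(bits-1)-1)+1
    levels spanning [-max|w|, +max|w|], mimicking integer on-chip weights.
    """
    levels = 2 ** (bits - 1) - 1        # e.g. 7 positive levels for 4 bits
    scale = np.max(np.abs(w)) / levels  # grid step size
    return np.round(w / scale) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(3, 3)).astype(np.float32)
w_q = quantize(w, bits=4)

# Quantization error is bounded by half a grid step per weight.
step = np.max(np.abs(w)) / (2 ** 3 - 1)
assert np.all(np.abs(w - w_q) <= step / 2 + 1e-6)
```

Training the FPNN under this same grid (rather than quantizing only after training) is what keeps the converted policy's outputs close to the float policy's, per the paper's description.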
7. Conclusion
We proposed RIVC as a novel DRL framework for training SNN policies with a neurochip in real-robot environments. RIVC offers two prominent features: 1) it trains QNN policies, which can be robust to conversion into SNN policies, and 2) it updates the values with GIO, which is robust to optimal-action replacements caused by conversion to SNN policies. We also implemented RIVC for object-tracking tasks with a neurochip in real-robot environments. Our experiments show that RIVC can train SNN policies by DRL in real-robot environments.
Acknowledgments
This work was supported by the MegaChips Corporation. We thank Alonso Ramos Fernandez for his experimental assistance.”
About NAIST:
“We have published numerous papers at top international robotics conferences (ICRA, IROS) and in top journals (RA-L, IJRR, JFR, RAS). We have also secured a substantial amount of competitive funding, including grants from the Science Research Fund (Kakenhi) for various categories and national projects (JST Mirai, JST Moonshot, NEDO). Moreover, we are engaged in numerous joint research projects aimed at societal implementation with companies such as Toyota Motor Corporation, Toyota Central Research and Development Labs, Honda Research Institute, Yokogawa Electric Corporation, Yokogawa Digital, Hitachi Zosen Corporation, Ricoh, Toshiba, Mitsubishi Electric, MegaChips, and Furuno Electric. We also conduct collaborative research with domestic and international universities and research institutions, including ATR and the National Institute of Advanced Industrial Science and Technology (AIST).”
https://isw3.naist.jp/Research/ai-rl-en.html
For any newer shareholders: MegaChips purchased an AKIDA IP licence at the end of 2022. At the time, founder and CTO Peter van der Made commented that the market did not fully understand the significance of the MegaChips engagement.
My opinion only DYOR
Fact Finder