
Research and Publications

Engineering a Manycore Processor for Edge Computing

Summary form only given, as follows. A complete record of the panel discussion was not made available for publication as part of the conference proceedings. Edge computing applications such as autonomous driving systems (ADS) and 5G radio access networks (RAN) require significant computing capabilities and predictable response times, while being constrained by size, weight and power (SWaP). Such applications benefit significantly from computing platforms based on manycore processors. We first expose the differences between multicore architectures and manycore architectures, the latter currently represented mainly by GPGPU processors. Then, using the MPPA3 processor from Kalray as an illustration, we present some of the challenges and choices involved in engineering an edge computing platform based on a manycore architecture. At the local architecture level, energy efficiency and time predictability can be obtained from a Fisher-style VLIW architecture. Deep learning inference is accelerated by tightly coupling a tensor coprocessor. At the global architecture level, the cache coherence domains are preferably localized to the compute units. These compute units are connected by a network-on-chip capable of multicasting, where deadlock-free routing requires some care. The computing platform is completed by standard and open programming environments. Among these, OpenCL, OpenVX and OpenMP appear as the most relevant for compute-intensive edge applications, once these environments are enabled to efficiently exploit the compute unit local memories of the manycore architecture.
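The remark that deadlock-free routing requires some care can be illustrated with a small sketch. The summary does not describe the routing scheme or topology of the MPPA3 network-on-chip, so the example below assumes a plain 2D mesh with unicast traffic and uses classic dimension-order (XY) routing as a stand-in. The function names `xy_route`, `channel_dependencies` and `has_cycle` are hypothetical. The sketch builds the channel dependency graph and checks that it has no cycle, which is the standard sufficient condition for deadlock freedom under deterministic routing.

```python
from itertools import product

def xy_route(src, dst):
    """Return the directed links (node, node) from src to dst, moving
    along X first and then along Y (dimension-order routing)."""
    (x, y), (dx, dy) = src, dst
    links = []
    while x != dx:
        nx = x + (1 if dx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dy:
        ny = y + (1 if dy > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links

def channel_dependencies(width, height):
    """Build the channel dependency graph of an assumed width x height mesh.
    An edge a -> b means some route holds link a while requesting link b."""
    nodes = list(product(range(width), range(height)))
    deps = {}
    for src, dst in product(nodes, nodes):
        route = xy_route(src, dst)
        for link in route:
            deps.setdefault(link, set())
        for a, b in zip(route, route[1:]):
            deps[a].add(b)
    return deps

def has_cycle(graph):
    """Detect a cycle in a directed graph with a depth-first search."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {n: WHITE for n in graph}

    def visit(n):
        color[n] = GREY
        for m in graph[n]:
            if color[m] == GREY or (color[m] == WHITE and visit(m)):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)

if __name__ == "__main__":
    deps = channel_dependencies(4, 4)
    print("cyclic dependencies:", has_cycle(deps))
```

Under these assumptions, a packet never turns from a Y link back onto an X link, so no chain of link dependencies can close on itself. Multicast traffic and wrap-around topologies introduce additional dependencies that this sketch does not model.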