ASAP is always a very interesting conference, and this year's iteration met every expectation I had. It mixes presentations on high-performance computing applications, application-specific designs (on FPGAs and ASICs), and future technologies such as DNA-based storage and computing.
One of the main focuses this year was, again, Machine Learning. It was the main subject of more than a quarter of the presented papers and was at least mentioned in most of the others. This subject was approached from several perspectives. The first perspective is how to optimize network designs: machine learning has matured in recent years, but there are still many optimizations to be made to existing solutions to reach the level of performance expected by next-generation applications such as autonomous driving.
Among those optimizations, two were especially interesting: sharing parameters in LSTM recurrent networks, and quantizing existing networks to low precision (down to binarized networks). The second perspective was the design of architectures specialized for ML workloads and how to improve their efficiency.
Application-specific architectures seem to be the key to tackling some very compute-intensive problems, although in the long run every specific architecture seems to converge towards the same generic blocks and constructs (DMA-based explicit memory buffers, dedicated arithmetic accelerators, ...), which are also the main components of most current SoCs. However, application-specific design on FPGAs has always been a great way to rapidly explore different hardware solutions and to help the best one emerge.
Kalray offers its own dedicated accelerator for Machine Learning, integrated in its brand new MPPA® 3rd generation, a.k.a. Coolidge™, architecture. Coolidge™'s flexible architecture will make applying and evaluating the ML optimizations discussed at the conference quite easy.
High Level Synthesis (HLS)
Another key topic, related to application-specific design, was the prevalence of High Level Synthesis (HLS). HLS, which is a way to describe hardware architectures using high-level software constructs, has spread widely in the academic world and is starting to see use in industrial applications. HLS builds upon the previous observation: it provides optimized hardware blocks for basic primitives and adds an easy integration flow and design process, reducing time-to-prototype and thus time-to-market.
This year the "best paper" award went to an application-specific architecture dedicated to SLAM (simultaneous localization and mapping), a key function for robotics and autonomous driving that is very compute-intensive.
Some presentations, including the opening keynote by Luis Ceze of the University of Washington, opened up the horizon by presenting and discussing new technologies such as DNA for data storage (but also computation) and Processing in Memory (PiM) architectures. Those technologies, although not yet adopted by the industry, promise great opportunities and changes to the way computing is done and used. Right now, they rely on standard computing technologies for simulation and design, which creates opportunities for a high-performance solution like Kalray's MPPA®. The need for the computing power provided by our manycore technology seems more relevant than ever to solve those challenges. For example, quantum computing, presented during the emerging technologies session, was seen as a big consumer of non-quantum computing: the compute power required to simulate a quantum computer is huge, and this simulation is key to the design and tuning of quantum algorithms, which is currently impossible on quantum hardware itself.
I had the chance to present my own research work on "Precision Adaptation for Fast and Accurate Polynomial Evaluation Generation". This research was done at Kalray in collaboration with the University of Alaska and the Université de Perpignan / CNRS (France's Centre National de la Recherche Scientifique). The work has been integrated into Kalray's open-source code generator for mathematical libraries (available at https://github.com/kalray). It is part of a common effort in the ASAP community to improve computing on current architectures and to optimize implementations to better fit the available hardware and the final constraints.
Next year, ASAP (31st edition) will be held in Manchester, UK. I hope to continue Kalray's effort, alongside the ASAP community, towards more dedicated and efficient computer architectures.
The Massively Parallel Processor Array (MPPA®) is Kalray’s ground-breaking manycore technology, giving chips more processing power with less power consumption.