Master’s student or intern

Compiling neural networks for a computational memory AI accelerator

Ref. 2020_026

For decades, conventional computers based on the von Neumann architecture have performed computation by repeatedly transferring data between their physically separated processing and memory units. As computation becomes increasingly data-centric and the scalability limits of performance and power are being reached, researchers are exploring alternative computing paradigms in which computation and storage are collocated. A fascinating new approach is computational memory, where the physics of nanoscale memory devices is used to perform certain computational tasks within the memory unit in a non-von Neumann manner.

Computational memory (CM) is finding applications in a variety of areas, such as deep learning inference. At the IBM Research Europe lab in Zurich, we have demonstrated this concept experimentally using phase-change memory devices, deploying neural network (NN) inference pipelines on such devices with higher performance and orders-of-magnitude improvements in power efficiency compared to traditional hardware accelerators. One of the main challenges of this approach is defining a hardware/software interface that allows a compiler infrastructure to map neural network models for efficient execution on the underlying CM accelerator. This is a non-trivial task because efficiency dictates that the CM accelerator be explicitly programmed as a dataflow engine in which the execution of the different NN layers forms a pipeline.

Our focus is to develop a software stack that targets a multi-core CM chip for accelerating deep learning inference at the edge, where each CM core includes a crossbar that implements an analog matrix-vector multiplication (MxV) operation. The accelerator follows a dataflow processing model, where inference is executed as a pipeline formed by the different NN layers.
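The layer-per-core pipeline described above can be illustrated with a minimal Python sketch. It is not the accelerator's actual programming model: the `Core` class, the `run_pipeline` function, the lock-step cycle scheme, and the ReLU activation are all assumptions made for illustration, and the analog crossbar is replaced by an ideal digital MxV.

```python
import numpy as np

class Core:
    """Hypothetical CM core: holds one weight matrix and computes y = W @ x."""

    def __init__(self, weights):
        self.weights = weights
        self.output = None  # result waiting to be consumed by the next core

    def step(self, x):
        # Ideal digital stand-in for the analog crossbar MxV.
        # The ReLU activation is an assumption, not taken from the text.
        if x is None:
            self.output = None
        else:
            self.output = np.maximum(self.weights @ x, 0.0)

def run_pipeline(cores, inputs):
    """Advance all cores in lock-step. In each cycle, a core consumes what
    its predecessor produced in the previous cycle."""
    results = []
    # Extra empty cycles at the end drain the pipeline.
    stream = list(inputs) + [None] * len(cores)
    for x in stream:
        # Update from the last core to the first, so that every core reads
        # the value its predecessor produced in the previous cycle.
        for i in range(len(cores) - 1, 0, -1):
            cores[i].step(cores[i - 1].output)
        cores[0].step(x)
        if cores[-1].output is not None:
            results.append(cores[-1].output)
    return results
```

Once the pipeline is full, every core works on a different input in the same cycle, which is where the throughput of a dataflow design comes from.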

The ultimate goal is to build a software stack that enables the transparent use of the CM accelerator in an end-to-end system (e.g. Figure 1) and, at the same time, guides hardware design so that the accelerator can be better utilized by software. Hence, we design the compiler and the rest of the software stack in tandem with the accelerator. Specifically, we prototyped a Computational Memory Neural Network Compiler (cmnnc) that compiles NN models for execution on the CM accelerator, and a simulator that models such hardware and acts as the target platform. This prototype is available as open source. A key problem we face, which stems from the fact that the NN needs to be fitted into the accelerator, is generating the control logic between the CM cores that execute different layers of the NN so that data dependencies are respected. Because traditional accelerators do not require this feature, existing ML compilers do not have facilities for tackling it. To address this challenge, we are using polyhedral compilation techniques to represent data dependencies and generate code for state machines that implement the desired control.
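To give a feel for the kind of data dependency the control logic must respect, here is a small Python sketch. It does not use polyhedral techniques and is not taken from cmnnc: it enumerates the dependencies by brute force for an assumed 1D convolution (stride 1, no padding) whose input is written in order by a producer core. All function names are hypothetical.

```python
def reads_of(j, kernel_size):
    """Producer indices read by consumer output j
    (assumed 1D convolution, stride 1, no padding)."""
    return range(j, j + kernel_size)

def ready_outputs(num_written, producer_len, kernel_size):
    """Number of consumer outputs that can be computed once the producer
    has written elements 0 .. num_written - 1 in order."""
    consumer_len = producer_len - kernel_size + 1
    count = 0
    for j in range(consumer_len):
        # Output j is ready only if its last input has already been written.
        if max(reads_of(j, kernel_size)) < num_written:
            count += 1
        else:
            break
    return count

def control_table(producer_len, kernel_size):
    """Table a per-core state machine could consult: entry n gives how many
    consumer iterations may be released after the n-th producer write."""
    return [ready_outputs(n, producer_len, kernel_size)
            for n in range(producer_len + 1)]
```

For this simple access pattern, the table reduces to the closed form max(0, n - kernel_size + 1). A polyhedral representation allows relations of this kind to be derived symbolically for more general loop nests, rather than by enumeration.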

The scope of this project is to build a compilation framework for running inference on a multi-core CM accelerator that accepts NN models as input in ONNX format and generates the appropriate code and control logic configuration that map to the CM accelerator. The existing cmnnc prototype and documentation will be used as the basis of this work.

We are inviting applications from students to conduct their Master’s thesis work or an internship project at the IBM Research Europe lab in Zurich on this exciting new topic. The research focus will be on the dataflow graph compilation of the NN models. The work also involves interactions with several researchers focusing on various aspects of the project. The ideal candidate should be well versed in compilation techniques and deep learning, and have strong programming skills (Python, C++). Hands-on experience with ML compilers and ONNX will be considered a bonus.

Diversity

IBM is committed to diversity at the workplace. With us you will find an open, multicultural environment. Excellent flexible working arrangements enable all genders to strike the desired balance between their professional development and their personal lives.

How to apply

If you are interested in this challenging position on an exciting new topic, please submit your most recent curriculum vitae including a transcript of grades.

Questions? Contact Dr. Nikolas Ioannou, Manager Cloud Data Platforms,

Computational Memory Accelerator

Phase-change memory