Field-programmable gate arrays (FPGAs) are making their way into data centers (DCs). They serve to offload and accelerate service-oriented tasks such as web-page ranking, memory caching, deep learning, network encryption, video conversion and high-frequency trading.

However, FPGAs are not yet available at scale to general cloud users who want to accelerate their own workload processing. This puts the cloud deployment of compute-intensive workloads at a disadvantage compared with on-site infrastructure installations, where the performance and energy efficiency of FPGAs are increasingly being exploited.

cloudFPGA solves this issue by offering FPGAs as an IaaS resource to cloud users. Using the cloudFPGA system, users can rent FPGAs — similarly to renting VMs in the cloud — thus paving the way for large-scale utilization of FPGAs in DCs.

The cloudFPGA system is built on three main pillars:

  • the use of standalone network-attached FPGAs,
  • a hyperscale infrastructure for deploying the above FPGAs at large scale and in a cost-effective way,
  • an accelerator service that integrates and manages the standalone network-attached FPGAs in the cloud.

Hyperscale infrastructure

To enable cloud users to rent, use and release large numbers of FPGAs on the cloud, the FPGA resource must become plentiful in DCs.

The cloudFPGA infrastructure is the key enabler of such a large-scale deployment of FPGAs in DCs. It was designed from the ground up to provide the world’s highest-density and most energy-efficient rack unit of FPGAs.

The infrastructure combines a passive and an active water-cooling approach to pack 64 FPGAs and two high-end Ethernet switches into one 19”×2U chassis.

In all, 16 such chassis fit into a 42U rack for a total of 1024 FPGAs and 16 TB of DRAM.
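As a quick sanity check on these density figures, here is the arithmetic; note that the per-FPGA DRAM share is derived by assuming an even split of the rack's DRAM over its FPGAs, which the text does not state explicitly.

```python
# Rack-level density figures, using the numbers quoted above.
FPGAS_PER_CHASSIS = 64   # per 19" x 2U chassis
CHASSIS_PER_RACK = 16    # 2U chassis in a 42U rack
TOTAL_DRAM_TB = 16

fpgas_per_rack = FPGAS_PER_CHASSIS * CHASSIS_PER_RACK
assert fpgas_per_rack == 1024

# Derived figure (assumption: DRAM spread evenly over the FPGAs):
dram_per_fpga_gb = TOTAL_DRAM_TB * 1024 // fpgas_per_rack   # = 16 GB
print(f"{fpgas_per_rack} FPGAs per rack, ~{dram_per_fpga_gb} GB DRAM per FPGA")
```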

OpenStack accelerator service

Today, the prevailing way to incorporate an FPGA into a server is to connect it to the CPU over a high-speed, point-to-point interconnect such as PCIe, and to treat that FPGA resource as a co-processor worker under the control of the server CPU.

However, because of this master–slave programming paradigm, such an FPGA is typically integrated in the cloud only as an option of the primary host compute resource to which it belongs. As a result, bus-attached FPGAs are usually made available in the cloud indirectly via the Nova compute service of OpenStack.

In our deployment, in contrast, a stand-alone, network-attached FPGA can be requested independently of a host.

However, as standard OpenStack provides no service to make such independent FPGAs available in the cloud, we built a new accelerator service for this purpose.

The new OpenStack service is similar to Nova, Cinder and Neutron, which translate high-level service API calls into device-specific commands for compute, storage and network resources in the cloud. Unlike those services, however, it has been designed specifically for stand-alone accelerator resources.
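To make the analogy concrete, here is a minimal client-side sketch of such a service. The class and method names (AcceleratorClient, acquire, release) are hypothetical: they mirror the rent-and-release workflow described above, not the actual cloudFPGA API.

```python
# Hypothetical client-side shape of the accelerator service, by analogy
# with requesting VMs from Nova. All names and signatures are illustrative.
class AcceleratorClient:
    def __init__(self, endpoint: str, auth_token: str):
        self.endpoint = endpoint      # accelerator service API endpoint
        self.auth_token = auth_token  # OpenStack-style auth token

    def acquire(self, bitstream_image: str, count: int = 1) -> list[str]:
        """Rent `count` network-attached FPGAs loaded with `bitstream_image`;
        returns their network addresses (analogous to booting VMs)."""
        raise NotImplementedError("would issue a REST call to the service")

    def release(self, addresses: list[str]) -> None:
        """Return the rented FPGAs to the free pool."""
        raise NotImplementedError("would issue a REST call to the service")
```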

Cloud integration is the process of making a resource available in the cloud. In the case of cloudFPGA, this process involves an accelerator service and a network manager.

Accelerator service

The accelerator service comprises an application programming interface (API), a database (DB) of the FPGA resources, and a scheduler.

The front-end API receives accelerator service calls from users through the OpenStack dashboard or a command-line interface.

The scheduler matches user requests against the DB, which contains all the relevant information about the available FPGA resources.

If a match exists, the received generic service calls are translated into specific network-attached FPGA commands, which are forwarded to the relevant FPGA devices.
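A minimal sketch of that request path is shown below, assuming the DB is a plain SQL table of FPGA records; the schema, the "PROGRAM" command and the example device type are illustrative, not the actual cloudFPGA implementation.

```python
import sqlite3

def send_fpga_command(mgmt_addr: str, command: str) -> None:
    # Placeholder for the device-specific command sent to a network-attached
    # FPGA; the actual protocol is not described in the text.
    print(f"-> {mgmt_addr}: {command}")

def schedule(db: sqlite3.Connection, fpga_type: str, count: int) -> list[str]:
    """Scheduler role: match a user request against the DB of FPGA resources."""
    rows = db.execute(
        "SELECT id, mgmt_addr FROM fpgas "
        "WHERE type = ? AND state = 'FREE' LIMIT ?",
        (fpga_type, count),
    ).fetchall()
    if len(rows) < count:  # no match for the request
        raise RuntimeError("not enough free FPGAs of the requested type")
    for fpga_id, mgmt_addr in rows:
        db.execute("UPDATE fpgas SET state = 'ALLOCATED' WHERE id = ?", (fpga_id,))
        send_fpga_command(mgmt_addr, "PROGRAM")  # generic call -> FPGA command
    db.commit()
    return [mgmt_addr for _, mgmt_addr in rows]

if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE fpgas (id INTEGER, type TEXT, state TEXT, mgmt_addr TEXT)")
    db.execute("INSERT INTO fpgas VALUES (1, 'example-fpga', 'FREE', '10.12.200.5')")
    print(schedule(db, "example-fpga", 1))   # -> ['10.12.200.5']
```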

Network manager

Setting up network connections with stand-alone, network-attached FPGAs requires a number of network-management tasks. For these we use a software-defined networking (SDN) stack, which we connect to the network service of OpenStack (i.e. Neutron). We refer to this stack as the network manager. It provides OpenStack with a front-end API, a network-topology discovery service, a virtualization layer, and an SDN controller.

The API receives network service calls from the accelerator service and exposes them to the various applications of the network manager. These applications include connection management, security and service-level agreements. The virtualization layer provides a simplified view of the overall DC network, including FPGA devices, to the above applications.

Finally, the SDN controller configures both the network-attached FPGAs and the network switches according to the commands received from the applications via the virtualization layer.
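The following sketch illustrates that control flow, assuming a REST-style SDN controller; the controller address, URL path and payload fields are invented for illustration and do not correspond to a specific controller's API.

```python
import json
import urllib.request

SDN_CONTROLLER = "http://sdn-controller.example:8181"  # illustrative address

def install_flow(switch_id: str, match: dict, action: dict) -> None:
    # Push one flow rule to a switch (or to the switching logic of an FPGA)
    # through the SDN controller's northbound REST interface.
    payload = json.dumps({"switch": switch_id, "match": match,
                          "action": action}).encode()
    req = urllib.request.Request(f"{SDN_CONTROLLER}/flows", data=payload,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)

def connect_user_to_fpga(user_ip: str, fpga_ip: str, switch_id: str) -> None:
    # Connection-management application: set up both directions of a
    # user-to-FPGA path through the virtualized view of the DC network.
    install_flow(switch_id, {"src": user_ip, "dst": fpga_ip},
                 {"output": "port-to-fpga"})
    install_flow(switch_id, {"src": fpga_ip, "dst": user_ip},
                 {"output": "port-to-user"})
```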

Ask the experts

François Abel
IBM Research scientist

Jagath Weerasinghe
IBM Research scientist

Christoph Hagleitner
IBM Research scientist

Publications

[1] F. Abel, J. Weerasinghe, C. Hagleitner, B. Weiss, S. Paredes,
“An FPGA Platform for Hyperscalers,”
in IEEE 25th Annual Symposium on High-Performance Interconnects (HOTI 25), Santa Clara, CA, pp. 29–32, 2017.

[2] J. Weerasinghe, F. Abel, C. Hagleitner, A. Herkersdorf,
“Disaggregated FPGAs: Network performance comparison against bare-metal servers, virtual machines and Linux containers,”
in IEEE International Conference on Cloud Computing Technology and Science (CloudCom), Luxembourg, 2016.

[3] J. Weerasinghe, R. Polig, F. Abel,
“Network-attached FPGAs for data center applications,”
in IEEE International Conference on Field-Programmable Technology (FPT ’16), Xi’an, China, 2016.

[4] J. Weerasinghe, F. Abel, C. Hagleitner, A. Herkersdorf,
“Enabling FPGAs in hyperscale data centers,”
in IEEE International Conference on Cloud and Big Data Computing (CBDCom), Beijing, China, pp. 1078–1086, 2015.