Zurich Research Laboratory
IBM Research

RDMA host software architecture

Overview

On servers handling heavy network traffic, an offloaded transport protocol stack with support for Remote Direct Memory Access (RDMA) can eliminate a bottleneck in network input/output (I/O) by avoiding data copies between the operating system and application buffers. The Internet Engineering Task Force (IETF) is defining a set of protocols for (remote) direct data placement over IP networks. The RDMA Consortium (RDMAC) has defined the semantics of an interface to an RDMA-capable network interface card (RNIC), the so-called RDMA protocol verbs. The IETF's RDMA protocol stack, also known as the iWARP transport, is implemented on RNICs or, more generally, by verbs providers. The InfiniBand Trade Association (IBTA) is defining another transport providing RDMA services.
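
The idea of direct data placement can be illustrated with a small model. This is a minimal sketch under stated assumptions, not an RNIC or verbs interface: the class `RegisteredBuffers` and its methods `register` and `place` are hypothetical names introduced here. It shows incoming payload being written straight into a registered application buffer, identified by a steering tag and an offset, with no intermediate staging buffer.

```python
class RegisteredBuffers:
    """Hypothetical table of application buffers registered for placement."""

    def __init__(self) -> None:
        self._buffers = {}      # steering tag -> writable view of the buffer
        self._next_stag = 1

    def register(self, buf: bytearray) -> int:
        """Register an application buffer and return its steering tag."""
        stag = self._next_stag
        self._next_stag += 1
        self._buffers[stag] = memoryview(buf)
        return stag

    def place(self, stag: int, offset: int, payload: bytes) -> None:
        """Write payload directly into the tagged buffer at the given offset."""
        target = self._buffers[stag]
        end = offset + len(payload)
        if offset < 0 or end > len(target):
            raise ValueError("placement outside registered buffer")
        # The memoryview aliases the application's bytearray, so this
        # assignment lands in application memory without a staging copy.
        target[offset:end] = payload
```

Because every segment names its own target offset, segments can be placed in whatever order they arrive: placing `b"DATA"` at offset 4 and then `b"RDMA"` at offset 0 of an 8-byte buffer leaves the buffer holding `RDMADATA`.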

OS extensions and programming interfaces for RDMA represent a significant portion of the RDMA infrastructure, and their availability is a key requirement for the success of RDMA technology.

Work on RDMA host software at ZRL

Within the Interconnect Software Consortium (ICSC) of The Open Group, we contributed to the standardization of RDMA-enabled programming interfaces, co-chairing both the Interconnect Transport API (IT-API) and the RNIC Programming Interface (RNICPI) work groups. We helped define a modular, layered, and transport-neutral host software architecture for RDMA through contributions [JAMENE-04] to an industry-driven Linux open-source project called OpenRDMA [OPENRDMA].

For the portability of high-performance RDMA-enabled applications, it is desirable for operating systems to provide an open, standardized, transport-neutral and up-to-date RDMA API such as ICSC's IT-API. Similarly, for the portability of RDMA device drivers, it is desirable to converge on a standardized, syntactic programming interface that includes the iWARP feature set of ICSC's RNICPI and reconciles the semantic differences between iWARP and InfiniBand.

We have implemented elements of a host software architecture for RDMA that provides the operating system (OS) integration for both iWARP and InfiniBand, supporting IT-API and an enhanced version of RNICPI. A key property of such an architecture is a clean separation between generic/OS functionality, which resides in the user/kernel Access Layer (uAL/kAL) components, and verbs-provider-specific functionality, which resides in the user/kernel Verbs Provider (uVP/kVP) components. This approach permits a wide range of RNICs and other verbs providers to register themselves through a standard programming interface, and it minimizes code bloat by keeping generic functionality such as OS-wide resource management, event handling or IT-API's socket conversion within the uAL or kAL.
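
The separation between access layer and verbs providers can be sketched as follows. This is an illustrative model only, assuming a single queue-pair operation and a single OS-wide limit; it is not the IT-API or RNICPI interface. All names (`VerbsProvider`, `AccessLayer`, `register_provider`, `create_qp`, and the two toy providers) are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Dict


class VerbsProvider(ABC):
    """Hypothetical provider-specific interface (stands in for a uVP/kVP)."""

    transport = "unknown"

    @abstractmethod
    def create_qp(self) -> str:
        """Create a provider-specific queue pair and return a handle."""


class AccessLayer:
    """Hypothetical generic layer (stands in for a uAL/kAL)."""

    def __init__(self, max_qps: int) -> None:
        self._providers: Dict[str, VerbsProvider] = {}
        self._max_qps = max_qps   # OS-wide limit, enforced once, here
        self._qps_in_use = 0

    def register_provider(self, name: str, provider: VerbsProvider) -> None:
        """Providers plug in through this single entry point."""
        if name in self._providers:
            raise ValueError(f"provider {name!r} already registered")
        self._providers[name] = provider

    def create_qp(self, name: str) -> str:
        """Apply generic resource accounting, then delegate to the provider."""
        if self._qps_in_use >= self._max_qps:
            raise RuntimeError("OS-wide queue pair limit reached")
        handle = self._providers[name].create_qp()
        self._qps_in_use += 1
        return handle


class ToyIwarpProvider(VerbsProvider):
    """Toy provider standing in for an iWARP RNIC driver."""

    transport = "iWARP"

    def __init__(self) -> None:
        self._count = 0

    def create_qp(self) -> str:
        self._count += 1
        return f"iwarp-qp-{self._count}"


class ToyInfiniBandProvider(VerbsProvider):
    """Toy provider standing in for an InfiniBand adapter driver."""

    transport = "InfiniBand"

    def __init__(self) -> None:
        self._count = 0

    def create_qp(self) -> str:
        self._count += 1
        return f"ib-qp-{self._count}"
```

In this model the resource limit lives only in `AccessLayer`, so neither toy provider duplicates it. Registering both providers with `AccessLayer(max_qps=2)` and creating one queue pair on each succeeds, while a third request is rejected by the generic layer regardless of which provider it targets.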

We are currently developing open-source software components for the RDMA host software infrastructure on Linux, initially focusing on memory management issues such as pinning and on the iWARP transport with its special requirements for connection management.

We are also working on SoftRDMA, a software implementation of the IETF's iWARP protocol stack (RDMAP/DDP/MPA). SoftRDMA enables RDMA on clients that have no RDMA hardware, which in turn supports servers that rely on RDMA for performance, and it allows RNICs to be tested for iWARP protocol conformance.
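
To give a flavor of what a software implementation of the lowest iWARP layer has to do, the following is a minimal sketch of MPA framing. It is not SoftRDMA code. It assumes the basic framed-PDU layout of the MPA specification (a 16-bit payload length, the payload, zero to three pad bytes to reach 4-byte alignment, and a CRC32c trailer), omits MPA markers and connection setup entirely, and assumes network byte order for the CRC field. The function names `crc32c`, `mpa_frame` and `mpa_unframe` are hypothetical.

```python
import struct


def crc32c(data: bytes) -> int:
    """Bitwise CRC32c (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def mpa_frame(ulpdu: bytes) -> bytes:
    """Wrap one upper-layer PDU: length, payload, padding, CRC."""
    if len(ulpdu) > 0xFFFF:
        raise ValueError("payload does not fit a 16-bit length field")
    header = struct.pack("!H", len(ulpdu))
    pad = b"\x00" * (-(len(header) + len(ulpdu)) % 4)  # align to 4 bytes
    body = header + ulpdu + pad
    # Assumption: the CRC covers header, payload and padding.
    return body + struct.pack("!I", crc32c(body))


def mpa_unframe(fpdu: bytes) -> bytes:
    """Check one framed PDU and return its payload."""
    if len(fpdu) < 8 or len(fpdu) % 4 != 0:
        raise ValueError("malformed frame length")
    body, trailer = fpdu[:-4], fpdu[-4:]
    (expected,) = struct.unpack("!I", trailer)
    if crc32c(body) != expected:
        raise ValueError("CRC mismatch")
    (length,) = struct.unpack("!H", body[:2])
    if 2 + length > len(body) or len(body) - (2 + length) > 3:
        raise ValueError("length field inconsistent with frame size")
    return body[2:2 + length]
```

A 5-byte payload yields a 12-byte frame (2 bytes of length, 5 of payload, 1 of padding, 4 of CRC), and `mpa_unframe(mpa_frame(payload))` returns the original payload. A table-driven or hardware-assisted CRC would be the natural choice where throughput matters; the bitwise loop is used here only for brevity.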
