Master’s student or intern

Elastic ephemeral storage services for serverless computing

Ref. 2020_029

Serverless computing is a cloud-computing execution model in which the cloud provider dynamically manages the allocation of machine resources. As a cloud service, it is becoming increasingly popular due to its high elasticity and fine-grained billing. Serverless platforms like AWS Lambda, Google Cloud Functions, IBM Cloud Functions, or Azure Functions enable users to quickly launch thousands of lightweight tasks (as opposed to entire virtual machines), while automatically scaling compute, storage, and memory according to application demands at millisecond granularity. While serverless platforms were originally developed for web microservices, their elasticity makes them appealing for a wider range of applications, such as interactive analytics and machine learning.

For serverless computing to serve any type of workload efficiently, including complex multi-stage computations, efficient handling of intermediate (ephemeral) data is key. Current solutions rely on key-value stores like Redis, which typically cannot auto-scale their resource consumption. To overcome this limitation, we aim to extend a serverless framework (Knative) with a highly elastic, high-performance data store for ephemeral data. We have chosen the Apache Crail data store and are working on adding resource elasticity capabilities and integrating it with Knative.

Our Apache Crail-based prototype already supports dynamically adding and removing storage resources according to a serverless application's current demand. To allow scaling, it currently implements its own resource scaler, which decides whether to add or remove Crail data nodes. In a Knative environment, Apache Crail starts and terminates Kubernetes pods accordingly. In the project proposed here, we would like to explore other autoscaling techniques. One possible approach is to run Apache Crail directly as a Knative service, define custom resource definitions (CRDs) to monitor memory consumption, and let the autoscaler add and remove data nodes based on these CRDs.
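
To make the idea of memory-driven scaling concrete, the following minimal Python sketch computes how many data nodes would keep memory utilization near a target value. It is an illustration only and does not describe the scaler in our prototype: the function name, its parameters, and the default values are hypothetical, and the proportional rule is borrowed from the formula used by the Kubernetes Horizontal Pod Autoscaler.

```python
import math


def desired_datanodes(current_nodes, used_bytes, capacity_per_node_bytes,
                      target_utilization=0.7, min_nodes=1, max_nodes=64):
    """Return a data node count that keeps memory utilization near the target.

    Hypothetical helper, not part of Apache Crail or Knative. It applies the
    proportional rule desired = ceil(current * utilization / target) and then
    clamps the result to [min_nodes, max_nodes].
    """
    if current_nodes < 1 or capacity_per_node_bytes <= 0:
        raise ValueError("need at least one node with positive capacity")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")

    # Fraction of the total provisioned memory that is currently in use.
    utilization = used_bytes / (current_nodes * capacity_per_node_bytes)

    # Scale the node count in proportion to how far utilization is from target.
    desired = math.ceil(current_nodes * utilization / target_utilization)

    # Never drop below the minimum or exceed the maximum pool size.
    return max(min_nodes, min(max_nodes, desired))


if __name__ == "__main__":
    GiB = 2 ** 30
    # 4 nodes of 1 GiB each holding 3 GiB of data, with a 50% target.
    print(desired_datanodes(4, 3 * GiB, 1 * GiB, target_utilization=0.5))
```

In a real deployment the utilization figure would come from metrics exposed by the data nodes, and the decision would be applied by the platform's autoscaler rather than by application code.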

We are inviting applications from students to conduct their Master’s thesis work or an internship project at the IBM Research Europe lab in Zurich on this exciting new topic. The research focus will be on exploring techniques for efficient autoscaling of ephemeral data store services in a serverless environment. The work also involves interaction with several researchers focusing on various aspects of the project. The ideal candidate should be well versed in distributed systems and have strong programming skills (Java, C++, Python). Hands-on experience with distributed container orchestration systems (Kubernetes) and serverless environments (Knative) would be desirable.


IBM is committed to diversity at the workplace. With us you will find an open, multicultural environment. Excellent flexible working arrangements enable all genders to strike the desired balance between their professional development and their personal lives.

How to apply

If you are interested in this challenging position on an exciting new topic, please submit your most recent curriculum vitae including a transcript of grades.


Questions? Feel free to get in touch with
Dr. Nikolas Ioannou, Manager Cloud Data Platforms,
Dr. Bernard Metzler,

Apache Crail