Internship position

Scalable Knowledge Ingestion

Ref. 2021_018

Role Description

The IBM Research Europe Laboratory in Zurich is leading the development of large-scale document ingestion [1] and knowledge representation [2] platforms that run natively in the cloud. These platforms are used by Fortune 500 companies to analyse their proprietary documents and accelerate their innovation. The platforms use award-winning AI algorithms [3] and graph analytics [4] to give our customers a technological advantage.

Currently, we are building demonstration services in the domains of Material Science and Business Intelligence. If you are interested in these fields and would like to contribute to our platform (e.g. by creating novel AI algorithms, scaling cloud applications or building out the demonstrations), we would be pleased to hear from you.

The candidate will work at the IBM Research Laboratory near Zurich, in the Scalable Knowledge Ingestion group, for six months. The internship offers the opportunity to work in a unique corporate environment, acquire experience in several areas, publish at top international conferences and learn how to innovate in technology.

Core activities

Our group is a diverse team with a wide range of backgrounds and technical skills. As such, the intern can choose to work on any of the following:

  1. Develop or contribute to state-of-the-art AI algorithms for document conversion or Natural Language Processing.
  2. Define and develop our demonstration services in the areas of Material Science and Business Intelligence.
  3. Optimize our cloud back-end for improved scaling and throughput.

Requirements


  • Fluency in Python
  • Experience with backend design is a plus
  • A background in Computer Science, Material Science (Physics/Chemistry) or Econometrics is highly appreciated
  • Fluent spoken and written English


IBM is committed to diversity in the workplace. With us you will find an open, multicultural environment. Excellent, flexible working arrangements enable all genders to strike the desired balance between their professional development and their personal lives.

How to apply

If any of the core activities interest you, please submit your CV and a cover letter.

[1] Corpus Conversion Service: A Machine Learning Platform to Ingest Documents at Scale [KDD, 2018]
[2] Corpus Processing Service: A Knowledge Graph Platform to perform deep data exploration on corpora [Applied AI Letters, 2020]
[3] Robust PDF Document Conversion Using Recurrent Neural Networks [IAAI, 2021, IAAI 'Innovative Application' Award]
[4] Stochastic Matrix-Function Estimators: Scalable Big-Data Kernels with High Performance [IPDPS, 2016, Best Paper Award]