Student Projects

The projects offered below can be converted from a Semester Project to an MSc Project (or vice versa). Simply write to us about your expectations, or propose your own project.

MSc Project on Transformer Networks for Document Retrieval

We offer an MSc thesis project on document retrieval. The goal is to apply state-of-the-art text embedding methods such as ELMo (Peters et al., 2018) and BERT (Devlin et al., 2018) to a nearest-neighbor document retrieval problem and to evaluate the retrieval performance on a large corpus of open-access papers. You will also evaluate other methods such as ULMFiT and compare the results against strong baselines such as Smooth Inverse Frequency, Concatenated Power Mean, and Sparse Vector Densification.
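
To make the nearest-neighbor setting concrete, here is a minimal sketch in plain Python. It assumes each document and each query has already been turned into a fixed-length vector by some embedding method; the toy vectors, the function names `cosine_similarity`, `retrieve`, and `recall_at_k`, and the choice of recall@k as the evaluation metric are all illustrative assumptions, not part of the project specification. On a large corpus one would replace the loop with a vectorized or approximate nearest-neighbor search.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, doc_vecs, k=3):
    """Return (doc_index, score) pairs for the k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, d), i) for i, d in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: -pair[0])  # highest similarity first
    return [(i, s) for s, i in scored[:k]]


def recall_at_k(retrieved_ids, relevant_ids):
    """Fraction of the relevant documents that appear among the retrieved ones."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(relevant & set(retrieved_ids)) / len(relevant)


# Hypothetical 4-dimensional "embeddings"; real ones would come from an embedding model.
doc_embeddings = [
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]
query_embedding = [1.0, 0.0, 0.0, 0.0]

results = retrieve(query_embedding, doc_embeddings, k=2)
retrieved = [doc_id for doc_id, _ in results]
print(retrieved, recall_at_k(retrieved, relevant_ids=[0, 2]))
```

Cosine similarity is used here because it ignores vector length, which is a common choice for comparing text embeddings; other distance measures would fit the same structure.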

You will learn about state-of-the-art methods in Natural Language Processing and about designing and fine-tuning state-of-the-art machine learning models on ETH's Leonhard computer cluster.

Contact: Nianlong Gu (


Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. arXiv preprint arXiv:1802.05365.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.

Semester Project on Named Entity Recognition

Your task is to download the list of research institutions from, and to perform entity linking with our database of open-access publications: associate each author of a publication with the correct institution in the list. In this project, you will learn about disambiguation methods in Natural Language Processing.
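
As a rough illustration of the linking step, the sketch below matches a free-text affiliation string against a list of institution names using approximate string matching from Python's standard `difflib` module. The institution list, the function names `normalize` and `link_affiliation`, and the 0.6 threshold are hypothetical and chosen only for the example; a real solution would likely need stronger disambiguation than string similarity alone.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, replace commas with spaces, and collapse repeated whitespace."""
    return " ".join(name.lower().replace(",", " ").split())


def link_affiliation(affiliation, institutions, threshold=0.6):
    """Return (best matching institution or None, similarity score)."""
    query = normalize(affiliation)
    best_name, best_score = None, 0.0
    for institution in institutions:
        score = SequenceMatcher(None, query, normalize(institution)).ratio()
        if score > best_score:
            best_name, best_score = institution, score
    if best_score < threshold:
        return None, best_score  # no candidate is similar enough to link
    return best_name, best_score


# Hypothetical institution list; the real list would come from the downloaded source.
institutions = ["ETH Zurich", "University of Zurich", "University of Oxford"]

match, score = link_affiliation("Univ. of Zurich", institutions)
print(match, round(score, 2))
```

Returning `None` below the threshold keeps uncertain cases unlinked instead of forcing a wrong institution onto an author.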

Contact: Onur Gokce (