We are looking for Master’s students!

We are currently looking for Master’s students in the fields of computer science, statistics, or bioinformatics to work on the following topic:

Developing generative models for structured data such as molecules, proteins, genetics, and graphs.

Structured data are everywhere: text, sequences, graphs, molecules, proteins, genetics, and many others. Many real-world problems can be formulated as generating structured data, for instance generating molecules with specific properties for drug design, or designing antibodies.

There has been a surge of research applying machine learning models to molecule and protein design. We aim to explore various generative models, such as VAEs, autoregressive models, transformers, diffusion models, and score-based generative models, for the goal-directed generation of structured data. We are looking for Master’s students who are committed to learning about generative models and their use cases for structured data. Candidates should have basic knowledge of machine learning and mathematics, experience coding with PyTorch or TensorFlow, and an interest in doing solid research.

Depending on the student's experience and interests, we offer projects focusing on:

  • Developing machine learning models for drug design [1,2,3]
  • Developing machine learning models for protein design
  • Exploring diffusion models for structured data generation
  • Developing benchmarks for genetic tools from open-source papers [4]
  • Any related application of generative models for structured data you are interested in

You can apply by sending a CV to this e-mail along with a short description of your motivation to join our lab.

For CBB Master’s students at ETH, please check this form.

For other students at ETH: you need to be enrolled at UZH as a mobility student, see here. If you have any questions regarding ETH regulations, please contact the Student Exchange Office: Dr. Francesca Broggi-Wüthrich, francesca.broggi@akd.ethz.ch, Tel +41 44 632 43 46.


[1] Amina Mollaysa, Brooks Paige, Alexandros Kalousis, Goal-directed generation of discrete structures with conditional generative models (https://proceedings.neurips.cc/paper_files/paper/2020/file/f9b9f0fef2274a6b7009b5d52f44a3b6-Paper.pdf)

[2] Amina Mollaysa, Brooks Paige, Alexandros Kalousis, Conditional generation of molecules from disentangled representations (https://openreview.net/pdf?id=BkxthxHYvr)

[3] R. Gomez-Bombarelli, D. K. Duvenaud, J. M. Hernandez-Lobato, et al., Automatic chemical design using a data-driven continuous representation of molecules (https://pubs.acs.org/doi/epdf/10.1021/acscentsci.7b00572)

[4] Predicting base editing outcomes with an attention-based deep learning algorithm trained on high-throughput target library screens (https://www.nature.com/articles/s41467-021-25375-z)

[5] Hanjun Dai, Yingtao Tian, Bo Dai, Steven Skiena, Le Song, Syntax-directed variational autoencoder for structured data (https://arxiv.org/pdf/1802.08786.pdf)

Michael Krauthammer

Interested in (a) moving data-driven solutions into patient care and (b) knowledge discovery from big biomedical data sources