Dr. Sophie Burkhardt

News: I am starting my own junior research group and am hiring PhD students and student assistants. Please contact me!

This is an excellent opportunity to get some research experience and learn a lot in the area of machine learning on text data. Depending on your level of experience and prior knowledge, specific tasks will vary.

Contact: Dr. Sophie Burkhardt, burkhardt (at) informatik.uni-mainz.de

I am starting my own junior research group in the coming weeks or months. Until then, I am a postdoc in the Data Mining Group at the Johannes Gutenberg University of Mainz. My PhD thesis was on multi-label topic models, and my current research interests are variational autoencoders, probabilistic topic models, Bayesian nonparametrics, text classification, multi-label classification, online learning, active learning, and drift detection.

Short Scientific CV

07/2020 - Head of junior research group "Semantic Disentanglement"
01/2020 ISCB and DAAD travel award for PSB conference
05/2019 Dissertation award of the department for best PhD thesis
since 11/2017 Research associate, University of Mainz
09/2017 DAAD travel stipend for ECML/PKDD
03/2014 – 09/2018 Ph.D. student, University of Mainz
10/2013 – 10/2017 PRIME Research scholarship
07/2012 – 01/2013 Scholarship from University of Mainz for writing the final thesis
08/2010 – 06/2011 DAAD scholarship for year abroad at University of Sussex
04/2008 – 04/2013 Study of Philosophy and Computer Science, University of Mainz



Reviewing

  • AISTATS 2020, ICML 2020, ECML/PKDD 2020
  • IEEE International Conference on Data Mining (ICDM 2014, 2016, 2017, 2019)
  • International Conference on Data Sciences and Advanced Analytics (DSAA 2014)
  • European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD 2015, 2017)
  • ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2014, 2015, 2016, 2019)
  • Machine Learning Journal (2016)


Teaching

  • Probabilistic Graphical Models and Deep Learning (2019)
  • Computational Logic (2019)
  • Software Engineering (2018)
  • Machine Learning Lab Course (2018)
  • Data Mining Lab Course (2018-2019)
  • Computational Logic (2018)
  • Software Engineering, Data Mining (2017)
  • Machine Learning Lab Course (2017-2019)
  • Machine Learning Seminar (2014-2019)
  • Data Mining Seminar (2013-2019)
  • Tutor in Programming Languages (2013, 2015)


Publications

Burkhardt, S., Siekiera, J., Glodde, J., Andrade-Navarro, M., Kramer, S. (2020) Towards identifying drug side effects from social media using active learning and crowd sourcing. In: Pacific Symposium on Biocomputing (PSB), accepted.

Burkhardt, S., Kramer, S. (2019) A Survey of Multi-Label Topic Models. In: SIGKDD Explorations.

Burkhardt, S., Kramer, S. (2019) Decoupling Sparsity and Smoothness in the Dirichlet Variational Autoencoder Topic Model. In: Journal of Machine Learning Research 20.131, pp. 1-27.

Code available at https://github.com/sophieburkhardt/dirichlet-vae-topic-models

Burkhardt, S., Wagner, N. and Kramer, S. (2019) Extracting Rules with Adaptable Complexity from Neural Networks using K-Term DNF Optimization. In: DeCoDeML Workshop at ECML/PKDD.

Wagner, N., Burkhardt, S. and Kramer, S. (2019) A Deep Convolutional DNF Learner. In: DeCoDeML Workshop at ECML/PKDD.

Burkhardt, S., Siekiera, J. and Kramer, S. (2018) Semi-Supervised Bayesian Active Learning for Text Classification. In: Bayesian Deep Learning Workshop at NeurIPS.

Burkhardt, S. and Kramer, S. (2018) Multi-label Classification Using Stacked Hierarchical Dirichlet Processes with Reduced Sampling Complexity. In: Knowledge and Information Systems, pp. 1-23.

Burkhardt, S. and Kramer, S. (2018) Online Multi-Label Dependency Topic Models for Text Classification. In: Machine Learning 107.5, pp. 859-886.

Code available at https://github.com/sophieburkhardt/Multi-Label-Topic-Modeling

Ahmadi, Z., Burkhardt, S., Kramer, S. (2017) Online Topic Modeling: Keeping Track of News Topics for Social Good. In: Proceedings of the 2nd Workshop on Data Science for Social Good (SoGood 2017) at ECML-PKDD.

Burkhardt, S. and Kramer, S. (2017) Online Sparse Collapsed Hybrid Variational-Gibbs Algorithm for Hierarchical Dirichlet Process Topic Models. In: ECML-PKDD. Skopje, Macedonia, pp. 189-204.

supplement for ECML 2017 paper

Code available at https://github.com/kramerlab/HybridHDP

Burkhardt, S. and Kramer, S. (2017) Multi-label Classification Using Stacked Hierarchical Dirichlet Processes with Reduced Sampling Complexity. In: Proceedings of the 8th IEEE International Conference on Big Knowledge, Ed.: Xindong Wu, Tamer Ozsu, Jim Hendler and Ruqian Lu. Hefei, China, pp. 1-8. (best student paper)

Burkhardt, S. and Kramer, S. (2015) On the Spectrum between Binary Relevance and Classifier Chains in Multi-Label Classification. In: ACM SAC. Salamanca, Spain.


Room: 03-621

Johannes Gutenberg-Universität Mainz
Institut für Informatik
Staudingerweg 9
55128 Mainz, Germany

E-Mail: soburkha [at] uni [dash] mainz [dot] de

(For teaching-related inquiries, please write to our group account: datamining [at] uni-mainz.de)
Office Phone: +49-6131-39-21059