Allgemein

News and events in Q3/2024

Current activities:

  • On August 22, 2024, Prof. Stefan Kramer was appointed "KI-Lotse" (roughly translated as "AI guide") for the life sciences by Alexander Schweitzer, the Prime Minister of the State of Rhineland-Palatinate. In his new role, Prof. Kramer will be advising partners from either AI or the life sciences regarding collaborations and the state government regarding the interface between the two disciplines. Congratulations and good luck with his plans to further strengthen the interaction between AI and the life sciences!

 

Gutenberg Workshop on AI for Scientific Discovery:

  • From 2 to 4 September, the Gutenberg Workshop on AI for Scientific Discovery will take place at Weingut Wasem with an excellent line-up of invited speakers. The workshop is scientifically organized by Professor Peter Baumann and Professor Stefan Kramer of JGU Mainz. It will explore automated scientific discovery driven by artificial intelligence. Since its inception, AI research has been deeply intertwined with the pursuit of scientific insights, a synergy that has gained remarkable momentum since the late 1970s. Fueled by advances across various domains, AI now stands at the forefront of reshaping scientific inquiry. The workshop focuses on cutting-edge topics such as automation and autonomy in science, deep learning and foundation models for science, and the automated discovery of interpretable scientific knowledge. It is organized around the following four themes:
    • Automation and autonomy in science
    • Applications of AI
    • Equation discovery, symbolic regression, and the induction of process models
    • Integration effort
  • The list of speakers includes some of the pioneers of the field:
    • Pat Langley (who started the field of AI for Scientific Discovery and wrote the groundbreaking book about it together with Herbert Simon, the only person to win both a Nobel and a Turing award)
    • Burkhard Rost (the first person to apply neural networks -- together with alignments -- successfully to protein data, to achieve the first breakthrough in secondary structure prediction)
    • Ross D. King (the first person to build a completely autonomous robot scientist)
    • Sašo Džeroski (who established equation discovery as a field and achieved a major breakthrough by employing context-free grammars for that purpose)
  • Additionally, some of our group members and associates will present their current research:
    • Jannis Brugger will show how equation discovery can profit from a supervised learning setting instead of a reinforcement one. Furthermore, he will highlight the important task of how to embed tabular data.
    • Mattia Cerrato developed, together with his seminar students, a testbed for AI-driven scientific discovery called Science-Gym. The benchmark fosters physical understanding of the tasks by having agents autonomously perform data collection, experimental design, and equation discovery.
    • Cedric Cerstoff worked on an extension for Monte-Carlo tree search that allows already explored subtrees or leaves to be excluded, resulting in a broader search with the same computational resources.
    • Marius Köppel will highlight the use of AI in particle physics in the past, discuss current capabilities, and explore future directions.

 

Current Publications and Presentations:

  • Derian Boer will present Harnessing the Power of Semi-Structured Knowledge and LLMs with Triplet-Based Prefiltering for Question Answering, which is a joint work with Fabian Koch and Stefan Kramer, at IJCLR'24.
  • Lukas Pensel will present Neural RELAGGS, which is a joint work with Stefan Kramer, also at IJCLR'24.

 

Several of our group members take part in Discovery Science 2024:

  • Kirsten Köbschall will present Soft Hoeffding Tree: A Transparent and Differentiable Model on Data Streams, which is a joint work with Lisa Hartung and Stefan Kramer.
  • Mattia Cerrato will present Science-Gym: A Simple Testbed for AI-driven Scientific Discovery, which is a joint work with Nicholas Schmitt, Lennart Baur, Edward Finkelstein, Selina Jukic, Lars Münzel, Felix Peter Paul, Pascal Pfannes, Benedikt Rohr, Julius Schellenberg, Philipp Wolf, and Stefan Kramer.
  • Jannis Brugger will present Residuals for Equation Discovery, which is a joint work with Viktor Pfanschilling, Mira Mezini and Stefan Kramer.

 

Recent Events:

  • On August 1, the kickoff meeting for our upcoming project took place: Medical AI combining Natural products and CEllular Imaging (MAINCE). MAINCE will use AI approaches to identify new and urgently needed therapeutics in immunology. Insights into the effects of these therapeutics, obtained through cutting-edge imaging techniques, will be combined with lab experiments using AI to accelerate drug development and make it more efficient.
  • From 1-3 July we hosted the Third European Workshop on Algorithmic Fairness (EWAF’24) in Mainz, organized by Mattia Cerrato and Alesia Vallenas Coronel. The Workshop created a unique platform for researchers from academia and industry working on algorithmic fairness in the context of Europe’s legal and societal framework.
Posted on | Posted in Allgemein

KI-Lotse of the State of Rhineland-Palatinate

On August 22, 2024, Prof. Stefan Kramer was appointed "KI-Lotse" (roughly translated as "AI guide") for the life sciences by Alexander Schweitzer, the Prime Minister of the State of Rhineland-Palatinate. In his new role, Prof. Kramer will be advising partners from either AI or the life sciences regarding collaborations and the state government regarding the interface between the two disciplines. Congratulations and good luck with his plans to further strengthen the interaction between AI and the life sciences!

Press Release


News and events in Q2/2024

Upcoming events:

  • The lecture series (Ringvorlesung) "Transparenz, Datensicherheit & Co.: Umsetzbare Anforderungen an KI-Systeme?" will close on the 18th of June with a concluding lecture and outlook by Prof. Stefan Kramer, followed by a presentation of the practical perspective by Prof. Steffen Staab, University of Stuttgart.
  • On the 20th of June, the summer party of the Institute of Computer Science will take place on the meadows in front of the institute building (Staudingerweg 9, 55128 Mainz).

Current activities:

  • We warmly welcome Federico Peiretti from the University of Turin, Italy, as a visiting PhD student in the data mining group from the 1st of May until the 31st of August.

Current Presentations and Publications

  • The Ringvorlesung "Transparenz, Datensicherheit & Co.: Umsetzbare Anforderungen an KI-Systeme?", organised by the TOPML project, started this semester and was opened on the 30th of April with a talk by Prof. Stefan Kramer. He highlighted the necessity of non-functional requirements for AI systems as well as the practical perspective and research approach of TOPML. On the 14th of May, Dr. Mattia Cerrato gave a lecture illustrating fairness in AI with various practical and theoretical examples.
    In addition, the following lectures were presented:

  • Our paper Differentially Private Sum-Product Networks, which is a joint work of Xenia Heilmann, Mattia Cerrato and Ernst Althaus, has been accepted at the 41st International Conference On Machine Learning (ICML), Vienna, Austria.

We enjoyed participating in the following past events:

  • On the 23rd of April, Prof. Stefan Kramer and Prof. Katharina von der Wense were the AI experts at the round table in the Kakadu bar, Mainz, following the science fiction opera Humanoid in the state theatre of Mainz, Germany. Humanoid narrates the relationship between humans and technology and the discussion afterwards went beyond questions such as "Can artificial intelligence develop consciousness?", "Where is the boundary between human and machine?" and "Do only humans have the right to a self-determined existence?".
  • Just like every year, we happily welcomed curious and interested girls to another exciting Girls' Day with "Abenteuer Informatik: Informatik wörtlich begreifen" on the 25th of April. With organizational support from the Ada Lovelace project, we were able to offer an interesting insight into the world of computer science.
  • On the 29th and 30th of April, the workshop "KI als Chance oder Risiko: von Grundlagen und Anwendungen bis hin zu Herausforderungen" was successfully held by PhD students from the TOPML project and organized with Q+. In this hands-on workshop, practical examples were used to teach the basics of AI and to discuss the current opportunities and risks associated with its application.

Other highlights:

  • On the 3rd of May we welcomed Blake VanBerlo from the University of Waterloo, Canada, for great discussions.
  • We were pleased to welcome Dr. Stefan Kuhn, University of Tartu, Estonia, from June 3rd to 9th.

Our paper "Differentially Private Sum-Product Networks" has been accepted at the International Conference On Machine Learning

Our paper Differentially Private Sum-Product Networks, which is a joint work of Xenia Heilmann, Mattia Cerrato and Ernst Althaus, has been accepted at the 41st International Conference On Machine Learning (ICML), Vienna, Austria.

Abstract

Differentially private ML approaches seek to learn models which may be publicly released while guaranteeing that the input data is kept private. One issue with this construction is that further model releases based on the same training data (e.g. for a new task) incur a further privacy budget cost. Privacy-preserving synthetic data generation is one possible solution to this conundrum. However, models trained on synthetic private data struggle to approach the performance of private, ad-hoc models. In this paper, we present a novel method based on sum-product networks that is able to perform both privacy-preserving classification and privacy-preserving data generation with a single model. To the best of our knowledge, ours is the first approach that provides both discriminative and generative capabilities to differentially private ML. We show that our approach outperforms the state of the art in terms of stability (i.e. number of training runs required for convergence) and utility of the generated data.

News and events in Q1/2024

Upcoming Events: Third European Workshop on Algorithmic Fairness

  • From July 1 to July 3, 2024, we will be hosting the Third European Workshop on Algorithmic Fairness (EWAF’24) in Mainz, organized by members of the chair and associated members. The workshop creates a unique platform for researchers from academia and industry working on algorithmic fairness in the context of Europe’s legal and societal framework. The objective is to promote discussions within an interdisciplinary setting. Interdisciplinary submissions from computer science, law, sociology, and philosophy, as well as on EU-specific topics related to algorithmic fairness, are welcome. The submission deadline is March 15, 2024, via EasyChair. The workshop will feature keynotes by, amongst others, Virginia Dignum and Isabel Valera.

Current Publications and Presentations

Recent Events: curAMet and Retreat of Data Mining Chair

  • To discuss the latest progress and future collaborations, all curATime partners, including members of the data mining group, met on February 6 for the first cluster conference at Rheingoldhalle Mainz. Posters were presented, and Prof. Kramer was invited to give a keynote.
  • On February 9, 2024, the data mining chair spent a day of interesting talks at the Gutenberg Digital Hub in Mainz to share current work and ideas within the research group.

News and events in Q4/2023

Starting with this article, we are transitioning our public news reporting to a quarterly overview format. This editorial shift aims to provide a more extensive perspective on our activities. We will continue to refer to relevant articles for further details. Nevertheless, we encourage you to reach out to our group members for context, suggestions, and in-depth questions.

We are pleased to announce the following upcoming talk:

  • On the 18th of December, Prof. Stefan Kramer will lecture on AI for sustainability: "KI für Nachhaltigkeit: Ein kleiner, technologischer Beitrag". The lecture, held in German, is part of the series Voices for Climate, in which Luisa Neubauer gave the first talk.

We enjoyed participating in the following inspiring past events:

  • Workshop on the Nobel Turing Grand Challenge: We discussed with the Nobel laureate Arieh Warshel and Turing prize winner Ed Feigenbaum and several other luminaries from AI and other fields how to advance the use of AI in research to tackle the Nobel Turing Grand Challenge: creating AI systems capable of making Nobel quality discoveries comparable to the best scientific breakthroughs.
  • On the 30th of November, during a symposium on AI at the University of Mainz, Prof. Stefan Kramer took part in a panel discussion on AI in science and society with Prof. Christoph Bläsi and Prof. Paul Czodrowski. In the afternoon, he gave an introduction to AI in scientific discovery. The symposium took place on the 22nd and 30th of November in German and was organized by the KI-Projektbüro Rheinland-Pfalz under the heading "KI@JGU".
  • Successful Ph.D. defense of Staffan Arvidsson McShane at Uppsala University with the title Confidence Predictions in Pharmaceutical Science. Supervisor Prof. Ola Spjuth summarizes: “The thesis applied machine learning and conformal prediction in various settings for prediction of drug metabolism, when transitioning between biological assays, and for monitoring disease progression”.
  • Successful Ph.D. defense of Shuyi Yang at the University of Turin on inductive, fair and private semi-supervised learning.
  • Talk JuGSo: Künstliche Intelligenz – Segen oder Fluch? by Prof. Stefan Kramer, hosted by the Social Democratic Party Wiesbaden on November 14th.
  • We organized the first interdisciplinary workshop on ML, law and society at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2023) from 18th to 22nd of September.

Furthermore, funding for two projects with our contributions has been granted by Carl Zeiss Stiftung recently:

  • MAINCE: Medical AI combining Natural products and CEllular Imaging. MAINCE will use AI approaches to identify new and much-needed therapeutics in immunology. Clues to the effect of therapeutics through state-of-the-art imaging techniques will be linked by (amongst others) generative AI to laboratory experiments to accelerate drug development and make it more efficient. Linking natural products with cell painting holds the promise for the discovery of novel drugs, powered by cutting edge AI.
  • Multi-dimensionAI: linking scales of information to improve care for patients with heart failure. The project team is researching the treatment of heart failure using a combination of innovative AI approaches and robotics. Comprehensive health data will be used to train an AI that will identify new causal factors for heart failure and incorporate them into the therapy.

Other highlights:


Our paper "Discriminative machine learning for maximal representative subsampling" has been accepted at Scientific Reports

Our paper Discriminative machine learning for maximal representative subsampling, which is a joint work of Tony Hauptmann, Sophie Fellenz, Laksan Nathan, Oliver Tüscher and Stefan Kramer, has been accepted at Scientific Reports.

Abstract

Biased population samples pose a prevalent problem in the social sciences. Therefore, we present two novel methods that are based on positive-unlabeled learning to mitigate bias. Both methods leverage auxiliary information from a representative data set and train machine learning classifiers to determine the sample weights. The first method, named maximum representative subsampling (MRS), uses a classifier to iteratively remove instances, by assigning a sample weight of 0, from the biased data set until it aligns with the representative one. The second method is a variant of MRS – Soft-MRS – that iteratively adapts sample weights instead of removing samples completely. To assess the effectiveness of our approach, we induced artificial bias in a public census data set and examined the corrected estimates. We compare the performance of our methods against existing techniques, evaluating the ability of sample weights created with Soft-MRS or MRS to minimize differences and improve downstream classification tasks. Lastly, we demonstrate the applicability of the proposed methods in a real-world study of resilience research, exploring the influence of resilience on voting behavior. Through our work, we address the issue of bias in social science, amongst others, and provide a versatile methodology for bias reduction based on machine learning. Based on our experiments, we recommend to use MRS for downstream classification tasks and Soft-MRS for downstream tasks where the relative bias of the dependent variable is relevant.
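
The iterative removal step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the one-feature Gaussian classifier, the in-sample AUC stopping rule, and all names and parameters (fit_scorer, auc, mrs_sketch, drop_per_round, auc_tolerance, min_size) are assumptions made for illustration only. The sketch repeatedly fits a classifier that separates the biased sample from the representative one and assigns weight 0 to the instances that look most "biased", until the two samples are hard to tell apart.

```python
import random
from math import log, pi
from statistics import fmean, pvariance


def fit_scorer(biased, representative):
    """Stand-in classifier (assumption): Gaussian fit per sample on one feature.

    Returns a function giving the log-odds that a value comes from the
    biased sample rather than the representative one (equal priors assumed).
    """
    mean_b, var_b = fmean(biased), pvariance(biased) + 1e-9
    mean_r, var_r = fmean(representative), pvariance(representative) + 1e-9

    def log_density(x, mean, var):
        return -0.5 * (log(2 * pi * var) + (x - mean) ** 2 / var)

    return lambda x: log_density(x, mean_b, var_b) - log_density(x, mean_r, var_r)


def auc(scores_biased, scores_representative):
    """Fraction of (biased, representative) pairs where the biased score is higher."""
    wins = sum(
        (b > r) + 0.5 * (b == r)
        for b in scores_biased
        for r in scores_representative
    )
    return wins / (len(scores_biased) * len(scores_representative))


def mrs_sketch(biased, representative, drop_per_round=5,
               auc_tolerance=0.55, min_size=10):
    """Return 0/1 sample weights for `biased` (hypothetical stopping rule)."""
    weights = [1] * len(biased)
    while sum(weights) > min_size:
        kept = [i for i, w in enumerate(weights) if w == 1]
        score = fit_scorer([biased[i] for i in kept], representative)
        separability = auc([score(biased[i]) for i in kept],
                           [score(x) for x in representative])
        if separability <= auc_tolerance:
            break  # the classifier can barely tell the samples apart
        # Assign weight 0 to the kept instances that look most "biased".
        kept.sort(key=lambda i: score(biased[i]), reverse=True)
        for i in kept[:drop_per_round]:
            weights[i] = 0
    return weights


# Synthetic demo data (assumption): the biased sample over-represents high values.
rng = random.Random(0)
representative = [rng.gauss(0.0, 1.0) for _ in range(200)]
biased = ([rng.gauss(0.0, 1.0) for _ in range(150)]
          + [rng.gauss(4.0, 0.5) for _ in range(100)])

weights = mrs_sketch(biased, representative)
kept = [x for x, w in zip(biased, weights) if w == 1]
print("mean gap before:", abs(fmean(biased) - fmean(representative)))
print("mean gap after: ", abs(fmean(kept) - fmean(representative)))
```

A Soft-MRS-style variant would, by the abstract's description, adapt the weights gradually instead of setting them to 0; how exactly the weights are updated is not stated here, so it is left out of the sketch.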


Our paper "Classifying Aircraft Categories from Magnetometry Data Using a Hypothesis-based Multi-Task Framework" has been accepted at the International Conference on Prestigious Applications of Intelligent Systems

Our paper "Classifying Aircraft Categories from Magnetometry Data Using a Hypothesis-based Multi-Task Framework", which is a joint work of Julian Vexler and Stefan Kramer, has been accepted at the 12th International Conference on Prestigious Applications of Intelligent Systems, PAIS 2023 (https://ecai2023.eu/ECAI2023-3), as a sister conference to ECAI 2023, the 26th European Conference on Artificial Intelligence. The paper is a result of an on-going project about the air-side integration of magnetometers for object classification.

Abstract

Airport traffic surveillance requires reliable safety systems to prevent accidents in safety-critical areas. This paper examines airport aprons, where existing holding point protection systems have shown that they are sometimes not able to prevent accidents. One possible solution to this problem is the use of innovative sensor technology such as magnetometers. These sensors can be used to measure the distortion of the earth's magnetic field by metallic objects. The main objective is to identify the geometrical pattern of a passing object by fusing coherent events, and classify it into a category based on its size. We propose a hypotheses-based multi-task framework for the classification of aircraft by making use of the estimated motion behaviour of a passing object. The framework includes statistical components, domain knowledge, and artificial intelligence solutions to infer the geometrical pattern and motion vector of an object from a predefined set of possible hypotheses. In future work, we aim to optimize the framework using synthetic and real-world data to increase its robustness and generalization ability to other airports.


We welcome our new group member Derian Boer!

Derian Boer has joined the Data Mining group and will be working in the Cluster for Atherothrombosis and Individualized Medicine (curATime), a future cluster funded by the BMBF in which powerful players from science and industry have joined forces to develop tailored treatment and prevention concepts for cardiovascular diseases and their clinical application.
