Allgemein

News and events in Q4/2023

Starting with this article, we are moving our public news reporting to a quarterly overview format. This editorial shift aims to give a broader perspective on our activities. We will continue to refer to relevant articles for further details. As always, we encourage you to reach out to our group members for context, suggestions, or in-depth questions.

We are pleased to announce the following upcoming talk:

  • On 18th of December, Prof. Stefan Kramer will lecture on AI for sustainability: “KI für Nachhaltigkeit: Ein kleiner, technologischer Beitrag” (“AI for Sustainability: A Small Technological Contribution”). The lecture, held in German, is part of the series Voices for Climate, in which Luisa Neubauer gave the first talk.

We enjoyed participating in the following inspiring past events:

  • Workshop on the Nobel Turing Grand Challenge: Together with Nobel laureate Arieh Warshel, Turing Award winner Ed Feigenbaum, and several other luminaries from AI and other fields, we discussed how to advance the use of AI in research to tackle the Nobel Turing Grand Challenge: creating AI systems capable of making Nobel-quality discoveries comparable to the best scientific breakthroughs.
  • On 30th of November, during a symposium on AI at the University of Mainz, Prof. Stefan Kramer took part in a panel discussion on AI in science and society with Prof. Christoph Bläsi and Prof. Paul Czodrowski. In the afternoon, he gave an introduction to AI in scientific discovery. The symposium as a whole took place on 22nd and 30th of November in German and was organized by the KI-Projektbüro Rheinland-Pfalz under the heading "KI@JGU".
  • Successful Ph.D. defense of Staffan Arvidsson McShane at Uppsala University with the title Confidence Predictions in Pharmaceutical Science. Supervisor Prof. Ola Spjuth summarizes: “The thesis applied machine learning and conformal prediction in various settings for prediction of drug metabolism, when transitioning between biological assays, and for monitoring disease progression”.
  • Successful Ph.D. defense of Shuyi Yang at the University of Turin on inductive, fair and private semi-supervised learning.
  • Talk JuGSo: Künstliche Intelligenz – Segen oder Fluch? ("Artificial Intelligence: Blessing or Curse?") by Prof. Stefan Kramer, hosted by the Social Democratic Party Wiesbaden on 14th of November.
  • We organized the first interdisciplinary workshop on ML, law and society at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2023) from 18th to 22nd of September.

Furthermore, the Carl Zeiss Stiftung has recently granted funding for two projects to which we contribute:

  • MAINCE: Medical AI combining Natural products and CEllular Imaging. MAINCE will use AI approaches to identify new and much-needed therapeutics in immunology. Clues to the effects of therapeutics, obtained through state-of-the-art imaging techniques, will be linked to laboratory experiments by generative AI, among other methods, to accelerate drug development and make it more efficient. Linking natural products with cell painting holds promise for the discovery of novel drugs, powered by cutting-edge AI.
  • Multi-dimensionAI: linking scales of information to improve care for patients with heart failure. The project team is researching the treatment of heart failure using a combination of innovative AI approaches and robotics. Comprehensive health data will be used to train an AI that will identify new causal factors for heart failure and incorporate them into the therapy.

Other highlights:

Posted on | Posted in Allgemein

Our paper "Discriminative machine learning for maximal representative subsampling" has been accepted at Scientific Reports

Our paper "Discriminative machine learning for maximal representative subsampling", which is a joint work of Tony Hauptmann, Sophie Fellenz, Laksan Nathan, Oliver Tüscher and Stefan Kramer, has been accepted at Scientific Reports.

Abstract

Biased population samples pose a prevalent problem in the social sciences. Therefore, we present two novel methods that are based on positive-unlabeled learning to mitigate bias. Both methods leverage auxiliary information from a representative data set and train machine learning classifiers to determine the sample weights. The first method, named maximum representative subsampling (MRS), uses a classifier to iteratively remove instances, by assigning a sample weight of 0, from the biased data set until it aligns with the representative one. The second method is a variant of MRS – Soft-MRS – that iteratively adapts sample weights instead of removing samples completely. To assess the effectiveness of our approach, we induced artificial bias in a public census data set and examined the corrected estimates. We compare the performance of our methods against existing techniques, evaluating the ability of sample weights created with Soft-MRS or MRS to minimize differences and improve downstream classification tasks. Lastly, we demonstrate the applicability of the proposed methods in a real-world study of resilience research, exploring the influence of resilience on voting behavior. Through our work, we address the issue of bias in social science, amongst others, and provide a versatile methodology for bias reduction based on machine learning. Based on our experiments, we recommend to use MRS for downstream classification tasks and Soft-MRS for downstream tasks where the relative bias of the dependent variable is relevant.
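
The iterative removal step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes a single numeric feature, a hand-written logistic regression as the classifier, and a stopping rule based on the AUC being close to 0.5. The names `mrs_sketch`, `drop_per_round` and `auc_tol` are hypothetical.

```python
import math


def sigmoid(t):
    """Numerically safe logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def fit_logistic(xs, ys, lr=0.1, epochs=500):
    """Fit p(y=1 | x) = sigmoid(w * x + b) with plain gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def auc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative."""
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def mrs_sketch(biased, representative, drop_per_round=2, auc_tol=0.05,
               max_rounds=50):
    """Return 0/1 sample weights for `biased` (0 = instance removed).

    Assumed stopping rule: stop once a classifier can no longer tell the
    kept biased instances from the representative ones (AUC near 0.5).
    """
    weights = [1] * len(biased)
    for _ in range(max_rounds):
        kept = [i for i, w in enumerate(weights) if w == 1]
        if len(kept) <= drop_per_round:
            break
        # Label kept biased instances 1 and representative instances 0.
        xs = [biased[i] for i in kept] + list(representative)
        ys = [1] * len(kept) + [0] * len(representative)
        w, b = fit_logistic(xs, ys)
        scores_b = [sigmoid(w * biased[i] + b) for i in kept]
        scores_r = [sigmoid(w * x + b) for x in representative]
        if abs(auc(scores_b, scores_r) - 0.5) <= auc_tol:
            break
        # Remove the instances that look most typical of the biased sample.
        ranked = sorted(zip(scores_b, kept), reverse=True)
        for _, i in ranked[:drop_per_round]:
            weights[i] = 0
    return weights
```

The Soft-MRS variant mentioned above would adapt the weights gradually instead of setting them to 0; that update is not shown here.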

Our paper "Classifying Aircraft Categories from Magnetometry Data Using a Hypothesis-based Multi-Task Framework" has been accepted at the International Conference on Prestigious Applications of Intelligent Systems

Our paper "Classifying Aircraft Categories from Magnetometry Data Using a Hypothesis-based Multi-Task Framework", which is a joint work of Julian Vexler and Stefan Kramer, has been accepted at the 12th International Conference on Prestigious Applications of Intelligent Systems, PAIS 2023 (https://ecai2023.eu/ECAI2023-3), a sister conference of ECAI 2023, the 26th European Conference on Artificial Intelligence. The paper is a result of an ongoing project on the air-side integration of magnetometers for object classification.

Abstract

Airport traffic surveillance requires reliable safety systems to prevent accidents in safety-critical areas. This paper examines airport aprons, where existing holding point protection systems have shown that they are sometimes not able to prevent accidents. One possible solution to this problem is the use of innovative sensor technology such as magnetometers. These sensors can be used to measure the distortion of the earth's magnetic field by metallic objects. The main objective is to identify the geometrical pattern of a passing object by fusing coherent events, and classify it into a category based on its size. We propose a hypotheses-based multi-task framework for the classification of aircraft by making use of the estimated motion behaviour of a passing object. The framework includes statistical components, domain knowledge, and artificial intelligence solutions to infer the geometrical pattern and motion vector of an object from a predefined set of possible hypotheses. In future work, we aim to optimize the framework using synthetic and real-world data to increase its robustness and generalization ability to other airports.

We welcome our new group member Derian Boer!

Derian Boer has joined the Data Mining group and will be working in the Cluster for Atherothrombosis and Individualized Medicine (curATime), a future cluster funded by the BMBF, where powerful players from science and industry have joined forces to develop tailored treatment and prevention concepts for cardiovascular diseases and their clinical application.

We welcome our new group member Håkan Lane!

Håkan Lane has joined the Data Mining group and will be working in the Cluster for Atherothrombosis and Individualized Medicine (curATime), a future cluster funded by the BMBF, where powerful players from science and industry have joined forces to develop tailored treatment and prevention concepts for cardiovascular diseases and their clinical application.

Our paper "Identifying Aircraft Motions and Patterns from Magnetometry Data Using a Knowledge-Based Multi-Fusion Approach" has been accepted at the International Conference on Information Fusion (FUSION)

Our paper "Identifying Aircraft Motions and Patterns from Magnetometry Data Using a Knowledge-Based Multi-Fusion Approach", which is a joint work of Julian Vexler and Stefan Kramer, has been accepted at the International Conference on Information Fusion, FUSION 2023 (https://fusion2023.org/). The paper is a result of an ongoing project on the air-side integration of magnetometers for object detection.

Abstract

In aviation there are many safety-critical domains where reliable safety systems are essential to prevent any kind of hazard. This paper focuses on airport aprons, where currently used holding point protection systems have shown to be not faultless, sometimes leading to avoidable accidents. One way to avoid such accidents is by means of innovative sensor technology, in our case, magnetometers, i.e. sensors measuring the distortion of the earth’s magnetic field by metallic objects. The main goal is to use the magnetometry data to detect passing aircraft and to capture their geometrical pattern as well as to estimate their motion vector. Therefore, we present a spatio-temporal cluster fusion and an event fusion algorithm. The cluster fusion can be applied as a post-processing step to any spatio-temporal clustering method and is able to more accurately represent aircraft patterns by integrating expert knowledge into the fusion process. In this context, we present a spatio-temporal cluster tree representation for a fast and accurate estimation of the motion vector. Finally, the data-driven event fusion is able to separate detected aircraft crossings into separate events by employing domain-knowledge. In future work, we aim to come up with a framework making use of the cluster results and estimated motion vector to classify and infer the position of an aircraft, before this is deployed as a real-time application.

Our paper "Four-dimensional trapped ion mobility spectrometry lipidomics for high throughput clinical profiling of human blood samples" has been accepted at Nature Communications

Our paper "Four-dimensional trapped ion mobility spectrometry lipidomics for high throughput clinical profiling of human blood samples", which is a joint work of Raissa Lerner, Dhanwin Baker, Claudia Schwitter, Sarah Neuhaus, Tony Hauptmann, Julia M. Post, Stefan Kramer & Laura Bindila, has been accepted at Nature Communications.

Abstract

Lipidomics encompassing automated lipid extraction, a four-dimensional (4D) feature selection strategy for confident lipid annotation as well as reproducible and cross-validated quantification can expedite clinical profiling. Here, we determine 4D descriptors (mass to charge, retention time, collision cross section, and fragmentation spectra) of 200 lipid standards and 493 lipids from reference plasma via trapped ion mobility mass spectrometry to enable the implementation of stringent criteria for lipid annotation. We use 4D lipidomics to confidently annotate 370 lipids in reference plasma samples and 364 lipids in serum samples, and reproducibly quantify 359 lipids using level-3 internal standards. We show the utility of our 4D lipidomics workflow for high-throughput applications by reliable profiling of intra-individual lipidome phenotypes in plasma, serum, whole blood, venous and finger-prick dried blood spots.

Our paper "A fair experimental comparison of neural network architectures for latent representations of multi-omics for drug response prediction" has been accepted at BMC Bioinformatics

Our paper "A fair experimental comparison of neural network architectures for latent representations of multi-omics for drug response prediction", which is a joint work of Tony Hauptmann and Stefan Kramer, has been accepted at BMC Bioinformatics.

Abstract

Background

Recent years have seen a surge of novel neural network architectures for the integration of multi-omics data for prediction. Most of the architectures include either encoders alone or encoders and decoders, i.e., autoencoders of various sorts, to transform multi-omics data into latent representations. One important parameter is the depth of integration: the point at which the latent representations are computed or merged, which can be either early, intermediate, or late. The literature on integration methods is growing steadily, however, close to nothing is known about the relative performance of these methods under fair experimental conditions and under consideration of different use cases.

Results

We developed a comparison framework that trains and optimizes multi-omics integration methods under equal conditions. We incorporated early integration, PCA and four recently published deep learning methods: MOLI, Super.FELT, OmiEmbed, and MOMA. Further, we devised a novel method, Omics Stacking, that combines the advantages of intermediate and late integration. Experiments were conducted on a public drug response data set with multiple omics data (somatic point mutations, somatic copy number profiles and gene expression profiles) that was obtained from cell lines, patient-derived xenografts, and patient samples. Our experiments confirmed that early integration has the lowest predictive performance. Overall, architectures that integrate triplet loss achieved the best results. Statistical differences can, overall, rarely be observed, however, in terms of the average ranks of methods, Super.FELT is consistently performing best in a cross-validation setting and Omics Stacking best in an external test set setting.

Conclusions

We recommend researchers to follow fair comparison protocols, as suggested in the paper. When faced with a new data set, Super.FELT is a good option in the cross-validation setting as well as Omics Stacking in the external test set setting. Statistical significances are hardly observable, despite trends in the algorithms’ rankings. Future work on refined methods for transfer learning tailored for this domain may improve the situation for external test sets. The source code of all experiments is available at Github.
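
The comparison by average ranks mentioned in the Results can be illustrated with a short sketch. It assumes one score per method and data set, with higher scores being better and tied methods sharing the mean of their positions. The function name `average_ranks` and the method names and scores in the usage note are placeholders, not values from the paper.

```python
def average_ranks(scores):
    """Mean rank per method over several data sets (rank 1 = best).

    `scores` maps a method name to a list of scores, one per data set.
    Tied methods share the average of the positions they occupy.
    """
    methods = list(scores)
    n_sets = len(next(iter(scores.values())))
    totals = {m: 0.0 for m in methods}
    for j in range(n_sets):
        # Order methods from best to worst on data set j.
        ordered = sorted(methods, key=lambda m: scores[m][j], reverse=True)
        pos = 0
        while pos < len(ordered):
            end = pos
            # Extend the group while the next method has the same score.
            while (end + 1 < len(ordered)
                   and scores[ordered[end + 1]][j] == scores[ordered[pos]][j]):
                end += 1
            shared_rank = ((pos + 1) + (end + 1)) / 2.0
            for m in ordered[pos:end + 1]:
                totals[m] += shared_rank
            pos = end + 1
    return {m: totals[m] / n_sets for m in methods}
```

For example, `average_ranks({"method_a": [0.9, 0.8], "method_b": [0.8, 0.8]})` gives `method_a` rank 1 on the first data set and both methods rank 1.5 on the second, where they tie.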

Our paper "Invariant Representations with Stochastically Quantized Neural Networks" has been accepted at AAAI 2023

Our paper "Invariant Representations with Stochastically Quantized Neural Networks", which is a joint work of Mattia Cerrato, Marius Köppel, Roberto Esposito and Stefan Kramer, has been accepted at AAAI 2023.

Abstract

Representation learning algorithms offer the opportunity to learn invariant representations of the input data with regard to nuisance factors. Many authors have leveraged such strategies to learn fair representations, i.e., vectors where information about sensitive attributes is removed. These methods are attractive as they may be interpreted as minimizing the mutual information between a neural layer's activations and a sensitive attribute. However, the theoretical grounding of such methods relies either on the computation of infinitely accurate adversaries or on minimizing a variational upper bound of a mutual information estimate. In this paper, we propose a methodology for direct computation of the mutual information between a neural layer and a sensitive attribute. We employ stochastically-activated binary neural networks, which lets us treat neurons as random variables. We are then able to compute (not bound) the mutual information between a layer and a sensitive attribute and use this information as a regularization factor during gradient descent. We show that this method compares favorably with the state of the art in fair representation learning and that the learned representations display a higher level of invariance compared to full-precision neural networks.
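
To illustrate why treating neurons as random variables allows the mutual information to be computed rather than bounded, here is a minimal sketch. It is not the authors' implementation: it assumes that each neuron fires independently given the input with a known Bernoulli probability, and it enumerates all 2^k states of a k-neuron layer, which is only feasible for small layers. The function names are hypothetical.

```python
import itertools
import math


def layer_state_probs(activation_probs):
    """Distribution over binary layer states for a single input.

    activation_probs[i] is the probability that neuron i outputs 1;
    neurons are assumed independent given the input.
    """
    probs = {}
    for state in itertools.product((0, 1), repeat=len(activation_probs)):
        p = 1.0
        for z, q in zip(state, activation_probs):
            p *= q if z == 1 else 1.0 - q
        probs[state] = p
    return probs


def mutual_information(batch_probs, sensitive):
    """I(Z; S) in nats for layer state Z and sensitive attribute S.

    batch_probs: one list of firing probabilities per input in the batch.
    sensitive:   the sensitive attribute value of each input.
    The joint distribution is the empirical average over the batch.
    """
    n = len(batch_probs)
    joint = {}
    for probs, s in zip(batch_probs, sensitive):
        for state, p in layer_state_probs(probs).items():
            joint[(state, s)] = joint.get((state, s), 0.0) + p / n
    # Marginals of the layer state and of the sensitive attribute.
    p_z, p_s = {}, {}
    for (state, s), p in joint.items():
        p_z[state] = p_z.get(state, 0.0) + p
        p_s[s] = p_s.get(s, 0.0) + p
    mi = 0.0
    for (state, s), p in joint.items():
        if p > 0.0:
            mi += p * math.log(p / (p_z[state] * p_s[s]))
    return mi
```

A value near 0 indicates that the layer carries almost no information about the sensitive attribute; in the setting described above, such a quantity would serve as the regularization term during training.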
