Our paper "Invariant Representations with Stochastically Quantized Neural Networks", a joint work of Mattia Cerrato, Marius Köppel, Roberto Esposito and Stefan Kramer, has been accepted at AAAI 2023.
Representation learning algorithms offer the opportunity to learn invariant representations of the input data with regard to nuisance factors. Many authors have leveraged such strategies to learn fair representations, i.e., vectors where information about sensitive attributes is removed. These methods are attractive as they may be interpreted as minimizing the mutual information between a neural layer's activations and a sensitive attribute. However, the theoretical grounding of such methods relies either on the computation of infinitely accurate adversaries or on minimizing a variational upper bound of a mutual information estimate. In this paper, we propose a methodology for direct computation of the mutual information between a neural layer and a sensitive attribute. We employ stochastically-activated binary neural networks, which lets us treat neurons as random variables. We are then able to compute (not bound) the mutual information between a layer and a sensitive attribute and use this information as a regularization factor during gradient descent. We show that this method compares favorably with the state of the art in fair representation learning and that the learned representations display a higher level of invariance compared to full-precision neural networks.
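The idea of computing, rather than bounding, the mutual information can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `layer_mutual_information`, the input layout, and the assumption that neurons are conditionally independent given the input are all illustrative choices. Each neuron is treated as a Bernoulli random variable, all binary states of a small layer are enumerated, and I(Z; S) is obtained from the resulting distributions.

```python
import itertools
import math


def layer_mutual_information(probs, s):
    """Return I(Z; S) in bits for a layer of stochastic binary neurons.

    probs[i][j] is P(z_j = 1 | x_i); s[i] is the sensitive attribute of sample i.
    Assumption (hypothetical, for illustration): neurons are conditionally
    independent given the input, so P(z | x_i) factorizes over neurons.
    """
    n_neurons = len(probs[0])
    # Enumerate all 2^k binary states of the layer (feasible only for small k).
    states = list(itertools.product([0, 1], repeat=n_neurons))

    def state_distribution(rows):
        # Average P(z | x_i) over the given samples, for every binary state z.
        dist = []
        for z in states:
            total = 0.0
            for row in rows:
                total += math.prod(p if bit else 1.0 - p for p, bit in zip(row, z))
            dist.append(total / len(rows))
        return dist

    p_z = state_distribution(probs)
    mi = 0.0
    for value in set(s):
        rows = [row for row, s_i in zip(probs, s) if s_i == value]
        p_s = len(rows) / len(probs)
        p_z_given_s = state_distribution(rows)
        # I(Z; S) = sum over s of P(s) * KL(P(z | s) || P(z))
        mi += p_s * sum(
            q * math.log2(q / p) for q, p in zip(p_z_given_s, p_z) if q > 0
        )
    return mi


# A neuron that copies the attribute versus a layer that ignores it.
leaky = layer_mutual_information([[1.0], [1.0], [0.0], [0.0]], [1, 1, 0, 0])
invariant = layer_mutual_information([[0.5, 0.5]] * 4, [1, 1, 0, 0])
print(leaky, invariant)
```

In this toy setting the first layer leaks the full bit of information about the attribute, while the second carries none. The enumeration grows exponentially with the number of neurons, so the sketch is only meant for very small layers.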
Kirsten Köbschall and Kiara Stempel have joined the Data Mining group and will be working on the project Trading off Non-Functional Properties of Machine Learning (TOPML).
The TOPML project ("Trading off Non-Functional Properties of Machine Learning"), funded by the Carl-Zeiss-Stiftung, started on the 1st of July. An online kick-off meeting took place on the 6th of July, hosted by the spokesperson Prof. Stefan Kramer.
Prof. Stefan Kramer gave an informal presentation which touched upon the various themes of TOPML (fairness, interpretability, resource-efficiency, privacy, law and ethics) and described opportunities for collaboration. In particular, a series of lunch meetings is currently being planned. As the project members have different backgrounds, information sharing and regular informal meetings will help foster collaboration and multidisciplinary work.
New PhD students and researchers are currently being hired, with the first ones starting in July. More hires are planned for early fall.
The findings from the various themes are to be used within JGU itself, but also transferred to industrial practice. An AI Lab is to be located at Mainz University of Applied Sciences due to its proximity to regional and national industry. The methods developed in the planned research project will be implemented as software applications in the lab, where they will be tested and validated. In addition, implementing the results as executable software applications ensures a successful transfer to science and industry.
The University Medical Center Mainz also participated in the meeting to explore ways of joining the TOPML project.
Our preprint "Fair Interpretable Representation Learning with Correction Vectors" (by Cerrato, Coronel, Köppel, Segner, Esposito and Kramer) has been featured by the Montreal AI Ethics Institute. You may read the research summary here.
Our paper "Learning to Rank Higgs Boson Candidates", a joint work of Marius Köppel, Alexander Segner, Martin Wagener, Lukas Pensel, Andreas Karwath, Christian Schmitt and Stefan Kramer, has been accepted at Nature Scientific Reports.
In the extensive search for new physics, the precise measurement of the Higgs boson continues to play an important role. To this end, machine learning techniques have been recently applied to processes like the Higgs production via vector-boson fusion. In this paper, we propose to use algorithms for learning to rank, i.e., to rank events into a sorting order, first signal, then background, instead of algorithms for the classification into two classes, for this task. The fact that training is then performed on pairwise comparisons of signal and background events can effectively increase the amount of training data due to the quadratic number of possible combinations. This makes it robust to unbalanced data set scenarios and can improve the overall performance compared to pointwise models like the state-of-the-art boosted decision tree approach. In this work we compare our pairwise neural network algorithm, which is a combination of a convolutional neural network and the DirectRanker, with convolutional neural networks, multilayer perceptrons or boosted decision trees, which are commonly used algorithms in multiple Higgs production channels. Furthermore, we use so-called transfer learning techniques to improve overall performance on different data types.
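The claim about the growing amount of training data can be made concrete with a small sketch. It is not the DirectRanker implementation. The helper `make_pairs` and the placeholder event labels are hypothetical and only illustrate how pairing every signal event with every background event multiplies the number of training examples, even when one class is rare.

```python
import itertools


def make_pairs(signal_events, background_events):
    """Return every (signal, background) pair as one training example.

    Each pair is ordered signal first, matching the target ranking
    "first signal, then background".
    """
    return list(itertools.product(signal_events, background_events))


# Hypothetical, strongly unbalanced data set: 3 signal vs. 1000 background events.
signal = [f"sig_{i}" for i in range(3)]
background = [f"bkg_{i}" for i in range(1000)]

pairs = make_pairs(signal, background)
print(len(signal) + len(background), "events ->", len(pairs), "pairs")
```

A pointwise classifier would see only the individual events, whereas a pairwise model can train on the product of the two class sizes, which is the effect the abstract refers to.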
Our short paper "Ranking Creative Language Characteristics in Small Data Scenarios", a joint work of Julia Siekiera, Marius Köppel, Edwin Simpson, Kevin Stowe, Iryna Gurevych and Stefan Kramer, has been accepted at ICCC'22.
The ability to rank creative natural language provides an important general tool for downstream language understanding and generation. However, current deep ranking models require substantial amounts of labeled data that are difficult and expensive to obtain for new domains, languages and creative characteristics. A recent neural approach, DirectRanker, reduces the amount of training data needed but has not previously been used to rank creative text. We therefore adapt DirectRanker to provide a new deep model for ranking creative language with small numbers of training instances, and compare it with a Bayesian approach, Gaussian process preference learning (GPPL), which was previously shown to work well with sparse data. Our experiments with short creative language texts show the effectiveness of DirectRanker even with small training datasets. Combining DirectRanker with GPPL outperforms the previous state of the art on humor and metaphor novelty tasks, increasing Spearman's ρ by 25% and 29% on average. Furthermore, we provide a possible application to validate jokes in the process of creativity generation.
Our paper "Deep Unsupervised Identification of Selected Genes and SNPs in Pool-Seq Data from Evolving Populations", a joint work of Julia Siekiera and Stefan Kramer, has been accepted as a poster presentation at RECOMB 2022-Genetics.
The exploration of selected single nucleotide polymorphisms (SNPs) to identify genetic diversity between populations under selection pressure is a fundamental task in population genetics. As underlying sequence reads and their alignment are error-prone and univariate statistical solutions like the Cochran-Mantel-Haenszel test (CMH) only take individual positions of the genome into account, the identification of selected SNPs remains a challenging process. Deep learning models, by contrast, are able to consider large input areas to integrate the decision of individual positions in the context of (hidden) neighboring patterns. We suggest an unsupervised deep learning pipeline to detect selected SNPs or genes between different types of population pairs by the application of both active learning and explainable AI methods. To provide a solution for various experimental designs, the effectiveness of direct genomic population comparison and the integration of drift simulation is investigated. In addition, we demonstrate how the extension of an autoencoder architecture can support the mapping of the genotype into a hidden representation upon which optimized selection detection is possible. The performance of the proposed method configurations is investigated on different simulated sequencing pools of individuals (Pool-Seq) datasets of Drosophila melanogaster and compared to a univariate baseline. The evaluation demonstrates that deep neural networks offer the potential to recognize hidden patterns in the allele frequencies of evolved populations and to enhance the information given by univariate statistics.
Our paper "Deep neural networks to recover unknown physical parameters from oscillating time series" (DOI), a joint work of Antoine Garcon, Julian Vexler, Dmitry Budker and Stefan Kramer, was accepted at PLOS ONE.
Deep neural networks are widely used in pattern-recognition tasks for which a human-comprehensible, quantitative description of the data-generating process cannot be obtained. While doing so, neural networks often produce an abstract (entangled and non-interpretable) representation of the data-generating process. This may be one of the reasons why neural networks are not yet used extensively in physics-experiment signal processing: physicists generally require their analyses to yield quantitative information about the system they study. In this article we use a deep neural network to disentangle components of oscillating time series. To this aim, we design and train the neural network on synthetic oscillating time series to perform two tasks: a regression of the signal latent parameters and signal denoising by an Autoencoder-like architecture. We show that the regression and denoising performance is similar to that of least-square curve fittings with true latent-parameters initial guesses, in spite of the neural network needing no initial guesses at all. We then explore various applications in which we believe our architecture could prove useful for time-series processing, when prior knowledge is incomplete. As an example, we employ the neural network as a preprocessing tool to inform the least-square fits when initial guesses are unknown. Moreover, we show that the regression can be performed on some latent parameters, while ignoring the existence of others. Because the Autoencoder needs no prior information about the physical model, the remaining unknown latent parameters can still be captured, thus making use of partial prior knowledge, while leaving space for data exploration and discoveries.
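To make the training setup more tangible, the following sketch generates one synthetic oscillating time series together with its latent parameters. The signal model (an exponentially decaying sinusoid with Gaussian noise), the function name `synthetic_oscillation` and all parameter names are assumptions chosen for illustration; the paper's actual signal model and data pipeline may differ.

```python
import math
import random


def synthetic_oscillation(n_points, amplitude, frequency, phase, decay_time,
                          noise_std=0.0, dt=0.01, seed=0):
    """Return a decaying sinusoid sampled at n_points time steps of size dt.

    Assumed (hypothetical) signal model:
        x(t) = amplitude * exp(-t / decay_time) * sin(2*pi*frequency*t + phase)
    plus zero-mean Gaussian noise with standard deviation noise_std.
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    series = []
    for k in range(n_points):
        t = k * dt
        clean_value = (amplitude * math.exp(-t / decay_time)
                       * math.sin(2.0 * math.pi * frequency * t + phase))
        series.append(clean_value + rng.gauss(0.0, noise_std))
    return series


# Latent parameters: the regression targets in this sketch.
latent = {"amplitude": 2.0, "frequency": 5.0, "phase": math.pi / 2, "decay_time": 1.0}

clean = synthetic_oscillation(100, **latent)                  # denoising target
noisy = synthetic_oscillation(100, noise_std=0.3, **latent)   # network input
print(len(noisy), clean[0])
```

Under these assumptions, a network would receive `noisy` as input, regress the values in `latent`, and reconstruct `clean` in its denoising branch. A least-square fit of the same model would instead need initial guesses for each latent parameter.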
Our paper "Fair Pairwise Learning to Rank", a joint work of Mattia Cerrato, Marius Köppel, Alexander Segner, Roberto Esposito, and Stefan Kramer, was accepted at the IEEE International Conference on Data Science and Advanced Analytics (DSAA).
Abstract: Ranking algorithms based on Neural Networks have been a topic of recent research. Ranking is employed in everyday applications like product recommendations, search results, or even in finding good candidates for hiring. However, Neural Networks are mostly opaque tools, and it is hard to evaluate why a specific candidate, for instance, was not considered. Therefore, for neural-based ranking methods to be trustworthy it is crucial to guarantee that the outcome is fair and that the decisions are not discriminating against people according to sensitive attributes such as gender, sexual orientation, or ethnicity. In this work we present a family of fair pairwise learning to rank approaches based on Neural Networks, which are able to produce balanced outcomes for underprivileged groups and, at the same time, build fair representations of data, i.e., new vectors having no correlation with regard to a sensitive attribute. We compare our approaches to recent work dealing with fair ranking and evaluate them using both relevance and fairness metrics. Our results show that the introduced fair pairwise ranking methods compare favorably to other methods when considering the fairness/relevance trade-off.
Dr. Mohammad Sadeq Dousti has joined the Data Mining group and will mainly be working on privacy-preserving data mining. In addition to his research, he will also support our group with teaching.