General

Our paper "Deep Unsupervised Identification of Selected Genes and SNPs in Pool-Seq Data from Evolving Populations" has been accepted as a poster presentation at RECOMB-Genetics 2022

Our paper "Deep Unsupervised Identification of Selected Genes and SNPs in Pool-Seq Data from Evolving Populations", a joint work of Julia Siekiera and Stefan Kramer, has been accepted as a poster presentation at RECOMB-Genetics 2022.

Abstract:

The exploration of selected single nucleotide polymorphisms (SNPs) to identify genetic diversity between populations under selection pressure is a fundamental task in population genetics. As underlying sequence reads and their alignment are error-prone and univariate statistical solutions like the Cochran-Mantel-Haenszel test (CMH) only take individual positions of the genome into account, the identification of selected SNPs remains a challenging process. Deep learning models, by contrast, are able to consider large input areas to integrate the decision of individual positions in the context of (hidden) neighboring patterns. We suggest an unsupervised deep learning pipeline to detect selected SNPs or genes between different types of population pairs by the application of both active learning and explainable AI methods. To provide a solution for various experimental designs, the effectiveness of direct genomic population comparison and the integration of drift simulation is investigated. In addition, we demonstrate how the extension of an autoencoder architecture can support the mapping of the genotype into a hidden representation upon which optimized selection detection is possible. The performance of the proposed method configurations is investigated on different simulated sequencing pools of individuals (Pool-Seq) datasets of Drosophila melanogaster and compared to a univariate baseline. The evaluation demonstrates that deep neural networks offer the potential to recognize hidden patterns in the allele frequencies of evolved populations and to enhance the information given by univariate statistics.

Our paper "Deep neural networks to recover unknown physical parameters from oscillating time series" has been accepted at PLOS ONE

Our paper "Deep neural networks to recover unknown physical parameters from oscillating time series", which was a joint work of Antoine Garcon, Julian Vexler, Dmitry Budker, and Stefan Kramer, was accepted at PLOS ONE.

Abstract:

Deep neural networks are widely used in pattern-recognition tasks for which a human-comprehensible, quantitative description of the data-generating process cannot be obtained. While doing so, neural networks often produce an abstract (entangled and non-interpretable) representation of the data-generating process. This may be one of the reasons why neural networks are not yet used extensively in physics-experiment signal processing: physicists generally require their analyses to yield quantitative information about the system they study. In this article we use a deep neural network to disentangle components of oscillating time series. To this aim, we design and train the neural network on synthetic oscillating time series to perform two tasks: a regression of the signal latent parameters and signal denoising by an Autoencoder-like architecture. We show that the regression and denoising performance is similar to that of least-square curve fittings with true latent-parameters initial guesses, in spite of the neural network needing no initial guesses at all. We then explore various applications in which we believe our architecture could prove useful for time-series processing when prior knowledge is incomplete. As an example, we employ the neural network as a preprocessing tool to inform the least-square fits when initial guesses are unknown. Moreover, we show that the regression can be performed on some latent parameters, while ignoring the existence of others. Because the Autoencoder needs no prior information about the physical model, the remaining unknown latent parameters can still be captured, thus making use of partial prior knowledge, while leaving space for data exploration and discoveries.

Our paper "Fair pairwise learning to rank" has been accepted at the IEEE International Conference on Data Science and Advanced Analytics (DSAA)

Our paper "Fair pairwise learning to rank", which was a joint work of Mattia Cerrato, Marius Köppel, Alexander Segner, Roberto Esposito, and Stefan Kramer, was accepted at the IEEE International Conference on Data Science and Advanced Analytics (DSAA).

Abstract:

Ranking algorithms based on Neural Networks have been a topic of recent research. Ranking is employed in everyday applications like product recommendations, search results, or even in finding good candidates for hiring. However, Neural Networks are mostly opaque tools, and it is hard to evaluate why a specific candidate, for instance, was not considered. Therefore, for neural-based ranking methods to be trustworthy, it is crucial to guarantee that the outcome is fair and that the decisions do not discriminate against people according to sensitive attributes such as gender, sexual orientation, or ethnicity. In this work we present a family of fair pairwise learning to rank approaches based on Neural Networks, which are able to produce balanced outcomes for underprivileged groups and, at the same time, build fair representations of data, i.e. new vectors having no correlation with regard to a sensitive attribute. We compare our approaches to recent work dealing with fair ranking and evaluate them using both relevance and fairness metrics. Our results show that the introduced fair pairwise ranking methods compare favorably to other methods when considering the fairness/relevance trade-off.

Posted in General

Talk by Dr. Claudia Schon

Stereotypes in Artificial Intelligence

Today, Artificial Intelligence systems are used in many areas and support, among other things, decision-making processes (e.g. in checking creditworthiness or assessing the probability of a criminal reoffending). It is precisely these systems that one wishes to be free of prejudice. Unfortunately, this does not always correspond to reality. To what extent do AI systems show prejudice in general and gender bias in particular? The talk focuses on the area of Commonsense Reasoning, which is applied in automatic assistants such as Siri and Alexa, and considers the question of how prejudice can be measured in this area.

Dr. Claudia Schon is the Klara-Marie-Faßbinder visiting professor at TH Bingen.

The talk will take place on 31.01.2020 at 14:00 in room 05-514.

Activities of the Data Mining Group at ECML/PKDD 2019

The Data Mining Group actively participated in a number of activities at the ECML/PKDD 2019 conference.

The group members presented research papers and posters, and organized two workshops. The workshop programs are available at the following links.

  • DeCoDeML (Deep Continuous-Discrete Machine Learning)
  • AIMLAI & XKDD (Advances in Interpretable Machine Learning and Artificial Intelligence & eXplainable Knowledge Discovery in Data mining)

The research paper "Pairwise Learning to Rank by Neural Networks Revisited: Reconstruction, Theoretical Analysis and Practical Performance" by Marius Köppel*, Alexander Segner*, Martin Wagener*, Lukas Pensel*, Andreas Karwath^, and Stefan Kramer* was presented at the Ranking session.

The following papers were presented at the DeCoDeML and SoGood workshops.

DeCoDeML Spotlight Talk 1: Zahra Ahmadi, Sina Malakouti and Stefan Kramer. Deep Tree Networks: A New Symbolic Deep Architecture

DeCoDeML Spotlight Talk 3: Sophie Burkhardt, Nicolas Wagner, Johannes Fürnkranz and Stefan Kramer. Extracting Rules with Adaptable Complexity from Neural Networks using K-Term DNF Optimization

DeCoDeML Spotlight Talk 4: Nicolas Wagner, Sophie Burkhardt and Stefan Kramer. A Deep Convolutional DNF Learner

SoGood Talk: Lukas Pensel and Stefan Kramer. Forecast of Study Success in the STEM Disciplines Based Solely on Academic Record

The DeCoDeML workshop was organized for the first time, and participation was well beyond the expectations of the organizing committee.

* = Johannes Gutenberg-Universität Mainz
^ = University of Birmingham

Paper accepted at the Pacific Symposium on Biocomputing (PSB)

Our paper "Identifying drug side effects from social media using active learning and crowd sourcing", which was a joint work of Sophie Burkhardt, Julia Siekiera, Josua Glodde, Miguel A. Andrade-Navarro, and Stefan Kramer, was accepted at the Pacific Symposium on Biocomputing (PSB) conference as an oral presentation.

Paper accepted at JMLR

Our paper "Decoupling Sparsity and Smoothness in the Dirichlet Variational Autoencoder Topic Model" was accepted at the Journal of Machine Learning Research (JMLR).
