Externally Funded Projects
This page lists summaries of current and past projects funded by third parties for which I was Principal Investigator (PI). Overall, I have raised about 400,000 Euro in third-party funds so far.
DEFECTS – Comparable and Externally Valid Software Defect Prediction (2018-ongoing)
The comparability and reproducibility of empirical software engineering research is, for the most part, an open problem. This statement holds true for the field of software defect prediction. Current research shows that this leads to actual problems regarding the external validity of defect prediction research. Multiple replications conducted by different groups of researchers led to different findings than prior research. Moreover, problems with the currently used data sets were discovered and it was demonstrated that these problems may change conclusions. Thus, defect prediction research faces a replication crisis if these problems are ignored.
Within this project, we plan to create a solid foundation for comparable and externally valid defect prediction research. Our approach rests on three pillars. The first pillar is the quality of the data we use for defect prediction experiments. The current studies on data quality do not cover the impact of mislabeled data. This kind of noise affects not only the creation of defect prediction models, but also their evaluation. We will statistically evaluate the noise in current data sets. Based on our findings, we will improve the state of the art of defect labeling and generate a large data set with less noise. The quality of our data will be statistically validated. The collected body of data will be larger than the available defect prediction data sets and will thereby facilitate better generalizability and external validity of results.
The second pillar is the replication of the current state of the art. Since prior replications already contradicted the original experiments, we believe that a broader replication effort is necessary. Current replications consider only parts of the state of the art, e.g., classifier impact or cross-project defect prediction. Most of the state of the art has still never been replicated or diligently compared to other approaches or naïve baselines. Most experiments only used small data sets, which is a key factor for the problems with external validity. We will conduct a conceptual replication of the state of the art of defect prediction. Through this, we will improve the external validity of the defect prediction state of the art and lay the groundwork for better external validity of future work.
The third pillar is a set of guidelines for defect prediction research. If we cannot get researchers to avoid the anti-patterns that led to poor validity of results, our efforts to combat the replication crisis of defect prediction research will only have a short-term effect. To make our results sustainable, we will work together with the defect prediction community to define guidelines that allow researchers to conduct their defect prediction experiments in such a way that we hopefully never face such problems with replicability again.
GAIUS – Maintenance activities for the sustainability of AUGUSTUS (2018-ongoing)
AUGUSTUS is a tool for the structural annotation of genes in genomic sequences. Within this joint project with Prof. Dr. Mario Stanke from the University of Greifswald, we will work on the maintenance of AUGUSTUS. While the prediction methods of AUGUSTUS were advanced over the years, its maintainability and sustainability did not receive the same attention. This is highlighted by usability issues, but also by general issues within the codebase. This project will focus solely on the maintenance of AUGUSTUS to address this, i.e., to improve the usability of AUGUSTUS as well as the maintainability of the codebase.
Pilot Study: Defect Prediction at Continental (April 2017-December 2017)
Funded by Continental GmbH
SmartSHARK is a versatile tool for software repository mining. Within this project, we performed a pilot study in cooperation with Continental GmbH to assess the potential benefits of using SmartSHARK for defect prediction of C and C++ software developed in-house at Continental.