We had a very interesting discussion on Twitter the other day, triggered by this blog post from Moritz Beller about his experiences with artifact evaluations.
At the end of this discussion, I had a random thought: why do we have artifact tracks at all? Why don’t we organize the evaluation of artifacts in a journal-style manner? Since then, I have thought a bit more about this and believe it is a good solution for many problems. With this blog post, I want to first outline the concept of the “Artifact Journal” (AJ), then describe how this could solve current problems with artifact evaluation, and finally discuss what the potential drawbacks could be.
How would the AJ work?
The AJ would be organized similarly to the Journal of Open Source Software (JOSS): a very brief summary of the artifact would be submitted together with the artifact itself. There would be guidelines about what must be contained (which binaries, sources, and data are required, what must be documented, how virtualization should/must be used). The authors would also state which badges they are applying for. Artifacts could be submitted as soon as a paper is accepted at any venue that is affiliated with the AJ. For example, if the AJ is affiliated with ICSE, any paper accepted at ICSE could be submitted. If the AJ is affiliated with ACM SIGSOFT, any paper accepted at a conference organized by ACM SIGSOFT could be submitted. The same holds for IEEE, other publishers, and also journals.
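To make the submission guidelines more concrete, here is a minimal sketch of what the required metadata could look like. All field names and values are hypothetical and only illustrate the kind of information the guidelines could mandate; an actual AJ would have to define its own schema.

```python
from dataclasses import dataclass, field

@dataclass
class ArtifactSubmission:
    """Hypothetical metadata record for an AJ submission."""
    paper_doi: str                 # the accepted paper at an AJ-affiliated venue
    venue: str                     # e.g., "ICSE"
    artifact_url: str              # archival location of binaries, sources, and data
    badges_applied_for: list[str]  # e.g., ["Available", "Functional", "Reusable"]
    virtualization: str            # e.g., "Docker image", per the guidelines
    resource_requirements: str     # so reviewers can make informed decisions
    suggested_reviewers: list[str] = field(default_factory=list)  # at least 5

# A hypothetical submission (all identifiers are placeholders):
submission = ArtifactSubmission(
    paper_doi="10.1145/XXXXXXX",
    venue="ICSE",
    artifact_url="https://doi.org/10.5281/zenodo.XXXXXXX",
    badges_applied_for=["Available", "Functional"],
    virtualization="Docker image",
    resource_requirements="8 GB RAM, no GPU, ~30 minutes runtime",
    suggested_reviewers=["reviewer-1", "reviewer-2", "reviewer-3",
                         "reviewer-4", "reviewer-5"],
)
```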
The reviews of the artifact would follow strict guidelines regarding what should be checked. All reviews would be open, e.g., on GitHub.
Reviewers would be managed similarly to JOSS: there would be a pool of reviewers to which anybody can add themselves, including a declaration of which artifacts they would review, e.g., restricted by programming language, type of artifact, or research topic. Authors who want to submit an artifact must also provide a list of at least 5 suitable reviewers that can be used by the editors to invite reviewers.
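As a sketch of how such a self-managed reviewer pool could be matched against submissions (the names, the criteria, and the matching rule itself are made up for illustration):

```python
from dataclasses import dataclass

@dataclass
class Reviewer:
    name: str
    languages: set[str]  # programming languages the reviewer is willing to review
    topics: set[str]     # research topics the reviewer covers

def candidate_reviewers(pool, artifact_languages, artifact_topics):
    """Reviewers whose self-declared restrictions match the artifact.

    Here: a reviewer matches if they cover all of the artifact's
    languages and at least one of its topics.
    """
    return [
        r for r in pool
        if artifact_languages <= r.languages and artifact_topics & r.topics
    ]

pool = [
    Reviewer("A", {"Python", "C"}, {"testing", "program analysis"}),
    Reviewer("B", {"Java"}, {"empirical studies"}),
]
matches = candidate_reviewers(pool, {"Python"}, {"testing"})
print([r.name for r in matches])  # only reviewer "A" matches -> ['A']
```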
Once the artifact evaluation process is complete, the badges can be added to the related publication in the digital library. Possibly, a DOI would be assigned to the artifact to further support the FAIR principles. This DOI would also reference the short description submitted to the journal, as well as the review process.
How would the AJ solve problems?
Moritz raised many valid concerns. The AJ could address all of them.
Too many artifacts per reviewer: This problem is really a consequence of using PCs for artifact review. It is solved through the open pool of reviewers and the ability to simply decline review requests.
Required computational power: Same as above: if you do not have the required computational power, you decline the review. The guidelines could also state how such cases should be handled, e.g., what is acceptable. Additionally, the guidelines could enforce that resource requirements are always listed together with the artifacts, such that reviewers can make informed decisions.
Nothing works out of the box: Guidelines with respect to virtualization. If authors opt out of virtualization, reviewers are free to decline the review. This would in principle allow submitting artifacts without virtualization, but authors would risk not getting their artifacts reviewed.
Bad presentation of results: Could again be handled by the guidelines, by requiring respective documentation.
Documentation chaos: Again, the guidelines.
GitHub not good for review: The way he described it, I agree. But JOSS demonstrates that this can work well in an open review process.
Is the artifact really working …: This is the only unresolved problem: can we actually validate the functional correctness of the artifact? From my point of view, this is out of scope for the artifact review. Guidelines may enforce this to some degree, e.g., by requiring authors to state which quality assurance measures they applied. The reviewer selection may also help: if, e.g., a PhD student who wants to re-use the artifact is the reviewer, there is a strong motivation to check the correctness. But in general, we also need a certain amount of trust that there is no cheating.
Which other problems would the AJ solve?
A well-defined process for badges for journal papers!
No separate artifact tracks at conferences anymore. This would actually reduce the work of conference organizers. Okay, it would be more of a shift of effort towards the AJ, but it would likely still reduce redundant effort.
This would be a good gateway for PhD students to get involved in reviewing. The self-signup would also be good for inclusiveness.
The editorial board could also be recruited through applications and, e.g., staffed mostly with postdocs for a fixed period of time. This way, we would not only train reviewers, but also editors.
Are there any downsides?
Not many that I can think of, and none that artifact tracks do not also have.
Yes, resources for the initial setup and dedication from the founding editorial board and EiCs are required. But the same is true for any artifact track out there, except that with tracks this effort is redone every time.
Next Steps?
Discussion, finding people who want to do this (e.g., me ;)), and then engaging conferences/journals/SIGSOFT/IEEE to get them on board.