Review instructions
Your goal as an artifact reviewer is to ensure that the artifact’s quality matches the paper’s content and the minimum requirements expected to obtain each badge.
Note: the review period is relatively short. We recommend starting your reviews as soon as you receive your assignment, since two badges require running the artifact.
Steps to evaluate an artifact
The review can be carried out in an environment of your choice, as long as it satisfies the minimum requirements of the artifact’s expected execution environment. When applicable, we recommend running the artifact in a virtual environment: it is convenient for reviewers, and a clean install in a fresh environment keeps components on your local machine from interfering with the evaluation.
All additional resources needed to run the artifact (cloud infrastructure, SSH keys, etc.) must be present in the appendix that describes the artifact.
The artifact under evaluation accompanies a paper that is being evaluated by the conference’s technical committees. A CTA reviewer’s focus is on the artifact, not on reviewing the paper. However, any problem you find must be reported to the artifact evaluation chairs.
Note: remember that all artifacts, analyses, and discussions are confidential.
Review process
As soon as an artifact is assigned to you for review, you can begin. The earlier you start, the better, as it allows problems to be found and discussed with the authors. This year the review has two stages:
In the first stage (r1), reviewers evaluate the artifact against the evaluation criteria. During this stage, messages can be posted on the HotCRP platform, either as discussions among review committee members or as questions for the authors, for example about problems found in the artifact. At the end of the stage you must submit an assessment that will be shown to the authors. It should describe the steps taken to evaluate each badge, the execution process observed, and the result achieved; any problems in the execution process must be clearly explained. Authors will respond to the points raised in this stage during the rebuttal phase.
The second stage (r2-decision) takes place after the authors’ rebuttal phase. In the rebuttal, authors respond to the first-stage reviews: they clarify questions, fix problems found, point out any mistakes, and explain anything the reviewers may have missed. Your role in the second stage is to weigh the points raised in the first stage and in the rebuttal, and then decide which badges to assign.
Review cycle calendar:
- Artifact submission deadline
- Review Round 1 (r1)
- Rebuttal phase
- Reviewer decision (r2-decision)
Dates are available on HotCRP.
Note: try to write your review precisely, impersonally, and politely, considering that it will be available to the authors in a later phase of the process.
Evaluation criteria
To review well, consider each of the four badges below and the minimum requirements for awarding it.
Available Artifacts (Badge D)
The code and/or data are expected to be available in a stable repository (such as GitHub or GitLab). This repository is expected to contain a README.md that meets the minimum README.md requirements.
Functional Artifacts (Badge F)
The code and/or artifact is expected to be executable so that the reviewer can observe some of its functionality. To obtain this badge, the repository’s README.md should provide additional information: a list of dependencies; the versions of the dependencies, languages, and environment; a description of the execution environment; installation and execution instructions; and a minimal execution example.
Note: as a reviewer, in addition to verifying that the artifact meets the respective criteria, you must run the artifact. Your review is expected to include proof of execution, with some of the outputs presented by the tool.
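As an optional aid, the README.md items listed for Badge F can be tracked with a small checklist while reading the repository. The sketch below is only an illustration: the names `BADGE_F_README_ITEMS` and `missing_readme_items` are hypothetical, the item wording is paraphrased from the paragraph above, and the reviewer still decides by hand whether each item is actually present.

```python
from typing import Iterable, List

# Items the README.md is expected to provide for Badge F,
# paraphrased from the guidelines (wording is an assumption).
BADGE_F_README_ITEMS: List[str] = [
    "list of dependencies",
    "versions of dependencies, languages, and environment",
    "description of the execution environment",
    "installation instructions",
    "execution instructions",
    "minimal execution example",
]


def missing_readme_items(found: Iterable[str]) -> List[str]:
    """Return the expected items that the reviewer has not ticked off."""
    found_set = set(found)
    return [item for item in BADGE_F_README_ITEMS if item not in found_set]


if __name__ == "__main__":
    # Example: the reviewer found everything except the version information.
    ticked = [item for item in BADGE_F_README_ITEMS if "versions" not in item]
    for item in missing_readme_items(ticked):
        print("Missing from README.md:", item)
```

Any missing item reported this way is a point worth raising in the first-stage review so that the authors can address it in the rebuttal.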
Sustainable Artifacts (Badge S)
The code and/or artifact is expected to be modular, organized, and easy to understand. To obtain the badge, the artifact should have at least minimal code documentation (describing files, functions, …) and readable code, and evaluators should be able to identify the paper’s main claims within the artifact.
Reproducible Experiments (Badge R)
The reviewer is expected to be able to reproduce the paper’s main claims. To obtain this badge, the artifact is expected to include instructions for reproducing the main claims (e.g., the results in the main figures and tables) and a description of how the experiments were run to reach the paper’s results.
Note: to assign the badge you must reproduce (run) the experiments presented in the paper using the content found in the artifact, arriving at the claims made in the paper and reproducing its tables and figures. Your review is expected to include a summary of these results.
Delivering reviews
For each artifact, you must produce a brief review justifying why each badge was assigned or denied. Write this review only after you have completed the evaluation process. To ease the process, an example is available alongside the submission form.
Best papers
One of the criteria for the best paper awards is the rating reviewers assign in the “Distinguished Artifact Award Candidate” category. Reviewers are therefore expected to reserve the higher scores (3 and 4) for works that earned at least three badges and stand out in quality from the others. Works that earned two badges or fewer are expected to receive no more than the minimum score (1).
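The scoring guidance above can be read as a small decision rule. The following sketch is one possible reading, not an official rule. The function name `allowed_award_scores` is hypothetical, and the 1 to 4 scale is inferred from the scores mentioned in the text. The guidelines do not say what to assign to a work with three or more badges that does not stand out; returning the lower scores (1 and 2) in that case is an assumption.

```python
from typing import List


def allowed_award_scores(badges_awarded: int, stands_out: bool) -> List[int]:
    """Return the award-candidate scores consistent with the guidance.

    badges_awarded: number of badges the artifact obtained (0 to 4).
    stands_out: the reviewer's judgment that the work stands out in quality.
    """
    if not 0 <= badges_awarded <= 4:
        raise ValueError("there are four badges, so expected a value from 0 to 4")
    if badges_awarded <= 2:
        # Two badges or fewer: no score above the minimum.
        return [1]
    if stands_out:
        # At least three badges and standing out in quality.
        return [3, 4]
    # Assumption: this case is not spelled out in the guidelines.
    return [1, 2]


if __name__ == "__main__":
    print(allowed_award_scores(2, stands_out=True))   # [1]
    print(allowed_award_scores(3, stands_out=True))   # [3, 4]
```

Note that the number of badges alone never forces a high score: whether the work stands out remains the reviewer’s own judgment.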