GPT4 is Slightly Helpful for Peer-Review Assistance: A Pilot Study. (arXiv:2307.05492v1 [cs.HC])

In this pilot study, we investigate the use of GPT-4 to assist in the
peer-review process. Our key hypothesis was that GPT-generated reviews could
achieve helpfulness comparable to that of reviews written by human reviewers. By
comparing reviews produced by human reviewers and by GPT models for academic
papers submitted to a major machine learning conference, we provide initial
evidence that artificial intelligence can contribute effectively to the
peer-review process.
We also perform robustness experiments with inserted errors to understand which
parts of the paper the model tends to focus on. Our findings open new avenues
for leveraging machine learning tools to address resource constraints in peer
review. The results also shed light on potential enhancements to the review
process and lay the groundwork for further research on scaling oversight in a
domain where human feedback is increasingly a scarce resource.
