Jay Patel
Jay Patel, Ph.D. Candidate
OASISlab / College of Information
University of Maryland
ORCID: https://orcid.org/0000-0003-1040-3607
Lanyard: lanyards.app/infotainment.bsky.social
@infotainment.bsky.social
Session
With formal peer review under strain, AI researchers and developers have proposed using AI systems to evaluate submissions. To examine that proposal, I began synthesizing the evidence around one question: how accurately and efficiently can AI models and human-AI teams evaluate study reports, relative to human experts, in benchmarking experiments? In a living, AI-assisted systematic review, I screened in all high-quality experiments in which foundation models and fine-tuned LLMs were evaluated against humans on closed-ended and open-ended research evaluation tasks. Phase 1 identified 38 study reports. To keep pace with the rapid flow of new publications, I welcome contributors to join me in writing a protocol and completing Phase 2 of this systematic review. Our continued efforts will clarify what AI models can actually do and give those who debate their value a shared evidence base to reason from and contribute to.