The peer review process is strained by rising submission volumes, reviewer fatigue, and inconsistent standards. Large Language Models (LLMs) can assist with reviewing, but their reviews are often overly optimistic and lack technical depth. We developed a prompting strategy that, when applied to ChatGPT-4 on ICLR 2025 papers, reduced score inflation and produced reviews whose scores aligned more closely with the median scores of human reviewers.
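The abstract does not say how score inflation or alignment with the human median was measured. As a minimal sketch of one plausible reading, the Python below treats inflation as the signed gap between an LLM score and the median human score for the same paper, and alignment as the mean absolute gap. The function names, the dictionary keys, and the sample scores are all hypothetical and are not taken from the paper.

```python
from statistics import mean, median


def score_gap(llm_score, human_scores):
    """Signed gap between the LLM score and the human median (assumed metric).

    A positive value means the LLM scored the paper higher than the
    median human reviewer did.
    """
    return llm_score - median(human_scores)


def summarize(papers):
    """Aggregate gaps over a list of papers.

    Each paper is a dict with hypothetical keys 'llm_score' (a number)
    and 'human_scores' (a non-empty list of numbers).
    """
    gaps = [score_gap(p["llm_score"], p["human_scores"]) for p in papers]
    return {
        "mean_inflation": mean(gaps),               # signed: above 0 means inflated
        "mean_abs_gap": mean(abs(g) for g in gaps),  # unsigned: smaller means closer
    }


# Made-up scores for illustration only; not data from the study.
sample_papers = [
    {"llm_score": 8, "human_scores": [5, 6, 6]},
    {"llm_score": 6, "human_scores": [6, 8, 3]},
]
summary = summarize(sample_papers)
```

Under this reading, a prompting strategy reduces inflation if it moves `mean_inflation` toward zero, and improves alignment if it lowers `mean_abs_gap`, when both are computed over the same set of papers.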
Details
Title: Toward Human-Aligned LLM Reviews for Scientific Papers
Publication Details: Proceedings IEEE International Conference on e-Science: eScience 2025, pp. 363-364
Resource Type: Conference proceeding
Conference: IEEE International Conference on e-Science: eScience 2025 (Chicago, Illinois, USA, 09/15/2025–09/18/2025)