Agentic Reasoning for Robust Vision Systems via Increased Test-Time Compute
Preprint, Open access

Chung-En (Johnny) Yu, Brian Jalaian, and Nathaniel D. Bastian
arXiv
09/19/2025

Abstract

Developing trustworthy intelligent vision systems for high-stakes domains, e.g., remote sensing and medical diagnosis, demands broad robustness without costly retraining. We propose Visual Reasoning Agent (VRA), a training-free, agentic reasoning framework that wraps off-the-shelf vision-language models and pure vision systems in a Think-Critique-Act loop. While VRA incurs significant additional test-time computation, it achieves up to 40% absolute accuracy gains on challenging visual reasoning benchmarks. Future work will optimize query routing and early stopping to reduce inference overhead while preserving reliability in vision tasks.
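
The abstract names a Think-Critique-Act loop but does not describe its internals, so the following is only a minimal sketch of what such a training-free wrapper could look like. Everything in it is an assumption made for illustration: the function `think_critique_act`, the three callables `think`, `critique`, and `act`, the `max_rounds` cap, and the stopping rule are hypothetical and are not taken from the paper. The toy callables stand in for a vision-language model, a critic, and a pure vision system.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class LoopResult:
    answer: str
    rounds: int
    trace: List[str] = field(default_factory=list)


def think_critique_act(
    question: str,
    think: Callable[[str, List[str]], str],         # hypothetical: drafts an answer from evidence
    critique: Callable[[str, str], Tuple[bool, str]],  # hypothetical: (accepted, follow-up query)
    act: Callable[[str], str],                      # hypothetical: queries a vision tool
    max_rounds: int = 3,                            # assumed cap on extra test-time compute
) -> LoopResult:
    evidence: List[str] = []
    trace: List[str] = []
    draft = ""
    for round_no in range(1, max_rounds + 1):
        # Think: draft an answer from the question and the evidence gathered so far.
        draft = think(question, evidence)
        trace.append(f"think: {draft}")
        # Critique: accept the draft or ask for more evidence.
        accepted, follow_up = critique(question, draft)
        trace.append(f"critique: accepted={accepted}")
        if accepted:
            return LoopResult(draft, round_no, trace)
        # Act: run the follow-up query against a vision tool and keep the result.
        observation = act(follow_up)
        evidence.append(observation)
        trace.append(f"act: {observation}")
    # Round budget exhausted: return the last draft even though it was not accepted.
    return LoopResult(draft, max_rounds, trace)


# Toy stand-ins so the sketch runs without any model.
def toy_think(question: str, evidence: List[str]) -> str:
    return "ship" if evidence else "unsure"


def toy_critique(question: str, draft: str) -> Tuple[bool, str]:
    return (draft != "unsure", "What object is in the image centre?")


def toy_act(query: str) -> str:
    return "detector reports: ship"


result = think_critique_act("What is shown?", toy_think, toy_critique, toy_act)
print(result.answer, result.rounds)
```

In this sketch each rejected draft costs one extra call to every component, which illustrates in a simplified way why a loop of this kind trades additional test-time computation for a chance to correct an answer.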
PDF: Agentic Reasoning for Robust Vision Systems via Increased Test-Time Compute (731.37 kB), Preprint, CC BY 4.0, Open Access
URL: link to article, Preprint, CC BY 4.0, Open Access

Details
