This paper addresses the critical issue of hallucination in Large Vision-Language Models (LVLMs) by proposing a novel multi-agent framework. We integrate three post-hoc correction techniques (self-correction, external feedback, and agent debate) to enhance LVLM trustworthiness. Our approach tackles key challenges underlying LVLM hallucination, including weak visual encoders, parametric knowledge bias, and loss of visual attention during inference. The framework employs a plug-in LVLM as the base model whose hallucinations are to be reduced, a Large Language Model (LLM) for guided refinement, external toolbox models for factual grounding, and an agent debate stage for consensus building. While the approach is promising, we also discuss potential limitations and technical challenges in implementing such a complex system. This work contributes to the ongoing effort to create more reliable and trustworthy multimodal multi-agent systems.
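To make the high-level architecture concrete, the following is a minimal sketch (not the authors' implementation) of how the three post-hoc stages described above might compose: the plug-in LVLM self-corrects its draft, an LLM refines the answer against external tool evidence, and a small set of agents debates toward a consensus. All function names, prompts, and interfaces here are hypothetical placeholders.

```python
"""Hypothetical sketch of a three-stage post-hoc correction pipeline:
self-correction, external tool feedback, and agent debate.
Not taken from the paper; all names and interfaces are assumptions."""

from dataclasses import dataclass
from typing import Callable, List

# A "model" here is just any callable mapping a prompt string to a reply.
Model = Callable[[str], str]


@dataclass
class Verdict:
    answer: str
    rationale: str


def self_correct(lvlm: Model, question: str, draft: str) -> str:
    """Stage 1: ask the plug-in LVLM to re-examine its own draft answer."""
    prompt = (
        f"Question: {question}\nDraft answer: {draft}\n"
        "Re-check the draft against the image and correct any claims "
        "about objects, attributes, or relations that are not visible."
    )
    return lvlm(prompt)


def ground_with_tools(llm: Model, tool_reports: List[str], answer: str) -> str:
    """Stage 2: an LLM refines the answer using external tool outputs
    (e.g., detector or OCR results) as factual grounding."""
    evidence = "\n".join(tool_reports)
    prompt = (
        f"Evidence from vision tools:\n{evidence}\n"
        f"Candidate answer: {answer}\n"
        "Rewrite the answer so every visual claim is supported by the evidence."
    )
    return llm(prompt)


def debate(agents: List[Model], question: str, answer: str, rounds: int = 2) -> Verdict:
    """Stage 3: agents critique the answer in turn; the revision after
    the final round is taken as the consensus."""
    current = answer
    transcript = []
    for r in range(rounds):
        for i, agent in enumerate(agents):
            prompt = (
                f"Question: {question}\nCurrent answer: {current}\n"
                "Point out any hallucinated detail and give a revised answer."
            )
            current = agent(prompt)
            transcript.append(f"round {r}, agent {i}: {current}")
    return Verdict(answer=current, rationale="\n".join(transcript))


if __name__ == "__main__":
    # Stub models so the sketch runs without any real LVLM/LLM backend.
    echo: Model = lambda p: p.splitlines()[-1]
    draft = "A red car is parked next to a dog."
    step1 = self_correct(echo, "What is in the image?", draft)
    step2 = ground_with_tools(echo, ["detector: car (red); no dog found"], step1)
    final = debate([echo, echo], "What is in the image?", step2)
    print(final.answer)
```

In a real system the `Model` callables would wrap the LVLM, the refinement LLM, and the debate agents, and the tool reports would come from detectors or OCR models; the staged structure simply mirrors the self-correction, external-feedback, and debate components named in the abstract.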
Details
Title
Mitigating Large Vision-Language Model Hallucination at Post-hoc via Multi-agent System
Publication Details
Proceedings of the AAAI Symposium Series, Vol. 4: AI Trustworthiness and Risk Assessment for Challenging Contexts (ATRACC) - Short Papers, pp. 110-113
Resource Type
Conference proceeding
Conference
The Association for the Advancement of Artificial Intelligence's 2024 Fall Symposium (Arlington, Virginia, USA, November 7-9, 2024)