A Visual Question Answering-based Object Detection Framework using a Team of Multi-Agent UAVs

Stephane Lee; Nathan Rayon; Hakki Erhan Sevil

Conference paper

Open access

Peer reviewed

Florida Conference on Recent Advances in Robotics (FCRAR 2025), 38th (Dania Beach, Florida, USA, 05/08/2025–05/09/2025)

05/09/2025

Abstract

This paper presents an autonomous multi-agent unmanned aerial vehicle (UAV) system that performs object detection through Visual Question Answering (VQA) on aerial imagery. The system uses an entropy-based distributed behavior model to coordinate UAV movements toward designated waypoints, and a VQA model analyzes the aerial footage to detect objects of interest. The study investigates the impact of various distributed behavior configurations, including number of UAVs, UAV formation, flight altitude, and separation distance. From this analysis, a final configuration that maximizes surface area coverage and VQA model performance was identified. These findings contribute to the development of aerial systems capable of collaborative visual reasoning in complex environments.

Details

Title: A Visual Question Answering-based Object Detection Framework using a Team of Multi-Agent UAVs
Resource Type: Conference paper
Conference: Florida Conference on Recent Advances in Robotics (FCRAR 2025), 38th (Dania Beach, Florida, USA, 05/08/2025–05/09/2025)
Format: pdf
Number of pages: 5
Copyright: Permission granted to the University of West Florida Libraries by the author to digitize and/or display this information for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires the permission of the copyright holder.
Identifiers: 99381348025206600
Academic Unit: Intelligent Systems and Robotics
Language: English
