Florida Conference on Recent Advances in Robotics (FCRAR 2025), 38th (Florida Atlantic University, Dania Beach, Florida, USA, 05/07/2025–05/08/2025)
05/08/2025
Abstract
Image segmentation is essential for navigation and scene understanding in autonomous systems, particularly in unstructured outdoor environments. This study investigates the segmentation capabilities of DALL-E 3, a generative text-to-image model that is not explicitly trained for semantic segmentation.
A custom segmentation pipeline was developed to evaluate and refine DALL-E 3 outputs on outdoor images from the RELLIS-3D dataset. The post-processing workflow includes morphological operations with varied structuring elements to enhance segmentation accuracy. Segmentation accuracy was assessed using mean Intersection over Union (mIoU) across selected terrain classes. Results show that the post-processing refinement improved on the raw DALL-E 3 outputs, and that the resulting accuracy values are competitive with those of the supervised models HRNet+OCR and GSCNN. These results demonstrate that text-to-image models, when paired with domain-aware post-processing, offer a promising alternative for flexible, rapid-deployment segmentation for universal robotics without requiring labeled training data. These efforts contribute to our research team’s broader goal of enabling intelligent mobile robots capable of autonomous perception and decision-making in complex environments.
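For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of how mean Intersection over Union is commonly computed from two label maps. It is not the authors' evaluation code: the function name `mean_iou`, the toy arrays, and the choice to skip classes absent from both maps are assumptions made for illustration only.

```python
import numpy as np


def mean_iou(pred, target, class_ids):
    """Mean Intersection over Union across the given class ids.

    Hypothetical helper for illustration. Classes that appear in neither
    the prediction nor the ground truth are skipped (an assumption; other
    conventions exist).
    """
    pred = np.asarray(pred)
    target = np.asarray(target)
    ious = []
    for c in class_ids:
        pred_c = pred == c          # pixels predicted as class c
        target_c = target == c      # pixels labeled as class c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue                # class absent from both maps
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious)) if ious else float("nan")


# Toy 3x3 label maps with three made-up class ids (0, 1, 2).
pred = np.array([[0, 0, 1],
                 [1, 1, 2],
                 [2, 2, 2]])
target = np.array([[0, 1, 1],
                   [1, 1, 2],
                   [2, 2, 0]])

# Per-class IoU here is 1/3, 3/4 and 3/4, so the mean is 11/18.
score = mean_iou(pred, target, class_ids=[0, 1, 2])
print(f"mIoU = {score:.4f}")
```

Restricting `class_ids` to a subset mirrors the abstract's "selected terrain classes", although which classes the paper selected is not stated in this record.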
Files and links (1)
Text-to-Image Model-based Image Segmentation for Scene Understanding in Autonomous Robot Navigation (PDF, 7.68 MB)
Published (Version of record), Conference paper, Open Access
Details
Title
Text-to-Image Model-based Image Segmentation for Scene Understanding in Autonomous Robot Navigation
Resource Type
Conference paper
Conference
Florida Conference on Recent Advances in Robotics (FCRAR 2025), 38th (Florida Atlantic University, Dania Beach, Florida, USA, 05/07/2025–05/08/2025)
Number of pages
5
Copyright
Permission granted to the University of West Florida Libraries by the author to digitize and/or display this information for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires the permission of the copyright holder.
Identifiers
99381348025106600
Academic Unit
Intelligent Systems and Robotics
Language
English