Vulnerability to Stability: Scalable Large Language Model in Queue-Based Web Service

MD Abdul Barek; Md Bajlur Rashid; Md Mostafizur Rahman; A.B.M Kamrul Islamc Riad; Guillermo Francia; Hossain Shahriar; Sheikh Iqbal Ahamed

doi:10.1109/COMPSAC65507.2025.00129

Conference proceeding

Peer reviewed

Proceedings: 2025 IEEE 49th Annual Computers, Software, and Applications Conference COMPSAC 2025, pp.995-1000

IEEE Annual International Computer Software and Applications Conference

Annual Computers, Software, and Applications Conference (COMPSAC), 49th (Toronto, Ontario, Canada, 07/08/2025–07/11/2025)

08/26/2025

DOI: https://doi.org/10.1109/COMPSAC65507.2025.00129

Web of Science ID: WOS:001575960000122

Abstract

Large Language Models (LLMs) have demonstrated exceptional capabilities in the field of Artificial Intelligence (AI) and are now widely used in applications around the world. However, one of their major challenges is handling high-concurrency workloads, especially under extreme conditions. When too many requests are sent simultaneously, LLMs often become unresponsive, which leads to performance degradation and reduced reliability in real-world applications. To address this issue, this paper proposes a queue-based system that separates request handling from direct execution. By implementing a distributed queue, requests are processed in a structured and controlled manner, preventing system overload and ensuring stable performance. This approach also allows for dynamic scalability, meaning additional resources can be allocated as needed to maintain efficiency. Our experimental results show that this method significantly improves resilience under heavy workloads, preventing resource exhaustion and enabling linear scalability. The findings highlight the effectiveness of a queue-based web service in keeping LLMs responsive even under extreme workloads.
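
The decoupling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: an in-process asyncio.Queue stands in for the distributed queue, and the names QueueService, fake_llm, and run_demo are hypothetical. The point it shows is that request intake only enqueues work, while a fixed pool of workers bounds how many model calls run at once.

```python
import asyncio
import itertools


async def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call; not a real LLM API."""
    await asyncio.sleep(0.01)  # simulate inference latency
    return f"echo: {prompt}"


class QueueService:
    """Accepts requests immediately and lets a fixed worker pool execute them."""

    def __init__(self, num_workers: int = 2, max_pending: int = 100):
        # Bounded queue: when it is full, submit() waits instead of overloading workers.
        self.queue: asyncio.Queue = asyncio.Queue(maxsize=max_pending)
        self.results: dict[int, asyncio.Future] = {}
        self._ids = itertools.count(1)
        self._num_workers = num_workers
        self._workers: list[asyncio.Task] = []
        self.in_flight = 0
        self.peak_in_flight = 0

    async def start(self) -> None:
        self._workers = [
            asyncio.create_task(self._worker()) for _ in range(self._num_workers)
        ]

    async def submit(self, prompt: str) -> int:
        """Request handling: enqueue the job and return an id without running the model."""
        job_id = next(self._ids)
        fut = asyncio.get_running_loop().create_future()
        self.results[job_id] = fut
        await self.queue.put((prompt, fut))
        return job_id

    async def result(self, job_id: int) -> str:
        """Wait for the worker pool to finish the given job."""
        try:
            return await self.results[job_id]
        finally:
            del self.results[job_id]

    async def _worker(self) -> None:
        """Execution: pull one job at a time, so concurrency never exceeds the pool size."""
        while True:
            prompt, fut = await self.queue.get()
            self.in_flight += 1
            self.peak_in_flight = max(self.peak_in_flight, self.in_flight)
            try:
                fut.set_result(await fake_llm(prompt))
            except Exception as exc:
                fut.set_exception(exc)
            finally:
                self.in_flight -= 1
                self.queue.task_done()

    async def stop(self) -> None:
        for worker in self._workers:
            worker.cancel()
        await asyncio.gather(*self._workers, return_exceptions=True)


async def run_demo(n_requests: int = 20, num_workers: int = 2):
    service = QueueService(num_workers=num_workers)
    await service.start()
    job_ids = [await service.submit(f"req {i}") for i in range(n_requests)]
    answers = [await service.result(job_id) for job_id in job_ids]
    await service.stop()
    return answers, service.peak_in_flight


if __name__ == "__main__":
    answers, peak = asyncio.run(run_demo())
    print(len(answers), "answers; peak concurrent model calls:", peak)
```

In this sketch, adding capacity means raising num_workers; in a distributed deployment the analogous step would be attaching more worker processes to a shared queue, which is the kind of dynamic scaling the abstract refers to.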

Details

Title: Vulnerability to Stability
Publication Details: Proceedings: 2025 IEEE 49th Annual Computers, Software, and Applications Conference COMPSAC 2025, pp.995-1000
Resource Type: Conference proceeding
Conference: Annual Computers, Software, and Applications Conference (COMPSAC), 49th (Toronto, Ontario, Canada, 07/08/2025–07/11/2025)
Publisher: IEEE; ieeexplore
Series: IEEE Annual International Computer Software and Applications Conference
Number of pages: 6
Grant note: National Science Foundation: 2433800, 2421324, 1946442; National Institutes of Health: 5R42LM014356 03
The work is supported by the National Science Foundation under Awards #2433800, #2421324, and #1946442 and by National Institutes of Health Grant #5R42LM014356 03. Any opinions, findings, and recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF and NIH.
Identifiers: WOS:001575960000122; 99381484295006600
Academic Unit: Center for Cybersecurity and AI
Language: English
