Analyzing the Behavior of LLM Under Concurrency and Token-Based DoS Attacks
Conference proceeding   Peer reviewed

Md Abdul Barek, A. B. M Kamrul Islam Riad, Md Bajlur Rashid, Guillermo Francia, Hossain Shahriar and Sheikh Iqbal Ahamed
IEEE International Conference on Dependable, Autonomic and Secure Computing (Online), pp.72-81
IEEE Conference on Dependable, Autonomic and Secure Computing (DASC) (Hakodate, Japan, 10/21/2025–10/24/2025)
10/21/2025

Abstract

A Large Language Model (LLM) is an AI system that uses deep learning and extensive data to comprehend, process, and produce human-like language. LLMs have shown promising results and gained widespread acceptance worldwide. However, this widespread adoption and the growing number of LLM applications have also made them more attractive targets for malicious attackers. Attackers continuously develop new strategies to disrupt computer systems, and LLMs are attacked in many ways. In addition, LLMs have intrinsic vulnerabilities and limitations that expose them to a range of cyberattacks. Among these, Denial of Service (DoS) is a prevalent threat that degrades the functionality of LLMs by inundating them with excessive input, or even makes them completely unavailable. In addition to traditional concurrency-based DoS attacks, we explore a novel prompt-based attack in which repeating tokens within a single input prompt can internally overload the LLM and trigger failure, even without concurrency. This paper examines how LLMs react to such attacks, demonstrating their behavior under varying levels of DoS load and comparing the outcomes. It also investigates how repeated-token prompt structures can cause instability, revealing a new class of input-driven DoS vulnerabilities in LLMs. Our experiments on three open-source LLMs confirm that both high concurrency and repeated-token inputs can significantly degrade performance, increase response time, and even lead to system crashes under high load. Furthermore, to address the concurrency-based DoS attack, we have implemented and validated a queue-based mitigation approach in our companion work.
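
To make the idea of queue-based mitigation concrete, the following is a minimal sketch of one common way such a scheme can be built: a bounded queue in front of a fixed pool of workers, so that a burst of concurrent requests is either queued or rejected instead of reaching the model all at once. It is not the implementation from the companion work, whose details are not given here. The names `fake_llm`, `QueuedGateway`, `workers`, and `max_pending` are hypothetical, and the model call is a stand-in that only sleeps.

```python
import asyncio


async def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; it only sleeps briefly.
    await asyncio.sleep(0.01)
    return f"echo:{prompt}"


class QueuedGateway:
    """Bounded queue plus a fixed worker pool in front of a backend (illustrative only)."""

    def __init__(self, backend, workers: int = 2, max_pending: int = 4):
        self.backend = backend
        self.workers = workers
        self.queue = asyncio.Queue(maxsize=max_pending)
        self.peak_in_flight = 0  # highest number of simultaneous backend calls seen
        self._in_flight = 0
        self._tasks = []

    async def start(self):
        self._tasks = [asyncio.create_task(self._worker()) for _ in range(self.workers)]

    async def stop(self):
        for task in self._tasks:
            task.cancel()
        await asyncio.gather(*self._tasks, return_exceptions=True)

    async def _worker(self):
        while True:
            prompt, fut = await self.queue.get()
            self._in_flight += 1
            self.peak_in_flight = max(self.peak_in_flight, self._in_flight)
            try:
                fut.set_result(await self.backend(prompt))
            except Exception as exc:
                fut.set_exception(exc)
            finally:
                self._in_flight -= 1
                self.queue.task_done()

    async def submit(self, prompt: str):
        # Returns the backend's answer, or None if the queue is full (load shed).
        fut = asyncio.get_running_loop().create_future()
        try:
            self.queue.put_nowait((prompt, fut))
        except asyncio.QueueFull:
            return None
        return await fut


async def demo(n_requests: int = 20):
    # Fire a burst of concurrent requests at the gateway and count the outcomes.
    gateway = QueuedGateway(fake_llm, workers=2, max_pending=4)
    await gateway.start()
    results = await asyncio.gather(*(gateway.submit(f"p{i}") for i in range(n_requests)))
    await gateway.stop()
    served = sum(r is not None for r in results)
    return served, n_requests - served, gateway.peak_in_flight


if __name__ == "__main__":
    served, rejected, peak = asyncio.run(demo())
    print(f"served={served} rejected={rejected} peak_in_flight={peak}")
```

In this sketch the backend never sees more than `workers` simultaneous calls, whatever the size of the incoming burst. Rejecting requests when the queue is full is one design choice; a real deployment might instead make callers wait with a timeout. Note that this kind of admission control addresses only concurrency, not the single-prompt repeated-token attack described above.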
