List of works
Conference proceeding
Published 10/06/2025
MILCOM 2025 - 2025 IEEE Military Communications Conference (MILCOM)
IEEE Military Communications Conference (MILCOM), 10/06/2025–10/10/2025, Los Angeles, California, USA
Backdoor attacks pose a critical threat by embedding hidden triggers into inputs, causing models to misclassify them into adversary-chosen target labels. While extensive research has focused on mitigating these attacks in object recognition models through weight fine-tuning and other reactive strategies, much less attention has been given to detecting backdoored samples directly. Given the vast datasets used to train models, manual inspection for backdoor triggers is impractical, and even state-of-the-art defense mechanisms fail to fully neutralize their impact. To address this gap, we introduce a method to detect unseen backdoored images during both training and inference. Leveraging the success of prompt tuning in Vision Language Models (VLMs), our approach trains learnable text prompts to differentiate clean images from those containing hidden backdoor triggers. Comprehensive experiments on CIFAR-10 and GTSRB covering six diverse attack families demonstrate the robustness of our detector. When exposed to unseen backdoor threats, the learned prompts achieve an average accuracy of 86% in distinguishing previously unseen backdoored images from clean ones, outperforming baselines by up to 30 percentage points. These results establish prompt-tuned VLMs as an effective first line of defense against backdoor threats. Code and datasets will be available.
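The abstract does not include code; the following is a minimal sketch of the prompt-tuning idea it describes, assuming a frozen CLIP-style model that exposes an `encode_image` method. Pooling the soft prompts directly into class embeddings (rather than passing them through the text encoder, as a full CoOp-style setup would) is a simplification for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptTunedDetector(nn.Module):
    """Binary clean-vs-backdoored classifier driven by learnable prompts."""

    def __init__(self, vlm, embed_dim=512, n_ctx=8):
        super().__init__()
        self.vlm = vlm  # frozen CLIP-style model, assumed to expose encode_image
        for p in self.vlm.parameters():
            p.requires_grad = False
        # Two learnable soft prompts, one per class (clean / backdoored).
        self.prompts = nn.Parameter(0.02 * torch.randn(2, n_ctx, embed_dim))

    def forward(self, images):
        img = F.normalize(self.vlm.encode_image(images), dim=-1)  # (B, D)
        # Pool each soft prompt into a single class embedding (simplified).
        txt = F.normalize(self.prompts.mean(dim=1), dim=-1)       # (2, D)
        return 100.0 * img @ txt.t()  # logits over {clean, backdoored}

# Only the prompt vectors receive gradients:
# detector = PromptTunedDetector(clip_model)
# opt = torch.optim.AdamW([detector.prompts], lr=1e-3)
# loss = F.cross_entropy(detector(images), labels)  # 0 = clean, 1 = backdoored
```

Because the VLM stays frozen, only a few thousand prompt parameters are trained, which is what makes this practical as a lightweight first-line detector.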
Conference proceeding
A Novel Approach to Fine-tune BERT using Non-Text Features for Enhanced Ransomware Detection
Published 09/06/2025
2025 3rd International Conference on Artificial Intelligence, Blockchain, and Internet of Things (AIBThings)
International Conference on Artificial Intelligence, Blockchain, and Internet of Things (AIBThings), 09/06/2025–09/07/2025, Mt Pleasant, Michigan, USA
The growing complexity and volume of ransomware attacks demand advanced detection techniques that can effectively model dependencies within high-dimensional data. Traditional machine learning methods often struggle to capture nuanced relationships among features in such cybersecurity datasets. To address this problem, we propose a novel technique that transforms structured, non-linguistic data into descriptive natural language, enabling the tailored fine-tuning of a Bidirectional Encoder Representations from Transformers (BERT) architecture with optimized parameters. By embedding non-textual ransomware data as semantic text, our method lets BERT's multi-head self-attention capture token relationships and dependencies from multiple perspectives, transforming each instance into a comprehensive latent representation in which inter-feature dependencies are preserved. This allows BERT to model contextual relevance among tokens, leading to superior classification performance. Our evaluation demonstrates that the resulting model achieves a classification accuracy of 99.21%, surpassing ensemble models (99.0%) and LSTM-based approaches (98.5%). Notably, this approach outperforms existing works while using only one-third of the data points, highlighting its potential and adaptability in cybersecurity domains where no text dataset is available to provide natural language context for the model.
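The central step, serializing structured ransomware features into descriptive sentences before BERT tokenization, might look like the sketch below. The column names and the sentence template are hypothetical placeholders; the paper's actual schema is not given in the abstract.

```python
from transformers import AutoTokenizer

# Hypothetical feature columns; the paper's real schema is not reproduced here.
def row_to_text(row: dict) -> str:
    """Serialize one structured sample into a descriptive sentence."""
    return (
        f"The process made {row['api_calls']} API calls, "
        f"modified {row['files_modified']} files, "
        f"opened {row['network_connections']} network connections, "
        f"and showed {row['entropy']:.2f} average entropy."
    )

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
sample = {"api_calls": 412, "files_modified": 35,
          "network_connections": 7, "entropy": 7.91}
encoded = tokenizer(row_to_text(sample), truncation=True,
                    padding="max_length", max_length=64,
                    return_tensors="pt")
# `encoded` can now be fed to a BertForSequenceClassification model
# fine-tuned on the ransomware / benign label.
```

The point of the serialization is that self-attention then operates over feature mentions as tokens, letting the model relate them contextually rather than as independent columns.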
Journal article
Reinforcement Learning in Medical Imaging: Taxonomy, LLMs, and Clinical Challenges
Published 08/30/2025
Future Internet, 17, 9, 396
Reinforcement learning (RL) is increasingly applied in medical imaging for segmentation, detection, registration, and classification. This survey provides a comprehensive overview of RL techniques in this domain, categorizing the literature by clinical task, imaging modality, learning paradigm, and algorithmic design. We introduce a unified taxonomy that supports reproducibility, offers design guidance, and identifies underexplored intersections. Furthermore, we examine the integration of Large Language Models (LLMs) for automation and interpretability, and discuss privacy-preserving extensions using Differential Privacy (DP) and Federated Learning (FL). Finally, we address deployment challenges and outline future research directions toward trustworthy and scalable medical RL systems.
Conference proceeding
Published 08/26/2025
Proceedings: 2025 IEEE 49th Annual Computers, Software, and Applications Conference COMPSAC 2025, 1136–1145
Annual Computers, Software, and Applications Conference (COMPSAC), 07/08/2025–07/11/2025, Toronto, Ontario, Canada
The increasing use of high-dimensional imaging in medical AI raises significant privacy and security concerns. This paper presents a Bootstrap Your Own Latent (BYOL)-based self-supervised learning (SSL) framework for secure image processing, ensuring compliance with HIPAA and privacy-preserving machine learning (PPML) techniques. Our method integrates federated learning, homomorphic encryption, and differential privacy to enhance security while reducing dependence on labeled data. Experimental results on the MNIST and NIH Chest X-ray datasets demonstrate classification accuracies of 97.5% and 99.99% (40% before fine-tuning), with improved clustering performance using K-Means (Silhouette Score: 0.5247). These findings validate BYOL's capability for robust, privacy-preserving image processing while emphasizing the need for fine-tuning to optimize classification performance.
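For readers unfamiliar with BYOL, here is a minimal sketch of the online/target mechanics the framework builds on: an online network predicts the target network's projection of a second augmented view, and the target is an exponential moving average (EMA) of the online weights. The MLP encoder is a placeholder sized for flattened MNIST images, and the federated, encrypted, and differentially private layers described above are omitted.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

def byol_loss(p, z):
    """Negative cosine similarity between prediction p and target projection z."""
    return 2 - 2 * F.cosine_similarity(p, z.detach(), dim=-1).mean()

# Placeholder networks; a real setup would use a ResNet-style encoder.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(),
                        nn.Linear(256, 128))
predictor = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 128))
target = copy.deepcopy(encoder)          # target network, updated by EMA only
for p in target.parameters():
    p.requires_grad = False

opt = torch.optim.Adam(list(encoder.parameters()) +
                       list(predictor.parameters()), lr=3e-4)

def train_step(view1, view2, tau=0.996):
    # Symmetrized BYOL objective over two augmented views of the same batch.
    loss = (byol_loss(predictor(encoder(view1)), target(view2)) +
            byol_loss(predictor(encoder(view2)), target(view1)))
    opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():                # EMA update of the target network
        for pt, po in zip(target.parameters(), encoder.parameters()):
            pt.mul_(tau).add_(po, alpha=1 - tau)
    return loss.item()

# v1, v2 = augment(batch), augment(batch)   # two views of the same images
# train_step(v1, v2)
```

No negative pairs are needed; the stop-gradient on the target and the EMA update are what prevent representational collapse.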
Conference proceeding
Embedding with Large Language Models for Classification of HIPAA Safeguard Compliance Rules
Published 08/26/2025
Proceedings: 2025 IEEE 49th Annual Computers, Software, and Applications Conference COMPSAC 2025, 1040–1046
Annual Computers, Software, and Applications Conference (COMPSAC), 07/08/2025–07/11/2025, Toronto, Ontario, Canada
Although software developers of mHealth apps are responsible for protecting patient data and adhering to strict privacy and security requirements, many lack awareness of HIPAA regulations and struggle to distinguish between categories of HIPAA rules. Guidance on classifying HIPAA rule patterns is therefore essential for developing secure applications for the Google Play Store. In this work, we identify the limitations of traditional Word2Vec embeddings for processing code patterns and address them by adopting multilingual BERT (Bidirectional Encoder Representations from Transformers), which provides contextualized embeddings of the dataset's attributes. We apply this BERT model to embed code patterns and feed the resulting embeddings to various machine learning approaches. Our results demonstrate that these models significantly enhance classification performance, with Logistic Regression achieving a remarkable accuracy of 99.95%. We also obtain high accuracy from Support Vector Machine (99.79%), Random Forest (99.73%), and Naive Bayes (95.93%), outperforming existing approaches. This work underscores the effectiveness of contextualized embeddings for this task and showcases their potential for secure application development.
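A minimal sketch of the embed-then-classify pipeline described above, using multilingual BERT [CLS] vectors as features for scikit-learn's Logistic Regression. The code snippets and safeguard labels below are invented placeholders, not the paper's dataset.

```python
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
bert = AutoModel.from_pretrained("bert-base-multilingual-cased").eval()

@torch.no_grad()
def embed(snippets):
    """[CLS] embeddings for a batch of code-pattern strings."""
    batch = tok(snippets, padding=True, truncation=True,
                max_length=128, return_tensors="pt")
    return bert(**batch).last_hidden_state[:, 0, :].numpy()

# Hypothetical labeled code patterns (the real dataset is much larger).
X_train = embed(["cipher = AES.new(key, AES.MODE_GCM)",
                 "log.info('user %s viewed record', uid)"])
y_train = ["technical_safeguard", "audit_control"]

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(clf.predict(embed(["session.timeout = 15"])))
```

The contextualized embeddings do the heavy lifting here, which is why even a linear classifier on top of them can perform strongly.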
Conference proceeding
Vulnerability to Stability: Scalable Large Language Model in Queue-Based Web Service
Published 08/26/2025
Proceedings: 2025 IEEE 49th Annual Computers, Software, and Applications Conference COMPSAC 2025, 995–1000
Annual Computers, Software, and Applications Conference (COMPSAC), 07/08/2025–07/11/2025, Toronto, Ontario, Canada
Large Language Models (LLMs) have demonstrated exceptional capabilities in the field of Artificial Intelligence (AI) and are now widely used in applications worldwide. However, one of their major challenges is handling high-concurrency workloads, especially under extreme conditions. When too many requests arrive simultaneously, LLMs often become unresponsive, leading to performance degradation and reduced reliability in real-world applications. To address this issue, this paper proposes a queue-based system that separates request handling from direct execution. By implementing a distributed queue, requests are processed in a structured and controlled manner, preventing system overload and ensuring stable performance. This approach also allows for dynamic scalability: additional resources can be allocated as needed to maintain efficiency. Our experimental results show that this method significantly improves resilience under heavy workloads, preventing resource exhaustion and enabling linear scalability. The findings highlight the effectiveness of a queue-based web service in keeping LLMs responsive even under extreme workloads.
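The queue-based decoupling described above can be illustrated with a small stdlib-only sketch: requests are enqueued rather than executed directly, and a fixed worker pool bounds concurrency. `call_llm` is a stand-in for whatever inference backend or API the actual service uses.

```python
import asyncio

async def call_llm(prompt: str) -> str:
    await asyncio.sleep(0.1)             # placeholder for real inference
    return f"response to: {prompt!r}"

async def worker(queue: asyncio.Queue):
    """Pull requests off the queue and resolve their futures."""
    while True:
        prompt, fut = await queue.get()
        try:
            fut.set_result(await call_llm(prompt))
        finally:
            queue.task_done()

async def main(n_workers: int = 4, n_requests: int = 20):
    queue: asyncio.Queue = asyncio.Queue(maxsize=100)  # backpressure bound
    workers = [asyncio.create_task(worker(queue)) for _ in range(n_workers)]
    futs = []
    for i in range(n_requests):          # simulated burst of client requests
        fut = asyncio.get_running_loop().create_future()
        await queue.put((f"prompt {i}", fut))  # blocks when the queue is full
        futs.append(fut)
    print(await asyncio.gather(*futs))
    for w in workers:
        w.cancel()

asyncio.run(main())
```

Because the model only ever sees `n_workers` concurrent requests, bursts beyond that wait in the queue instead of exhausting resources; scaling out means adding workers (or machines) behind a distributed queue rather than changing the client-facing interface.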
Conference proceeding
Fine-tuned Large Language Models (LLMs): Improved Prompt Injection Attacks Detection
Published 08/26/2025
Proceedings: 2025 IEEE 49th Annual Computers, Software, and Applications Conference COMPSAC 2025, 1033–1039
Annual Computers, Software, and Applications Conference (COMPSAC), 07/08/2025–07/11/2025, Toronto, Ontario, Canada
Large language models (LLMs) have become a popular tool as their capability to tackle a wide range of language-based tasks has advanced significantly. However, LLM applications are highly vulnerable to prompt injection attacks, which pose a critical problem. These attacks target LLM applications with carefully designed input prompts that divert the model from its original instructions, causing it to execute unintended actions. Such manipulations pose serious security threats, potentially resulting in data leaks, biased outputs, or harmful responses. This project explores these security vulnerabilities. To detect whether a prompt contains an injection, we follow two approaches: 1) a pre-trained LLM, and 2) a fine-tuned LLM, and then conduct a thorough analysis and comparison of their classification performance. First, we use a pre-trained XLM-RoBERTa model to detect prompt injections on the test dataset without any fine-tuning, evaluating it via zero-shot classification. We then apply supervised fine-tuning to this pre-trained LLM using a task-specific labeled dataset from deepset on Hugging Face; through rigorous experimentation and evaluation, the fine-tuned model achieves impressive results with 99.13% accuracy, 100% precision, 98.33% recall, and a 99.15% F1-score. We observe that our approach is highly effective in detecting prompt injection attacks.
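A minimal sketch of the fine-tuning setup described above, using the Hugging Face Trainer with XLM-RoBERTa. The dataset identifier, split names, and column names are assumptions based on the abstract's reference to a labeled deepset dataset on Hugging Face.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Assumed dataset id and "text"/"label" columns; adjust to the actual release.
ds = load_dataset("deepset/prompt-injections")
tok = AutoTokenizer.from_pretrained("xlm-roberta-base")

def tokenize(batch):
    return tok(batch["text"], truncation=True, max_length=256)

ds = ds.map(tokenize, batched=True)
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=2)  # 0 = benign, 1 = injection

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="pi-detector",
                           num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=ds["train"],
    eval_dataset=ds["test"],
    tokenizer=tok,                      # enables dynamic padding in collation
)
trainer.train()
print(trainer.evaluate())
```

The zero-shot baseline in the paper's first approach uses the same backbone without this training step, which is what makes the before/after comparison meaningful.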
Conference proceeding
White-box Fuzzing in the Wild: A Chaos Engineering Module for DevOps Security Education
Published 08/26/2025
Proceedings: 2025 IEEE 49th Annual Computers, Software, and Applications Conference COMPSAC 2025, 2387–2393
Annual Computers, Software, and Applications Conference (COMPSAC), 07/08/2025–07/11/2025, Toronto, Ontario, Canada
In today's fast-paced software development environments, DevOps has revolutionized the way teams build, test, and deploy applications by emphasizing automation, collaboration, and continuous integration/continuous delivery (CI/CD). However, with these advancements comes an increased need to address security proactively, giving rise to the DevSecOps movement, which integrates security practices into every phase of the software development lifecycle. DevOps security remains underrepresented in academic curricula despite its growing importance in the industry. To address this gap, this paper presents a hands-on learning module that combines Chaos Engineering and White-box Fuzzing to teach core principles of secure DevOps practices in an authentic, scenario-driven environment. Chaos Engineering allows students to intentionally disrupt systems to observe and understand their resilience, while White-box Fuzzing enables systematic exploration of internal code paths to discover corner-case vulnerabilities that typical tests might miss. The module was deployed across three academic institutions, and both pre- and post-surveys were conducted to evaluate its impact. Pre-survey data revealed that while most students had prior experience in software engineering and cybersecurity, the majority lacked exposure to DevOps security concepts. Post-survey responses gathered through ten structured questions showed highly positive feedback: 66.7% of students strongly agreed, and 22.2% agreed, that the hands-on labs improved their understanding of secure DevOps practices. Participants also reported increased confidence in secure coding, vulnerability detection, and resilient infrastructure design. These findings support the integration of experiential learning techniques like chaos simulations and white-box fuzzing into security education. By aligning academic training with real-world industry needs, this module effectively prepares students for the complex challenges of modern software development and operations.
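To give a flavor of the white-box fuzzing exercises such a module might contain, here is a minimal coverage-guided harness using Google's Atheris fuzzer for Python. The fragile parser is a made-up teaching target, not material from the module itself.

```python
import sys
import atheris

def parse_header(data: bytes) -> int:
    """Hypothetical fragile parser standing in for real target code."""
    if len(data) >= 4 and data[:2] == b"OK":
        length = data[2]
        return data[3:3 + length][-1]  # IndexError when length is 0 (empty slice)
    return 0

def TestOneInput(data: bytes):
    parse_header(data)  # uncaught exceptions are reported as findings

atheris.instrument_all()           # enable coverage instrumentation
atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()
```

Because the fuzzer observes code coverage, it quickly learns to produce inputs starting with `OK` and then mutates the length byte until the parser crashes, a compact demonstration of how white-box feedback reaches corner cases that random testing would miss.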
Conference proceeding
Large Language Model can Reduce the Necessity of Using Large Data Samples for Training Models
Published 05/05/2025
Proceedings: 2025 IEEE Conference on Artificial Intelligence (CAI), 988–991
IEEE Conference on Artificial Intelligence (CAI), 05/05/2025–05/07/2025, Santa Clara, California, USA
This work introduces a novel approach to improving cybersecurity systems, focusing on spam email-based cyberattacks. The proposed technique tackles the challenge of training Machine Learning (ML) models with limited data samples by leveraging Bidirectional Encoder Representations from Transformers (BERT) for contextualized embeddings. Unlike traditional embedding methods, BERT offers a nuanced representation of smaller datasets, enabling more effective ML model training. The methodology uses several pretrained BERT models to generate contextualized embeddings from the data samples, which are then fed to various ML algorithms for training. This approach demonstrates that even with scarce data, BERT embeddings significantly enhance model performance compared to conventional embedding approaches such as Word2Vec. The technique proves especially advantageous when high-quality instances are in short supply. The proposed work outperforms traditional techniques for mitigating phishing attacks with few data samples, achieving a robust accuracy of 99.25% when multilingual BERT (M-BERT) is used to embed the dataset.
Conference proceeding
On the Design and Visualization of Connected Vehicle Security Metrics
Published 04/16/2025
Proceedings of the Third International Conference on Advances in Computing Research (ACR’25), 1346, 358–374
International Conference on Advances in Computing Research (ACR’25)
The rapid advancement of connected and autonomous vehicles has created new challenges for security and safety professionals. The sophistication of vehicle communication systems, both external and internal, adds complexity to the issue; in security parlance, it expands the attack surface of vehicles. These challenges have prompted the enhancement of existing safety and security standards and the development of new ones, initiated by government, industry, and trade organizations. These initiatives clearly underscore the need to examine the state of connected vehicle security and develop effective security metrics. As a major component of continuous improvement, quantitative and qualitative measures must be devised to allow a full assessment of the process. This paper builds upon previous research on connected vehicle security metrics, offers new metrics, and proposes visualization systems to enhance their utilization.