List of works
Conference proceeding
Forensic Investigation of Synthetic Voice Spoofing Detection in Social App
Published 05/08/2025
ACMSE 2025: Proceedings of the 2025 ACM Southeast Conference, 263 - 268
ACMSE 2025: ACM Southeast Conference, 04/24/2025–04/26/2025, Cape Girardeau, Missouri, USA
With the rapid growth of social applications, the misuse of synthetic voice generation technologies poses a significant security threat. Voice spoofing, where artificial voices are generated to impersonate real individuals, is a growing concern in various domains, including online communication, authentication, and social media interactions. This paper presents a forensic investigation into the detection of synthetic voice spoofing within social apps using deep learning techniques. The study integrates a Convolutional Neural Network (CNN) with a Temporal Convolutional Network (TCN) in a hybrid architecture: a lightweight MobileNet CNN first extracts spatial features from Mel-spectrograms, which are then analyzed by the TCN to capture sequential patterns. On the fake-or-real (FoR) dataset, the model achieved a training precision of 99.89% and a validation accuracy of 99.79% on the for-norm variant, and a training precision of 99.79% and a validation accuracy of 94.22% on the for-rerec variant. Evaluation metrics, including a precision-recall curve with an average precision of 99% and a ROC curve with an AUC of 99%, underscore the model's robustness in distinguishing real from synthetic audio, offering a reliable solution for real-time deployment in resource-constrained environments.
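A minimal sketch of the hybrid architecture described above, assuming a TensorFlow/Keras environment, mel-spectrograms resized to 128x128 with 3 channels, and stacked dilated 1D convolutions as a stand-in for the TCN; MobileNetV2 stands in for the lightweight MobileNet backbone, and layer sizes are illustrative rather than the paper's exact configuration.

import tensorflow as tf
from tensorflow.keras import layers, models

def build_spoof_detector(input_shape=(128, 128, 3)):
    # Assumption: MobileNetV2 as the lightweight CNN backbone for spatial features.
    backbone = tf.keras.applications.MobileNetV2(
        include_top=False, weights=None, input_shape=input_shape)
    inputs = layers.Input(shape=input_shape)
    x = backbone(inputs)                                   # (batch, 4, 4, 1280)
    # Flatten the spatial grid into a sequence so the TCN-style block can model it.
    x = layers.Reshape((x.shape[1] * x.shape[2], x.shape[3]))(x)
    # TCN stand-in: stacked causal, dilated 1D convolutions capture sequential patterns.
    for dilation in (1, 2, 4):
        x = layers.Conv1D(64, kernel_size=3, padding="causal",
                          dilation_rate=dilation, activation="relu")(x)
    x = layers.GlobalAveragePooling1D()(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)     # real vs. synthetic audio
    return models.Model(inputs, outputs)

model = build_spoof_detector()
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy", tf.keras.metrics.Precision()])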
Conference proceeding
Academic Advising Chatbot Powered with AI Agent
Published 05/08/2025
ACMSE 2025: Proceedings of the 2025 ACM Southeast Conference, 195 - 202
ACMSE 2025: 2025 ACM Southeast Conference, 04/24/2025–04/26/2025, Cape Girardeau, Missouri, USA
Academic advising plays a crucial role in fostering student success. However, challenges such as limited advisor availability can hinder effective support. Generative AI, particularly AI-powered chatbots, offers the potential to enhance student advising in higher education by providing personalized guidance. These technologies help college students find the information and resources needed to create degree plans aligned with their academic goals. This research introduces ARGObot, an intelligent advising system that facilitates student navigation of university policies through automated interpretation of the student handbook as its primary knowledge base. ARGObot enhances accessibility to critical academic policies and procedures, supporting incoming students' success through personalized guidance. Our system integrates a multifunctional agent enhanced by a Large Language Model (LLM). The architecture employs multiple external tools to enhance its capabilities: a Retrieval-Augmented Generation (RAG) system accesses verified university sources; email integration facilitates Human-in-the-Loop (HITL) interaction; and a web search function expands the system's knowledge base beyond predefined constraints. This approach enables the system to provide contextually relevant and verified responses to various student queries. This architecture evolved from our initial implementation based on Gemini 1 Pro, which revealed significant limitations due to its lack of agent-based functionality, resulting in hallucination issues and irrelevant responses. Subsequent evaluation demonstrated that our enhanced version, integrating GPT-4 with the text-embedding-ada-002 model, achieved superior performance across all metrics. This paper also presents a comparative analysis of both implementations, highlighting the architectural improvements and their impact on system performance.
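A minimal retrieval-augmented generation sketch in the spirit of the architecture above, assuming the OpenAI Python client (v1+), GPT-4, and text-embedding-ada-002; the handbook chunks, prompt wording, and function names are hypothetical, not ARGObot's actual code.

import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
handbook_chunks = ["Advising appointments are scheduled through ...",
                   "Degree plans must be approved by ..."]  # placeholder handbook text

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-ada-002", input=texts)
    return np.array([d.embedding for d in resp.data])

chunk_vectors = embed(handbook_chunks)

def answer(question, k=2):
    q_vec = embed([question])[0]
    # Cosine similarity between the question and every handbook chunk.
    sims = chunk_vectors @ q_vec / (
        np.linalg.norm(chunk_vectors, axis=1) * np.linalg.norm(q_vec))
    context = "\n".join(handbook_chunks[i] for i in np.argsort(sims)[-k:])
    chat = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "system",
                   "content": "Answer only from the provided handbook excerpts."},
                  {"role": "user",
                   "content": f"Excerpts:\n{context}\n\nQuestion: {question}"}])
    return chat.choices[0].message.content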
Conference proceeding
Seeing the Unseen: A Forecast of Cybersecurity Threats Posed by Vision Language Models
Published 12/15/2024
2024 IEEE International Conference on Big Data (BigData), 5664 - 5673
IEEE International Conference on Big Data, 12/15/2024–12/18/2024, Washington, DC, USA
Despite the proven efficacy of large language models (LLMs) like GPT in numerous applications, concerns have emerged regarding their exploitation in creating phishing emails or network intrusions, which have been shown to be detrimental. The multimodal functionalities of large vision-language models (LVLMs) enable them to grasp visual commonsense knowledge. This study investigates the feasibility of using two widely available commercial LVLMs, LLaVA and multimodal GPT-4, for effectively bypassing CAPTCHAs or producing bot-driven fraud through malicious prompts. It was found that these LVLMs can interpret and respond to the visual information presented in image-, puzzle-, and text-based CAPTCHA and reCAPTCHA challenges, thereby potentially circumventing the challenge-response authentication security measure. This capability suggests that such systems could facilitate unauthorized access to secured accounts via remote digital methods. Remarkably, these attacks can be executed with the standard, unaltered versions of the LVLMs, eliminating the need for previous adversarial methods like jailbreaking.
Conference proceeding
AXNav: Replaying Accessibility Tests from Natural Language
Published 05/11/2024
CHI '24: Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 962
CHI '24: Conference on Human Factors in Computing Systems, 05/11/2024–05/16/2024, Honolulu, Hawaii, USA
Developers and quality assurance testers often rely on manual testing to test accessibility features throughout the product lifecycle. Unfortunately, manual testing can be tedious, often has an overwhelming scope, and can be difficult to schedule amongst other development milestones. Recently, Large Language Models (LLMs) have been used for a variety of tasks including automation of UIs. However, to our knowledge, no one has yet explored the use of LLMs in controlling assistive technologies for the purposes of supporting accessibility testing. In this paper, we explore the requirements of a natural language based accessibility testing workflow, starting with a formative study. From this we build a system that takes a manual accessibility test instruction in natural language (e.g., “Search for a show in VoiceOver”) as input and uses an LLM combined with pixel-based UI Understanding models to execute the test and produce a chaptered, navigable video. In each video, to help QA testers, we apply heuristics to detect and flag accessibility issues (e.g., Text size not increasing with Large Text enabled, VoiceOver navigation loops). We evaluate this system through a 10-participant user study with accessibility QA professionals who indicated that the tool would be very useful in their current work and performed tests similarly to how they would manually test the features. The study also reveals insights for future work on using LLMs for accessibility testing.
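An illustrative sketch of one heuristic of the kind mentioned above, flagging text that fails to scale after Large Text is enabled; the element dictionaries standing in for pixel-based UI detections are a hypothetical format, not the paper's data structures.

def flag_large_text_issues(before, after, min_growth=1.05):
    """before/after: lists of hypothetical detections {'id': str, 'kind': str, 'height': float}."""
    after_by_id = {e["id"]: e for e in after}
    issues = []
    for element in before:
        if element["kind"] != "text":
            continue
        match = after_by_id.get(element["id"])
        # Flag text elements whose rendered height did not grow after enabling Large Text.
        if match and match["height"] < element["height"] * min_growth:
            issues.append(f"Text element {element['id']} did not scale "
                          f"({element['height']:.0f}px -> {match['height']:.0f}px)")
    return issues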
Conference proceeding
Investigating Gender and Racial Bias in ELECTRA
Published 12/2022
2022 International Conference on Computational Science and Computational Intelligence (CSCI), 127 - 133
International Conference on Computational Science and Computational Intelligence (CSCI), 12/14/2022–12/16/2022, Las Vegas, Nevada, USA
With the increased adoption of natural language processing models in industrial applications such as hiring and recruitment, chatbots, social media monitoring, and targeted advertising, pretrained language models (PTMs) need fair and equal behavior across all demographic groups. ELECTRA has substantially outperformed BERT by predicting the original identities of corrupted tokens over all input tokens rather than just the small subset that was masked out. Given this enhancement and ELECTRA's roughly 1/4 compute requirement, it is a strong candidate for industrial applications. Therefore, it is crucial to understand its underlying architecture and tokenization protocol to identify any potential discrimination towards specific groups. This paper shows that ELECTRA's pretrained network operates fairly, accurately classifying token replacements. This result is obtained by using a dataset of racially and gender-associated personal names, fine-tuning ELECTRA on the General Language Understanding Evaluation (GLUE) benchmark, and analyzing the interactions of encoders and decoders using the Contextualized Embedding Association Test (CEAT) and a sentiment association test. In addition, this paper demonstrates that, once fully trained, ELECTRA can achieve bias-aware, fair predictions with higher accuracy on downstream tasks. The project also investigates the generator's and discriminator's predictions on an initial word's token using Named Entity Recognition (NER) and Part-of-Speech (POS) tagging.
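A minimal sketch of probing ELECTRA's discriminator for replaced-token predictions, assuming the Hugging Face transformers library and the google/electra-small-discriminator checkpoint; the example sentence is illustrative.

import torch
from transformers import ElectraForPreTraining, ElectraTokenizerFast

name = "google/electra-small-discriminator"  # assumed publicly available checkpoint
tokenizer = ElectraTokenizerFast.from_pretrained(name)
model = ElectraForPreTraining.from_pretrained(name)

sentence = "Latisha interviewed for the engineering position."  # illustrative probe
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits.squeeze(0)

# A higher score means the discriminator believes the token was replaced by the
# generator; comparing per-token scores across name groups supports bias probing.
for token, score in zip(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]),
                        torch.sigmoid(logits)):
    print(f"{token:>15s}  p(replaced) = {score:.3f}")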
Conference proceeding
Targeted Data Extraction and Deepfake Detection with Blockchain Technology
Published 10/22/2022
2022 6th International Conference on Universal Village (UV), 1 - 7
International Conference on Universal Village (UV), 10/22/2022–10/25/2022, Boston, Massachusetts, USA
By recording instances of significant forensic relevance, smartphones, which are becoming increasingly crucial for documenting ordinary life events, can produce pieces of evidence in court. Due to privacy or other concerns, not everyone is open to having all the data on their phone collected and analyzed. In addition, law enforcement organizations need substantial storage to keep the information taken from a witness's phone. Deepfakes, which are purposefully utilized as a source of disinformation, manipulation, harassment, and persuasion in court, present another significant problem for law enforcement organizations. Recently, the introduction of blockchain has altered the way we conduct business. Decentralized Applications (Dapps) may be an excellent way to verify the accuracy of data, stop the spread of false information, extract specific data with precision, and offer a sharing framework that takes privacy and storage concerns into account. This article outlines the creation of a Dapp that provides users with a secure conduit for distributing verified evidence. By utilizing machine learning (ML) classifiers, this platform not only distinguishes between altered and original material before admitting it, but also uses user-uploaded media to retrain its models to increase prediction accuracy and offer complete transparency. The end result is a clear record (timestamp) of the occurrence, the submitted proof, and helpful metadata, maintained with the aid of the blockchain's consensus mechanism.
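An illustrative sketch of the screening step described above, in which an ML classifier vets uploaded media before the Dapp admits it and accepted media are queued for retraining; the feature extractor and classifier interface are hypothetical placeholders, not the paper's implementation.

def extract_features(media_bytes):
    # Placeholder feature extractor so the sketch is self-contained; in practice,
    # frame- or spectrogram-level features would feed the deepfake classifier.
    return [len(media_bytes) % 256]

def screen_upload(media_bytes, classifier, retrain_pool, threshold=0.9):
    # `classifier` is assumed to expose predict_proba, e.g. a scikit-learn model.
    p_original = classifier.predict_proba([extract_features(media_bytes)])[0][1]
    if p_original < threshold:
        return {"accepted": False, "p_original": float(p_original)}
    # Accepted media later feed a retraining pass, as described in the abstract.
    retrain_pool.append((media_bytes, "original"))
    return {"accepted": True, "p_original": float(p_original)}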
Conference proceeding
Digital Evidence Acquisition and Deepfake Detection with Decentralized Applications
Published 07/22/2022
PEARC '22: Practice and Experience in Advanced Research Computing 2022: Revolutionary: Computing, Connections, You, 87
PEARC '22: Revolutionary: Computing, Connections, You, 07/10/2022–07/14/2022, Boston, Massachusetts, USA
Given the rise of digital technology and communication, there is a higher chance of smartphones containing shreds of evidence related to an incident. The variety of digital evidence sources, the creation and sharing of information, and incidents within forums and other online broadcasting media pose new and challenging problems for digital investigators. Three of the most significant obstacles are: 1) authentication of the evidence, 2) acquisition, and 3) storage and analysis. Blockchain, by offering a decentralized network and an IPFS hash storage system, can be a great solution to the acquisition and storage challenges. Machine learning (ML), as one of the leading solutions for the identification and authentication of evidence, can provide the best performance in the detection of deepfake media. Our proposed framework, by combining machine learning and the decentralized nature of Dapps, is designed to offer authenticity, immutability, traceability, robustness, and distributed trust between evidence entities and examiners. To keep storage costs and resources minimal, avoid the full consent/warrant process, and extract only the relevant data, our implementation is based on the assumption of voluntary media upload by those who were present at the crime scene.
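An illustrative sketch of content-addressed evidence storage in the spirit of the framework above, with a local dictionary standing in for IPFS and a plain list standing in for the blockchain ledger; the record fields and names are assumptions, not the framework's actual contract.

import hashlib
import time

off_chain_store = {}   # stand-in for IPFS: bulk media addressed by its content hash
ledger = []            # stand-in for the blockchain: append-only, timestamped records

def submit_evidence(media_bytes, uploader_id, description=""):
    content_hash = hashlib.sha256(media_bytes).hexdigest()
    off_chain_store[content_hash] = media_bytes          # bulk data stays off chain
    ledger.append({
        "hash": content_hash,                            # immutable reference to the media
        "uploader": uploader_id,
        "description": description,
        "timestamp": int(time.time()),                   # provenance record
    })
    return content_hash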
Conference proceeding
Applying Machine Learning to Analyze Anti-Vaccination on Tweets
Published 12/15/2021
2021 IEEE International Conference on Big Data (Big Data), 4426 - 4430
IEEE International Conference on Big Data (Big Data), 12/15/2021–12/18/2021, Orlando, Florida, USA
Inspection of anti-COVID-vaccination tweets can be useful for many analyses, including the extraction of relevant information about opinions expressed on Twitter. This study proposes an analytical framework for analyzing tweets (about the COVID vaccine, especially anti-COVID-vaccine tweets) to identify and categorize fine-grained details about the COVID-19 disaster, such as affected individuals, public feelings towards the vaccine and the reopening of businesses, the polarity of public opinions on the vaccine and services provided, how discussed topics change over time, and different clustering algorithms. In this project, we analyzed COVID-vaccine-related tweets and anti-vaccine tweets, performed sentiment analysis and topic modeling, and compared various models' behavior across different configurations and training datasets. The results of this work will help policymakers and data scientists identify the best approach for Twitter sentiment analysis and topic modeling, as well as provide feedback on people's attitudes and opinions on the COVID-19 vaccine.
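A minimal sketch of the two analyses mentioned above on a handful of example tweets, assuming NLTK's VADER sentiment analyzer and scikit-learn's LDA implementation; the tweets and parameter choices are illustrative only.

from nltk.sentiment import SentimentIntensityAnalyzer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

tweets = ["The vaccine rollout is going well in my city",
          "I will never take this rushed vaccine",
          "Businesses reopening too soon worries me"]  # illustrative examples

# Polarity of each tweet (requires nltk.download('vader_lexicon') once).
sia = SentimentIntensityAnalyzer()
for t in tweets:
    print(t, "->", sia.polarity_scores(t)["compound"])

# Topic modeling: bag-of-words counts fed to LDA, then top words per topic.
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(tweets)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top_words = [terms[j] for j in topic.argsort()[-3:]]
    print(f"Topic {i}: {top_words}")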
Conference proceeding
A Personalized Learning Framework for Software Vulnerability Detection and Education
Published 11/2021
2021 International Symposium on Computer Science and Intelligent Controls (ISCSIC), 119 - 126
International Symposium on Computer Science and Intelligent Controls (ISCSIC), 11/12/2021–11/14/2021, Rome, Italy
Software has become a necessity for many societal industries, including technology, health care, public safety, education, energy, and transportation. Therefore, training our future software developers to write secure source code is in high demand. With the advent of data-driven techniques, there is now a growing interest in leveraging machine learning and natural language processing (NLP) as a source code assurance method to build trustworthy systems. In this work, we propose a framework including learning modules and hands-on labs to guide future IT professionals towards developing secure programming habits and mitigating source code vulnerabilities at the early stages of the software development lifecycle, following the concept of the Secure Software Development Life Cycle (SSDLC). Our goal is to prepare a set of hands-on labs that introduce students to secure programming habits using source code and log file analysis tools to predict, identify, and mitigate vulnerabilities. In summary, we develop a framework that will (1) improve students' skills and awareness of source code vulnerabilities, detection tools, and mitigation techniques, (2) integrate concepts of source code vulnerabilities, from the function, API, and library level to bad programming habits and practices, and (3) leverage deep learning, NLP, and static analysis tools for log file analysis to introduce the root causes of source code vulnerabilities.
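An illustrative sketch of the kind of lightweight static check such a hands-on lab might start from, flagging calls to C functions commonly tied to buffer-overflow vulnerabilities; the function list and framing are assumptions, not the framework's actual tooling.

import re

RISKY_CALLS = {"strcpy", "strcat", "sprintf", "gets"}  # classic unsafe C APIs

def flag_risky_calls(source_code):
    findings = []
    for lineno, line in enumerate(source_code.splitlines(), start=1):
        for call in RISKY_CALLS:
            # Match the function name followed by an opening parenthesis.
            if re.search(rf"\b{call}\s*\(", line):
                findings.append((lineno, call, line.strip()))
    return findings

example = 'void greet(char *name) { char buf[8]; strcpy(buf, name); }'
print(flag_risky_calls(example))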
Conference proceeding
Inherent Discriminability of BERT Towards Racial Minority Associated Data
Published 2021
Computational Science and Its Applications – ICCSA 2021, Volume 3, 12951, 256 - 271
International Conference on Computational Science and Its Applications (ICCSA 2021), 09/13/2021–09/16/2021, Cagliari, Italy
AI and BERT (Bidirectional Encoder Representations from Transformers) have been increasingly adopted in the human resources (HR) industry for recruitment. The increased efficiency and fairness are expected to help remove biases in machine learning, help organizations find qualified candidates, and reduce bias in the labor market. BERT has further improved the performance of language representation models by using an auto-encoding model that incorporates larger bidirectional contexts. However, BERT's underlying mechanisms that enhance its effectiveness, such as tokenization, masking, and leveraging the attention mechanism to compute vector scores, are not well understood.
This research analyzes how BERT's architecture and tokenization protocol affect minority-related data with low occurrence counts, using the cosine similarity of its embeddings. In this project, by using a dataset of racially and gender-associated personal names and analyzing the interactions of transformers, we present the unfair prejudice of BERT's pre-trained network and autoencoding model. Furthermore, by analyzing the cosine similarity between an initial word's token and its [MASK] replacement token, we demonstrate the inherent discriminability that arises during pre-training. Finally, this research delivers potential solutions to mitigate discrimination and bias in BERT by examining its geometric properties.
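A minimal sketch of the cosine-similarity probe described above, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint; the two example sentences, differing only in the personal name, are illustrative.

import torch
from transformers import BertTokenizerFast, BertModel

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

def sentence_vector(sentence):
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state.squeeze(0)
    # Mean over subword tokens (excluding [CLS]/[SEP]) as a simple sentence-level vector.
    return hidden[1:-1].mean(dim=0)

a = sentence_vector("Emily applied for the manager position.")
b = sentence_vector("Lakisha applied for the manager position.")
similarity = torch.nn.functional.cosine_similarity(a, b, dim=0)
print(f"cosine similarity: {similarity.item():.3f}")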