Cybersecurity is a field in which data evolves continuously as attackers attempt to exploit systems with novel attack patterns, and intrusion detection systems are central to defending against them. Network intrusion detection systems based on machine learning and deep learning have shown promising results. However, higher accuracy often comes at the cost of increased model complexity, which reduces interpretability. This lack of transparency makes such models hard to deploy in real-world settings, because the reasoning behind their decisions remains unclear. In this work, we apply data engineering techniques that make non-text data usable by large language models (LLMs). A multi-attention mechanism converts the data into a vectorized form that is more effective for training, because each embedding captures the relationships, dependencies, and connections between an instance and all other data. We then fine-tune bidirectional RoBERTa models on this embedded dataset. More broadly, the approach shows how datasets designed for traditional ML models can be reused for LLMs, reducing the need to create new LLM-specific datasets. Finally, we use the model explainability methods SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) to determine which features are more or less significant, thereby clarifying the rationale behind each decision, including which attributes influence the detection of a cyberattack and to what extent.
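To make the idea of feature attribution concrete, the following is a minimal sketch of the exact Shapley value computation that SHAP approximates. It is not the authors' pipeline and does not use the shap library: the feature names, the toy scoring function `attack_score`, and the all-zero baseline are hypothetical, chosen only so the result can be checked by hand. A feature that is "absent" from a coalition is replaced by its baseline value, which is one common convention among several.

```python
from itertools import combinations
from math import factorial

# Hypothetical network-flow features (illustrative, not from the paper).
FEATURES = ["duration", "src_bytes", "failed_logins"]


def attack_score(x):
    """Toy detector: a higher score means more attack-like (assumption)."""
    return (
        0.5 * x["src_bytes"]
        + 2.0 * x["failed_logins"]
        + 1.0 * x["failed_logins"] * x["duration"]
    )


def shapley_values(model, instance, baseline):
    """Exact Shapley values by enumerating every coalition of features.

    Features outside the coalition take their baseline value. The cost is
    exponential in the number of features, so this only suits tiny examples.
    """
    n = len(FEATURES)

    def value(coalition):
        mixed = {
            f: (instance[f] if f in coalition else baseline[f])
            for f in FEATURES
        }
        return model(mixed)

    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):  # size of the coalition that excludes f
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without_f = set(subset)
                total += weight * (value(without_f | {f}) - value(without_f))
        phi[f] = total
    return phi


instance = {"duration": 2.0, "src_bytes": 4.0, "failed_logins": 3.0}
baseline = {"duration": 0.0, "src_bytes": 0.0, "failed_logins": 0.0}
phi = shapley_values(attack_score, instance, baseline)

for name, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {contribution:.3f}")
```

The attributions sum to the difference between the score of the instance and the score of the baseline, which is the efficiency property of Shapley values. The interaction term between `failed_logins` and `duration` is split equally between those two features. In practice, libraries such as SHAP estimate these values with sampling or model-specific shortcuts, because full enumeration is infeasible for realistic feature counts.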
Details
Title
Explaining Network Intrusion Detection System with SHAP and LIME
Publication Details
IEEE International Conference on Big Data, pp. 4325-4332
Resource Type
Conference proceeding
Conference
IEEE International Conference on Big Data (BigData) (Macau, China, 12/08/2025–12/11/2025)
Publisher
IEEE
Grant note
5R42LM014356-03 / National Institutes of Health (NIH) (10.13039/100000002)
1946442, 2421324, 2433800 / National Science Foundation (NSF) (10.13039/100000001)