Dr. Eman M El-Sheikh

Associate Vice President of the Center for Cybersecurity and Professor

Cybersecurity Education and Workforce Development

Leadership, Women and Diversity in Cybersecurity

application and evaluation for Cybersecurity and Education

Artificial Intelligence or Cybernetics

Cybersecurity

Higher Education

Intelligent Agents or Systems

Machine Learning

Knowledge-based systems

Intelligent tutoring systems

Software architectures

Journal article Open access Peer reviewed

Packet Inspection Transformer: A Self-Supervised Journey to Unseen Malware Detection with Few Samples

by Kyle Stein, Andrew Arash Mahyari, Guillermo Francia and Eman El-Sheikh

Published 2025

IEEE access, 13, 1

As networks continue to expand and become more interconnected, the need for novel malware detection methods becomes more pronounced. Traditional security measures are increasingly inadequate against the sophistication of modern cyber attacks. Deep Packet Inspection (DPI) has been pivotal in enhancing network security, offering an in-depth analysis of network traffic that surpasses conventional monitoring techniques. DPI not only examines the metadata of network packets, but also dives into the actual content being carried within the packet payloads, providing a comprehensive view of the data flowing through networks. While the integration of advanced deep learning techniques with DPI has introduced modern methodologies into malware detection and network traffic classification, state-of-the-art supervised learning approaches are limited by their reliance on large amounts of annotated data and their inability to generalize to novel, unseen malware threats. To address these limitations, this paper leverages the recent advancements in self-supervised learning (SSL) and few-shot learning (FSL). Our proposed self-supervised approach trains a transformer via SSL to learn the embeddings of packet content, including payload, from vast amounts of unlabeled data by masking portions of packets, leading to a learned representation that generalizes to various downstream tasks. Once the representations are extracted from the packets, they are used to train a malware detection algorithm. The representation obtained from the transformer is then used to adapt the malware detector to novel types of attacks using few-shot learning approaches. Our experimental results demonstrate that our method achieves classification accuracies of up to 94.76% on the UNSW-NB15 dataset and 83.25% on the CIC-IoT23 dataset.
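The masking step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `mask_packet`, the `MASK_TOKEN` id, and the 15% default ratio are all assumptions chosen for illustration. The idea is only that a model would be trained to predict the hidden bytes from the visible ones.

```python
import random
from typing import Dict, List, Optional, Tuple

MASK_TOKEN = 256  # hypothetical token id, chosen outside the byte range 0..255


def mask_packet(
    packet: bytes, mask_ratio: float = 0.15, seed: Optional[int] = None
) -> Tuple[List[int], Dict[int, int]]:
    """Hide a random subset of a packet's bytes.

    Returns the token list with masked positions set to MASK_TOKEN, and a
    dict mapping each masked position to the original byte it replaced.
    """
    rng = random.Random(seed)
    tokens = list(packet)
    if not tokens:
        return [], {}
    # Mask at least one byte, and never more bytes than the packet holds.
    n_mask = min(len(tokens), max(1, round(len(tokens) * mask_ratio)))
    positions = rng.sample(range(len(tokens)), n_mask)
    targets = {i: tokens[i] for i in positions}
    for i in positions:
        tokens[i] = MASK_TOKEN
    return tokens, targets
```

In a self-supervised setup, `tokens` would be the model input and `targets` the labels for the reconstruction loss, so no manual annotation is needed.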

Journal article Peer reviewed

Ontological Support for the Evolution of Future Services Oriented Architectures

by Bilal Gonen, Xingang Fang, Eman El-Sheikh, Sikha Bagui, Norman Wilde and Alfred Zimmermann

Published 12/31/2014

Transactions on Machine Learning and Artificial Intelligence, 2, 6, 77 - 90

Services Oriented Architectures (SOA) have emerged as a useful framework for developing interoperable, large-scale systems, typically implemented using the Web Services (WS) standards. However, the maintenance and evolution of SOA systems present many challenges. SmartLife applications are intelligent user-centered systems and a special class of SOA systems that present even greater challenges for a software maintainer. Ontologies and ontological modeling can be used to support the evolution of SOA systems. This paper describes the development of a SOA evolution ontology and its use to develop an ontological model of a SOA system. The ontology is based on a standard SOA ontology. The ontological model can be used to provide semantic and visual support for software maintainers during routine maintenance tasks. We discuss a case study to illustrate this approach, as well as the strengths and limitations.

Journal article Open access

Maintaining SOA Systems of the Future - How Can Ontological Modeling Help?

by Bilal Gonen, Xingang Fang, Eman El-Sheikh, Sikha Bagui, Norman Wilde, Alfred Zimmermann and Ilia Petrov

Published 2014

Proceedings of the International Conference on Knowledge Engineering and Ontology Development - Volume 0IC3K, 376 - 381

International Conference on Knowledge Engineering and Ontology Development IC3K, 10/21/2014–10/24/2014, Rome, Italy

Many future Services Oriented Architecture (SOA) systems may be pervasive SmartLife applications that provide real-time support for users in everyday tasks and situations. Development of such applications will be challenging, but in this position paper we argue that their ongoing maintenance may be even more so. Ontological modelling of the application may help to ease this burden, but maintainers need to understand a system at many levels, from a broad architectural perspective down to the internals of deployed components. Thus we will need consistent models that span the range of views, from business processes through system architecture to maintainable code. We provide an initial example of such a modelling approach and illustrate its application in a semantic browser to aid in software maintenance tasks.

Journal article Open access

A Machine Learning Tool for Weighted Regressions in Time, Discharge, and Season

by Alexander Maestre, Eman El-Sheikh, Derek Williamson and Amelia Ward

Published 01/01/2014

International journal of advanced computer science & applications, 5, 3, 99 - 106

A new machine learning tool has been developed to classify water stations with similar water quality trends. The tool is based on the statistical method, Weighted Regressions in Time, Discharge, and Season (WRTDS), developed by the United States Geological Survey (USGS) to estimate daily concentrations of water constituents in rivers and streams based on continuous daily discharge data and discrete water quality samples collected at the same or nearby locations. WRTDS is based on parametric survival regressions using a jack-knife cross validation procedure that generates unbiased estimates of the prediction errors. One of the disadvantages of WRTDS is that it needs a large number of samples (n > 200) collected over at least two decades. In this article, the tool is used to evaluate the use of Boosted Regression Trees (BRT) as an alternative to the parametric survival regressions for water quality stations with a small number of samples. We describe the development of the machine learning tool as well as a comparative evaluation of the two methods, WRTDS and BRT. The purpose of the tool is to evaluate the reduction in variability of the estimates by clustering data from nearby stations with similar concentration and discharge characteristics. The results indicate that, using clustering, the predicted concentrations using BRT are in general higher than the observed concentrations. In addition, it appears that BRT generates a higher sum of squared residuals than the parametric survival regressions.

Journal article Open access Peer reviewed

Towards Enhanced Program Comprehension for Service Oriented Architecture (SOA) Systems

by Eman El-Sheikh, Thomas Reichherzer, Laura White, Norman Wilde, John Coffey, Sikha Bagui, George Goehring and Arthur Baskin

Published 09/01/2013

Journal of software engineering and applications, 6, 9, 435 - 445

Service Oriented Architecture (SOA) is an emerging paradigm for orchestrating software components to build new composite applications that enable businesses, government agencies and other organizations to collaborate across institutional boundaries. SOA offers new languages and a variety of software development tools that enable software engineers to configure software as services and to interconnect services with other services independent of differences in operating platform and programming and communicating languages. However, SOA composite applications introduce additional complexity into the construction, deployment and maintenance of software, thereby aggravating the issue of program comprehension, which is at the heart of software maintenance. This article describes the challenges in SOA program comprehension and reports on the results of a two-part case study aimed at identifying information that would help a SOA software maintainer. Analysis of the results indicates a need for higher-level abstractions and visualizations that can enhance conventional text-based search to support SOA program understanding. This paper then reports on several specific abstractions, visualization methods, and the development of an intelligent search tool to enhance comprehension of the relationships and data within a SOA composite application.

Journal article Open access Peer reviewed

A knowledge-based system approach for extracting abstractions from service oriented architecture artifacts

by George Goehring, Thomas Reichherzer, Eman El-Sheikh, Dallas Snider, Norman Wilde, Sikha Bagui, John Coffey and Laura J. White

Published 2013

International Journal of Advanced Research in Artificial Intelligence (IJARAI), 2, 45 - 52

Rule-based methods have traditionally been applied to develop knowledge-based systems that replicate expert performance on a deep but narrow problem domain. Knowledge engineers capture expert knowledge and encode it as a set of rules for automating the expert’s reasoning process to solve problems in a variety of domains. We describe the development of a knowledge-based system approach to enhance program comprehension of Service Oriented Architecture (SOA) software. Our approach uses rule-based methods to automate the analysis of the set of artifacts involved in building and deploying a SOA composite application. The rules codify expert knowledge to abstract information from these artifacts to facilitate program comprehension and thus assist Software Engineers as they perform system maintenance activities. A main advantage of the knowledge-based approach is its adaptability to the heterogeneous and dynamically evolving nature of SOA environments.

Journal article Peer reviewed

Development and use of AI and game applications in undergraduate computer science courses

by Eman M El-Sheikh and Lakshmi Prayaga

Published 12/2011

Journal of Computing Sciences in Colleges, 27, 2, 114 - 122

Gaming and Artificial Intelligence (AI) are both seen as exciting domains by many Computer Science students. Many universities are using these two areas as a means to attract and retain students in Computer Science through course work and research projects. In this paper we discuss the development of Artificial Intelligence and game applications by students in undergraduate game and AI programming courses, and how these applications can be integrated into Computer Science courses to improve student engagement and attainment of learning outcomes.

Journal article Open access Peer reviewed

CluSandra: A Framework and Algorithm for Data Stream Cluster Analysis

by Jose R. Fernandez and Eman M. El-Sheikh

Published 01/01/2011

International journal of advanced computer science & applications, 2, 11, 87 - 99

The clustering or partitioning of a dataset's records into groups of similar records is an important aspect of knowledge discovery from datasets. A considerable amount of research has been applied to the identification of clusters in very large multi-dimensional and static datasets. However, the traditional clustering and/or pattern recognition algorithms that have resulted from this research are inefficient for clustering data streams. A data stream is a dynamic dataset that is characterized by a sequence of data records that evolves over time, has extremely fast arrival rates and is unbounded. Today, the world abounds with processes that generate high-speed evolving data streams. Examples include click streams, credit card transactions and sensor networks. The data stream's inherent characteristics present an interesting set of time and space related challenges for clustering algorithms. In particular, processing time is severely constrained and clustering algorithms must be performed in a single pass over the incoming data. This paper presents both a clustering framework and algorithm that, combined, address these challenges and allow end-users to explore and gain knowledge from evolving data streams. Our approach includes the integration of open source products that are used to control the data stream and facilitate the harnessing of knowledge from the data stream. Experimental results of testing the framework with various data streams are also discussed.
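The single-pass constraint described in the abstract above can be illustrated with a generic sketch. This is not the CluSandra algorithm: the `MicroCluster` class, the `cluster_stream` function, and the fixed `radius` threshold are assumptions made for illustration. Each record is seen once and is either absorbed by the nearest cluster summary or starts a new one, so the raw stream never has to be stored.

```python
import math
from typing import Iterable, List, Sequence


class MicroCluster:
    """Compact summary of absorbed points: a count and a per-dimension sum."""

    def __init__(self, point: Sequence[float]) -> None:
        self.n = 1
        self.linear_sum = list(point)

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.linear_sum]

    def absorb(self, point: Sequence[float]) -> None:
        # Update the summary in place; the point itself is not retained.
        self.n += 1
        self.linear_sum = [s + x for s, x in zip(self.linear_sum, point)]


def cluster_stream(
    stream: Iterable[Sequence[float]], radius: float
) -> List[MicroCluster]:
    """Assign each incoming point to a cluster in one pass over the stream."""
    clusters: List[MicroCluster] = []
    for point in stream:
        nearest = min(
            clusters,
            key=lambda c: math.dist(c.centroid(), point),
            default=None,
        )
        if nearest is not None and math.dist(nearest.centroid(), point) <= radius:
            nearest.absorb(point)
        else:
            clusters.append(MicroCluster(point))
    return clusters
```

Because only counts and sums are kept, memory grows with the number of clusters rather than with the length of the stream. A production system would also need to age out or merge stale clusters, which this sketch omits.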

Journal article Peer reviewed

Visual Interactive Artificial Neural Network (VIANN) Tutor

by Lakshmi Prayaga, Chandra S Prayaga, Sandra Simmons and Eman M El-Sheikh

Published 01/2010

Bulletin of Applied Computing & Information Technology, 7, 1

A Visual Interactive Artificial Neural Network (VIANN) Tutor has been designed in Visual Basic .NET, which can be used as a teaching and self-learning tool for an introduction to neural networks. The tutor provides a simple graphical interface for the user to design a Hopfield type of network, and examine its evolution, starting from a prescribed initial state, and ending in a stable state or in a cycle of states mimicking a biological neural network in the brain. The user can change the network during runtime and observe changes in the evolution of the network. The interactive features allow the user to predict the evolution of the network and verify the prediction by running the model network. Initial evaluations of VIANN were conducted in an introductory Computer Science course (Excursions of Computing) open to all students and in an upper-level course in Artificial Intelligence. Responses of students to surveys administered after using the system showed that the tutorial was helpful both to beginner students and to those who were further along in the program. In addition, it was interesting both to computer science students and to students majoring in other disciplines. Feedback from the students indicated that such tutorials would also be very helpful in the other sciences.

Journal article Open access Peer reviewed

Discovering effective connectivity among brain regions from functional MRI data

by Carlos A Perez, Eman M El-Sheikh and Clark Glymour

Published 01/01/2010

International journal of computers in healthcare, 1, 1, 86 - 102

Functional magnetic resonance imaging (fMRI) data have been used for identifying brain regions that activate when a subject is presented with a stimulus or performs a task. Beyond identifying which regions of the brain are active during a task, it is also of interest to discover causal relationships among activity in those regions, that is, which regions of the brain influence which other regions of the brain during a task. Two algorithms for causal discovery were applied to fMRI data: the greedy equivalence search (GES) algorithm and the independent multiple-sample greedy equivalence search (iMAGES) algorithm. GES applies to individual datasets, and iMAGES to multiple datasets. We consider the stability of the GES results across subjects and experimental repetitions with the same subject. We find that some iMAGES connections agree with previous knowledge of the functional roles of the brain regions. The strengths and limitations of the research work and opportunities for future work are also discussed.



University of West Florida Social media