Comparison of machine-learning algorithms for classification of VPN network traffic flow using time-related features

Sikha Bagui; Xingang Fang; Ezhil Kalaimannan; Subhash C. Bagui; Joseph Sheehan

doi:10.1080/23742917.2017.1321891

Journal article

Peer reviewed

Sikha Bagui, Xingang Fang, Ezhil Kalaimannan, Subhash C. Bagui and Joseph Sheehan

Journal of cyber security, Vol.1(2), pp.108-126

04/02/2017

DOI: https://doi.org/10.1080/23742917.2017.1321891

Abstract

Network traffic classification and characterisation play an increasingly vital role in understanding and solving security-related issues in internet-based applications. Research in this area has focused on characterising network traffic at the various layers of communication protocols in the TCP/IP stack, and has expanded further to concentrate on specific application-layer protocols. Virtual Private Networks (VPNs) have become one of the most popular remote access communication methods among users of the public internet and other Internet Protocol (IP)-based networks. VPNs are governed by IP Security, a suite of protocols used for tunnelling already encrypted IP traffic to guarantee secure remote access to servers. In this paper, we propose and develop a framework to classify network traffic as VPN or non-VPN using time-related features. Our focus is on distinguishing encrypted traffic that is tunnelled through a VPN from traffic that is normally encrypted (non-VPN transmission), using machine-learning techniques on data sets of time-related features. Six classification models are compared: logistic regression, support vector machine, Naïve Bayes, k-nearest neighbour, and two ensemble methods, the Random Forest (RF) classifier and the Gradient Boosting Tree (GBT) classifier. Optimised RF and GBT models are recommended over the other models on the grounds of high accuracy and low overfitting. The features that contributed to achieving 90% accuracy in each category were also identified.
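To give a concrete sense of what "time-related features" of a flow might look like, the following is a minimal sketch in Python. It is not taken from the paper: the function name `time_features`, the choice of statistics (flow duration and inter-arrival time mean, minimum, maximum and standard deviation) and the sample timestamps are all assumptions made for illustration only.

```python
from statistics import mean, pstdev


def time_features(timestamps):
    """Summarise one flow from its packet arrival times, given in seconds.

    Hypothetical helper: the feature set below is an illustrative
    assumption, not the feature set used in the paper.
    """
    ts = sorted(timestamps)
    if len(ts) < 2:
        raise ValueError("need at least two packets to compute inter-arrival times")

    # Inter-arrival times: gaps between consecutive packets in the flow.
    iats = [later - earlier for earlier, later in zip(ts, ts[1:])]

    return {
        "duration": ts[-1] - ts[0],   # time from first to last packet
        "iat_mean": mean(iats),
        "iat_min": min(iats),
        "iat_max": max(iats),
        "iat_std": pstdev(iats),      # population standard deviation
    }


# Made-up arrival times for a single flow (seconds).
flow = [0.0, 0.5, 1.5, 3.0]
features = time_features(flow)
```

A per-flow dictionary of this kind could then serve as one row of input to any of the classifiers named in the abstract; how the authors actually derived and selected their features is described in the article itself.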

Details

Title: Comparison of machine-learning algorithms for classification of VPN network traffic flow using time-related features
Publication Details: Journal of cyber security, Vol.1(2), pp.108-126
Resource Type: Journal article
Publisher: Taylor & Francis
Identifiers: 99380468460706600
Academic Unit: Hal Marcus College of Science and Engineering; Computer Science; Mathematics and Statistics
Language: English
