Benchmarking full version of GureKDDCup, UNSW-NB15, and CIDDS-001 NIDS datasets using rolling-origin resampling

Wong, Kok-Seng; Chew, Yee Jian; Ooi, Shih Yin; Pang, Ying Han; Lee, Nicolas

Date

2021-10

Abstract

A network intrusion detection system (NIDS) analyses network traffic to flag malicious traffic or suspicious activity. Several NIDS datasets have been published recently; however, the lack of baseline experimental results on the full versions of these datasets has made it difficult for researchers to perform benchmarking. Because the creators have not pre-defined the train-test split of the datasets, it is also difficult for researchers to compare performance across machine learning classifiers without bias. Moreover, the literature has noted that the conventional cross-validation resampling scheme is inappropriate in the NIDS domain. Thus, rolling-origin resampling, a standard technique that is also a common cross-validation scheme in the forecasting domain, is employed to allocate the training and testing sets. In this paper, rigorous experiments are conducted on the full versions of three recent NIDS datasets: GureKDDCup, UNSW-NB15, and CIDDS-001. Although these may not be the latest available datasets, we selected them because they include the essential IP address fields, which are usually missing or removed because of privacy concerns. To deliver baseline empirical results, 10 well-known classifiers from Weka are used.
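
To illustrate the rolling-origin idea mentioned in the abstract, the following is a minimal Python sketch of an expanding-window variant. It is not the paper's implementation: the abstract does not state the number of folds, the window type, or the tooling used for the split, so the function name `rolling_origin_splits`, its parameters, and the equal-block layout are all assumptions made for illustration. The one property it is meant to show is that every test block comes strictly after its training data in time, which ordinary shuffled cross-validation does not guarantee.

```python
from typing import Iterator, List, Tuple


def rolling_origin_splits(
    n_samples: int, n_folds: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs for rolling-origin resampling.

    Hypothetical helper: assumes the records are already sorted by time.
    The data is cut into n_folds + 1 contiguous blocks. Fold k trains on
    everything before the origin and tests on the next block, so the
    training window expands as the origin rolls forward.
    """
    if n_folds < 1 or n_samples < n_folds + 1:
        raise ValueError("need n_folds >= 1 and at least n_folds + 1 samples")
    block = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        origin = k * block  # first index of the test block
        # The last fold absorbs any leftover records at the end.
        end = n_samples if k == n_folds else origin + block
        yield list(range(origin)), list(range(origin, end))


# Example: 10 time-ordered records, 4 folds.
for train_idx, test_idx in rolling_origin_splits(n_samples=10, n_folds=4):
    print("train:", train_idx, "test:", test_idx)
```

Each fold's indices could then be used to select rows from a time-sorted dataset before training and evaluating a classifier. A sliding-window variant, which drops the oldest block as the origin advances, is an equally plausible reading of "rolling-origin" and would differ only in the start of the training range.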

URI

https://vinspace.edu.vn/handle/VIN/73
