Benchmarking full version of GureKDDCup, UNSW-NB15, and CIDDS-001 NIDS datasets using rolling-origin resampling
Date
2021-10Author
Wong, Kok-Seng
Chew, Yee Jian
Ooi, Shih Yin
Pang, Ying Han
Lee, Nicolas
Metadata
Show full item recordAbstract
Network intrusion detection system (NIDS) is a system that analyses network traffic to flag malicious traffic or suspicious activities. Several recent NIDS datasets have been published, however, the lack of baseline experimental results on the full version of datasets had made it difficult for researchers to perform benchmarking. As the train-test distribution of the datasets has yet to be pre-defined by the creators, this further obstructs the researchers to compare the performance unbiasedly across each of the machine classifiers. Moreover, cross-validation resampling scheme has also been addressed in the literature to be inappropriate in the domain of NIDS. Thus, rolling-origin – a standard resampling technique which is also known as a common cross-validation scheme in the forecasting domain is employed to allocate the training and testing distributions. In this paper, rigorous experiments are conducted on the full version of the three recent NIDS datasets: GureKDDCup, UNSW-NB15, and CIDDS-001. While the datasets chosen might not be the latest available datasets, we have selected them as they include the essential IP addresses fields which are usually missing or removed due to some sort of privacy concerns. To deliver the baseline empirical results, 10 well-known classifiers from Weka are utilized.
Collections
- Kok-Seng Wong, PhD [16]