Clustering-based Network Anomaly Detection


SunHee Baek

Document Type


Degree Name

Master of Science (MS)


Computer Science

Date of Award

Spring 2017


Identifying anomalous events in networks is a vital function in enterprises, ISPs, and data centers for protecting internal resources. Given its importance, there is a substantial body of work on network anomaly detection using supervised and unsupervised machine learning techniques, each with its own strengths and weaknesses. In this work, I take advantage of both unsupervised and supervised learning methods. The underlying process model I present in this thesis includes (i) normalization, (ii) clustering the training dataset to create referential labels, (iii) building a supervised learning model with the automatically produced labels, and (iv) testing individual data points in question using the established learning model. By doing so, it is possible to construct a supervised learning model without being given the associated labels, which are often not available in practice. To achieve this, I define a new property that characterizes anomalies in the context of clustering, based on my observations of anomalous events in the network, from which the referential labels can be obtained. Through extensive experiments with a public dataset (NSL-KDD), I show that the presented method achieves anomaly detection accuracy comparable to that of the traditional method trained with the original labels provided in the dataset.
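The four-step process described in the abstract can be illustrated with a minimal sketch. This is not the implementation from the thesis: the abstract does not state the anomaly property, the clustering algorithm, or the classifier, so everything below is an assumption made for illustration. The sketch uses min-max normalization, a small k-means with deterministic farthest-point initialization, a hypothetical rule that labels any cluster holding less than 20% of the training points as anomalous, and a 1-nearest-neighbour classifier as the supervised model. The toy two-feature data and all function names are invented; the NSL-KDD dataset is not used.

```python
def fit_min_max(rows):
    """Step (i): return a scaler mapping each feature to [0, 1] using training ranges."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # avoid division by zero
    return lambda row: [(v - lo) / s for v, lo, s in zip(row, lows, spans)]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, candidates):
    """Index of the candidate closest to point."""
    return min(range(len(candidates)), key=lambda i: sq_dist(point, candidates[i]))


def k_means(points, k, iterations=10):
    """Step (ii), part 1: cluster the points and return one cluster index per point."""
    # Deterministic farthest-point initialization (an illustrative choice).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    for _ in range(iterations):
        assignment = [nearest(p, centroids) for p in points]
        for i in range(k):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return [nearest(p, centroids) for p in points]


def referential_labels(assignment, k, max_fraction=0.2):
    """Step (ii), part 2: turn cluster indices into labels.

    Hypothetical rule (NOT the property defined in the thesis): a cluster
    holding less than max_fraction of the points is labelled "anomaly".
    """
    sizes = [assignment.count(i) for i in range(k)]
    small = {i for i in range(k) if sizes[i] / len(assignment) < max_fraction}
    return ["anomaly" if a in small else "normal" for a in assignment]


def classify(point, train_points, train_labels):
    """Steps (iii) and (iv): 1-nearest-neighbour model built on the produced labels."""
    return train_labels[nearest(point, train_points)]


# Toy training data with two invented features: 40 "ordinary" records
# and 4 records far away from them. No labels are supplied.
raw_train = [[10 + i % 5, 20 + i // 5] for i in range(40)]
raw_train += [[90, 95], [92, 97], [91, 94], [93, 96]]

scale = fit_min_max(raw_train)
train = [scale(r) for r in raw_train]
labels = referential_labels(k_means(train, k=2), k=2)

# Test two unseen records against the model trained on referential labels.
predictions = [classify(scale(r), train, labels) for r in [[12, 23], [91, 96]]]
print(predictions)
```

In this sketch the test records are scaled with the ranges learned from the training data, so the training and testing steps see features on the same scale. Swapping in a different labelling rule or classifier only requires replacing `referential_labels` or `classify`.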


Jinoh Kim

Subject Categories

Computer Sciences | Physical Sciences and Mathematics