Anomaly Detection on Broadband Network Gateway

Abstract: Anomaly detection in Digital Subscriber Line (DSL) networks is vital for promptly detecting unusual network behavior caused by hackers or by faulty hardware or software. To automate this process, state-of-the-art methods generally apply machine learning techniques to data collected either from Customer Premises Equipment (CPE) or from devices in the access network. In contrast to the existing methods, this paper utilizes network traffic data collected from multiple static Broadband Network Gateways (BNGs), which are core network devices at the Network Service Provider (NSP). To automatically detect anomalies in BNG traffic data, a new framework is proposed which consists of three steps: data acquisition, feature extraction, and anomaly modeling using a random forest-based approach. The proposed method is compared with state-of-the-art methods, and the results show the effectiveness and robustness of our method.

Index Terms: Anomaly detection, predictive maintenance, machine learning, artificial intelligence, data analytics, communication networks, broadband network gateway, random forest, upsampling

Figure 1


DSL is one of the main access network technologies for internet service providers, and it is used to transfer digital signals over standard telephone lines and connections [1]. Various problems can occur on DSL infrastructure, such as data rate degradation, line failures, cyber attacks or power cuts, which put many critical applications at risk. Generally, these problems cause deviations from normal network behaviour that decrease quality of service. In a DSL network, internet traffic is transmitted through Broadband Network Gateways (BNGs), which are components of the core network. An example of a network topology with respect to Customer Premises Equipment (CPE), the access network and the core network is given in Figure 1. It is important to detect anomalies on BNG servers since these servers handle authentication and authorization before users access the internet. Anomalies on these servers can signal serious problems and affect the connectivity of many customers.

Anomalies in the DSL network can be classified into two categories: 1) network failures and performance problems, and 2) attacks on the network that indicate security-related problems [2]. In the literature, there are studies on both network failures [2]–[5] and security-related problems [6]–[10]. Thottan and Ji [2] propose a method based on an autoregressive (AR) model to detect network traffic anomalies. In that work, anomalies are detected using Management Information Base (MIB) variables that are specific to individual networks. Marnerides et al. [3] propose a scheme that uses Rényi entropy to detect anomalies in DSL networks. Here, DSL system logs gathered from all Digital Subscriber Line Access Multiplexers (DSLAMs) are used as the dataset. In the DSL system logs, three types of anomaly events are captured: 1) signal degradation, 2) power cut-off and 3) unknown reasons. Marnerides et al. [5] propose a framework based on a two-class Support Vector Machine (SVM) to detect signal degradation and power cut anomalies on DSLAMs, using DSL logs that contain SyncTrap messages. Ahmed et al. [6] focus on developing a framework to detect network anomalies caused by Denial-of-Service (DoS) attacks, using a variation of the k-means clustering algorithm. Wyld et al. [7] study the performance of different machine learning methods for detecting anomalies caused by Distributed Denial-of-Service (DDoS) attacks. To this end, fuzzy C-means, Naive Bayes (NB), SVM, k-Nearest Neighbors (kNN), decision tree and k-means clustering algorithms are used to detect network anomalies.

Generally, the state-of-the-art methods use user plane data to analyze network behaviour. The main disadvantages of using such data for network anomaly detection are high cost and low fault tolerance. More importantly, user plane data depends on user behaviour, so the system tends to generate more false predictions in normal situations. In contrast, core network data is more robust since it is less sensitive to changes in user behaviour. Therefore, with the help of aggregated data in the core network, operators are able to detect major problems that affect many customers. Consequently, in this paper BNG traffic data is used to study network behaviour.

Figure 2

This study aims to identify DSL network anomalies with a random forest based approach using network traffic data gathered from the BNGs. Currently, network operation experts manually monitor the traffic flow to detect problems on BNGs. However, manual inspection is slow and highly prone to error, as DSL networks contain many BNG devices with many ports. Therefore, in this paper, we propose a framework which automates the entire process of analyzing BNG traffic to detect anomalies. To achieve this, firstly, internet traffic data is collected from BNGs. Then, a new feature extraction framework is proposed to extract seven different features from the raw data in order to analyze network behaviour on an hourly basis. Finally, to model the network anomaly, a balanced random forest strategy is utilized. The experimental results show the robustness and effectiveness of the proposed method.

The paper is organized as follows: Section II describes the proposed method to model network behaviour. The experimental setups and results are given in Section III. The paper is concluded in Section IV.


In this study, a novel framework based on random forest is proposed to model DSL network anomalies at the core level. The framework consists of three steps: data collection, feature extraction, and modeling. In the first step, internet traffic flow is collected from static Broadband Network Gateways (BNGs). In the second step, a lookup metric table is constructed to represent the long and short term patterns of the traffic flow taken from each static BNG. Moreover, in this step, the raw data is converted to a more compact form which contains hourly statistical summaries of the traffic values of each device. Then, a new feature extractor is proposed and used to extract features from the outputs of the above-mentioned steps. Finally, the constructed dataset is manually labelled by an expert, and the network anomaly is modeled using a random forest based technique. The main steps of the proposed framework to model internet network behaviour at the core level are given in Figure 2.

  1. Data Collection

To detect anomalies on the DSL network, static BNG devices, located in the core network, are used to construct the raw internet traffic dataset. Static BNG devices carry out authentication and authorization of users before they access the internet. The general flow of accessing the internet is depicted in Figure 3. Inconsistencies in the traffic data may signal serious problems in the DSL network. Hence, in order to detect anomalies on the DSL network, four months of network traffic, from July 2019 to November 2019, are recorded. To construct this dataset, eleven static BNG devices (collector nodes) are considered. Note that these collector nodes are located in places where the network traffic is heavy.

The four months of collected raw data consist of approximately 2.1 million traffic flow records. Each record is composed of device name, slot number, port number, date, time, and total uploads and downloads. Note that the upload and download information includes four different types of values for each BNG device: bytes out and bytes in of the uplink, and bytes out and bytes in of the downlink. The raw traffic data is collected every five minutes and expressed in megabytes. In this paper, the sum of bytes in and bytes out of the uplink is used to model DSL network anomalies.

Figure 3

  2. Feature Extraction

To extract features from the raw data, a new feature extraction framework is proposed. The proposed framework consists of two steps: 1) extracting compact features and 2) generating a lookup metric table.

The raw dataset consists of internet traffic values collected every five minutes from different slots and ports of the BNG devices. However, at an NSP, network behaviour is generally analyzed on an hourly basis. Therefore, in this work, to achieve hourly analysis, all traffic values collected during a one-hour period are converted to a single representation using seven different feature extraction techniques. This is a vital task because normal and abnormal network behaviour have very complex patterns, so it is necessary to represent the distinctive properties of the input pattern and extract salient features of the data to make classification feasible. In this work, both long term and short term behaviours of the network traffic are considered. Hence, six statistical features are used for the long term behaviour of the network and one feature is used for the short term behaviour. The statistical features are the mean (µlong), maximum (Max), minimum (Min), standard deviation (σ), and minimum and maximum Z-scores (Zmin and Zmax) of the traffic values of a one-hour period (x) for each device over all slots and ports. To extract the six statistical features from the raw traffic values, the following equations are used:

where N = 12 is the number of traffic values in one hour on a specific date and M is the number of slots and ports in each device. To extract the feature based on the short term behaviour of the network, a metric based on the difference between the average traffic values of the current hour (t) and the previous hour (t − 1) is used.
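Written out under the definitions above, the seven compact features could take the following form. This is a reconstruction from the prose, not a copy of the original equations: the indexing x_{i,j} (slot/port i, five-minute sample j), the population normalisation 1/(NM), and the symbol \mu_{short} for the short term feature are assumptions.

```latex
\begin{align}
\mu_{long} &= \frac{1}{NM}\sum_{i=1}^{M}\sum_{j=1}^{N} x_{i,j} \\
Max &= \max_{i,j}\, x_{i,j} \\
Min &= \min_{i,j}\, x_{i,j} \\
\sigma &= \sqrt{\frac{1}{NM}\sum_{i=1}^{M}\sum_{j=1}^{N}\left(x_{i,j}-\mu_{long}\right)^{2}} \\
Z_{min} &= \frac{Min-\mu_{long}}{\sigma} \\
Z_{max} &= \frac{Max-\mu_{long}}{\sigma} \\
\mu_{short} &= \mu_{long}(t)-\mu_{long}(t-1)
\end{align}
```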

Consequently, compact features are used to represent the traffic behaviour of every BNG device for every hour of each date (see Table I). More specifically, the compact features decrease the size of the raw data from 2.1 million traffic records to 31773 records. It is also important to note that the compact dataset is highly imbalanced, as anomalies do not happen regularly. In this dataset, the minority class is approximately 1.1% of the entire compact dataset, which corresponds to 366 anomalies.
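The reduction from five-minute records to one row per device and hour can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function name `compact_features`, the dictionary keys, the use of the population standard deviation, and the fallback values for a constant hour or a missing previous hour are all assumptions.

```python
from statistics import fmean, pstdev


def compact_features(samples, prev_hour_mean=None):
    """Summarise one device-hour of traffic values (in megabytes).

    `samples` is a flat list of the five-minute values of all slots and
    ports of one device within one hour (N * M values in the paper's
    notation). `prev_hour_mean` is the mean of the previous hour, or None
    when no previous hour exists (assumption: the difference is then 0).
    """
    mu = fmean(samples)
    sigma = pstdev(samples, mu)  # population standard deviation (assumed)
    hi, lo = max(samples), min(samples)
    # Guard against a constant hour, where the Z-score is undefined.
    z_min = (lo - mu) / sigma if sigma > 0 else 0.0
    z_max = (hi - mu) / sigma if sigma > 0 else 0.0
    delta = mu - prev_hour_mean if prev_hour_mean is not None else 0.0
    return {
        "mean": mu,
        "max": hi,
        "min": lo,
        "std": sigma,
        "z_min": z_min,
        "z_max": z_max,
        "delta_mean": delta,  # short term feature: current minus previous hour
    }


if __name__ == "__main__":
    # Twelve hypothetical five-minute values for a single port.
    hour = [10.0, 12.0, 14.0, 16.0] * 3
    print(compact_features(hour, prev_hour_mean=11.0))
```

Applying such a function to every (device, date, hour) group would yield one seven-feature row per group, which matches the shape of the compact dataset described above.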

To generate the lookup metric table, the traffic patterns of all working and nonworking days of each device are represented with six long term features and one short term feature. In this step, three assumptions are made. Firstly, BNG devices behave independently and each has its own patterns. Secondly, each BNG device has different patterns at different hours of the day. Last but not least, it is important to separate working days and nonworking days since the network behaves differently in these periods. For instance, Figure 4 shows the average traffic values of one BNG device at different dates and times.

It is clear that the same device behaves differently at different hours of working and nonworking days. Based on the observed network behaviour, Sundays and national holidays are considered nonworking days in this work. To construct the lookup table, the same feature extraction techniques, given in Equations (1)–(7), are utilized. In these equations, x now denotes the traffic values of a given hour over all working days or all nonworking days for each device with all slots and ports, and N = 12 × w, where w is the total number of working days or nonworking days. An example of the lookup table representation is given in Table II. The lookup table consists of 528 entries.

Finally, to extract the features of the traffic data, both the compact features and the lookup features are used, with the lookup features serving as a baseline. To achieve this, the ratio of the difference between each compact feature and the corresponding pattern value in the lookup features is calculated.
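One form consistent with this description is given below. It is a reconstruction under stated assumptions rather than the original equation: c_k denotes the k-th compact feature of a device-hour, l_k the matching lookup value for the same device, hour and day type, and normalising by l_k (rather than, say, by c_k or by an absolute value) is an assumption.

```latex
\begin{equation}
f_k = \frac{c_k - l_k}{l_k}, \qquad k = 1, \dots, 7
\end{equation}
```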

In Equation (8), high values indicate large deviations from normal behaviour, and such samples are generally expected to be anomalies. To give some insight, the features of a small subset of our dataset are shown in Table III without label information.
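The lookup step and the ratio features can be sketched in a few lines of Python. The helper names (`day_type`, `lookup_keys`, `ratio_features`), the device labels, and the handling of a zero lookup value are hypothetical; the key layout (device, day type, hour) is inferred from the three assumptions stated above. With eleven devices, two day types and 24 hours, this layout gives 11 × 2 × 24 = 528 entries, which agrees with the lookup table size reported in the text.

```python
from datetime import date
from itertools import product


def day_type(d, holidays=frozenset()):
    """Sundays and listed national holidays count as nonworking days."""
    return "nonworking" if d.weekday() == 6 or d in holidays else "working"


def lookup_keys(devices, hours=range(24), day_types=("working", "nonworking")):
    """Enumerate one lookup entry per (device, day type, hour)."""
    return list(product(devices, day_types, hours))


def ratio_features(compact, lookup):
    """Relative difference between compact features and their lookup pattern.

    Both arguments map feature names to values. A lookup value of zero
    yields 0.0 here, which is an assumption made to avoid division by zero.
    """
    return {
        name: (compact[name] - lookup[name]) / lookup[name] if lookup[name] else 0.0
        for name in compact
    }


if __name__ == "__main__":
    devices = [f"bng{i:02d}" for i in range(1, 12)]  # hypothetical device names
    print(len(lookup_keys(devices)))
    print(day_type(date(2019, 7, 7)))  # 7 July 2019 was a Sunday
    print(ratio_features({"mean": 150.0}, {"mean": 100.0}))
```

A value of 0.5 for the mean feature, as in the last call, would indicate that the hour carried 50% more traffic than the usual pattern for that device, hour and day type.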

  3. Random Forest Classifier

To model the network anomaly based on the constructed features, a strategy based on random forest is used. Random forest is one of the simplest and most effective methods used in various applications to solve both regression and classification problems [11]. To solve classification problems, random forest builds different decision trees on samples of the training data, gathers the prediction of each tree, and predicts the class label by majority vote. More specifically, a random forest consists of many decision trees, each constructed over a random selection of training objects (observations) and a random subset of the features. This decorrelates the trees, since no single tree sees all the observations or all the features, which makes random forest less prone to over-fitting. Another advantage of random forest is its high efficiency on large datasets with many features. Despite these advantages, random forest has one major weakness: it performs poorly on extremely imbalanced training datasets, which is also a characteristic of our dataset. This is because the algorithm is designed to minimize the overall error rate, so it focuses on the prediction accuracy of the majority class, which results in poor prediction accuracy for the minority class. There are two feasible solutions to this issue: 1) weighted random forest [12] and 2) balanced random forest [13]. In this work, a balanced random forest strategy based on the SMOTE algorithm [14] is utilized. In the context of this paper, SMOTE is used to upsample the minority class with synthetic samples. To achieve this, for each anomaly sample, the method selects n neighbouring anomaly samples using the nearest neighbor rule and then generates artificial samples by interpolating between the sample and its neighbors.
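The interpolation step of SMOTE can be illustrated with a small standard-library sketch. It is a simplified version of the idea in [14], not the implementation used in the experiments: the function name `smote_like`, the default of five neighbours, the round-robin choice of the base sample and the fixed random seed are all assumptions.

```python
import math
import random


def smote_like(minority, n_new, k=5, seed=0):
    """Generate `n_new` synthetic minority samples by neighbour interpolation.

    `minority` is a list of feature vectors (lists of floats). For each
    base sample, one of its k nearest minority neighbours is picked at
    random and a new point is placed on the line segment between the two.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for i in range(n_new):
        base = minority[i % len(minority)]  # cycle through the anomalies
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


if __name__ == "__main__":
    anomalies = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for point in smote_like(anomalies, n_new=3, k=2):
        print(point)
```

Because every synthetic point lies between two existing anomalies, the upsampled class stays inside the region already occupied by the minority samples, in contrast to simple duplication, which only repeats existing points.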


In this study, outlier detection and supervised learning methods are used as baselines to evaluate the performance of the proposed method. Specifically, k-Nearest Neighbor (kNN) outlier detection and Isolation Forest are used as outlier detection methods, whereas a kNN classifier is used as a supervised learning method. All the compared methods are based on the proposed feature extraction framework.

The constructed BNG dataset consists of 31773 samples, of which 31407 belong to normal network behaviour and 366 are anomalies. The dataset is represented with 7 different features, where 80% of the dataset is used for training and 20% for testing. In other words, this results in 25421 samples for training and 6352 samples for testing. Note that there are 293 anomalies in the training dataset and 73 anomalies in the test dataset. To balance this extremely imbalanced dataset for the supervised approaches, the SMOTE algorithm is used to upsample the minority class in the training dataset, which extends the minority class from 293 samples to 2000 samples. Moreover, for comparison with the SMOTE algorithm, the majority class in the training dataset is also downsampled by a factor of two. The proposed method and the kNN classifier are therefore trained on both the upsampled and the downsampled BNG dataset to examine the performance of the models. To generate ground truth for both training and quantitative evaluation, the internet traffic is manually labelled by an expert core network engineer.

We evaluate the proposed method against kNN outlier detection, Isolation Forest and the kNN classifier. In outlier detection methods, one of the important parameters is the contamination rate, which is defined as the proportion of outliers in the dataset. In the experiments, it is set to 0.011, which is the rate of anomalies in the BNG dataset. Isolation Forest also depends on other parameters, namely maximum features, number of estimators and maximum samples, which are empirically set to 3, 1000 and 2048, respectively. In the proposed method, the number of estimators and the maximum depth are set to 1000 and 6, respectively. For kNN outlier detection and the kNN classifier, the parameter k is set to 10 and 30, respectively. Note that both the proposed method and the kNN classifier are trained using the upsampled and the downsampled training dataset.

We calculate three quantitative measures to validate the results. For the quantitative tests, the predicted result is compared with the ground truth. The first measure is precision, defined as the ratio of the number of correctly classified positive (anomaly) examples to the total number of predicted positive examples. Precision is calculated as P = TP / (TP + FP), where TP is the number of true positives and FP is the number of false positives. A true positive is an actual anomaly sample that is predicted as an anomaly. A false positive is a sample that is predicted as an anomaly but is actually a normal sample. The second measure is recall, defined as the ratio of true positives to the total number of anomalies. Recall is calculated as R = TP / (TP + FN), where FN is the number of false negatives, i.e., samples predicted as normal behaviour that are actually anomalies. The third measure is the F-score, which is formulated as F-score = 2 × P × R / (P + R).

In order to generalize the success rate of the models, five-fold cross validation is used. In each fold, the upsampling and downsampling techniques are applied only to the data in the training folds. The results of the different models are shown in Table IV. It shows that the proposed framework provides the highest performance, with 64% precision and 83.3% recall, which yield an F-score of approximately 72.4%. Moreover, this table demonstrates that the SMOTE upsampling strategy increases the F-score of the proposed method and the kNN classifier by approximately 6% and 13%, respectively, compared with the downsampling strategy. The outlier detection methods provide the lowest anomaly detection rates. Overall, the best performance belongs to the proposed strategy, whereas the worst result is provided by kNN outlier detection. The confusion matrix of one fold of the five-fold cross validation is presented in Table V. For this fold, TP, FN and FP are 64, 9 and 37, respectively.
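As a quick check of how the three measures follow from a confusion matrix, the sketch below applies the definitions of precision, recall and F-score to the single-fold counts quoted above (TP = 64, FN = 9, FP = 37). The function name `precision_recall_f` and the convention of returning 0.0 for an empty denominator are assumptions made for illustration.

```python
def precision_recall_f(tp, fp, fn):
    """Return (precision, recall, F-score) from confusion matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


if __name__ == "__main__":
    p, r, f = precision_recall_f(tp=64, fp=37, fn=9)
    print(f"precision={p:.3f} recall={r:.3f} f_score={f:.3f}")
```

For this fold the sketch gives roughly 63.4% precision, 87.7% recall and 73.6% F-score, which is in the same range as the cross-validated averages reported in Table IV.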


This paper introduces a new framework to automatically detect DSL network anomalies. In the literature, studies related to network performance generally use user plane data. In this work, however, traffic data is collected from multiple BNG devices, which are part of the core network. Firstly, the BNG traffic data is gathered from the collector devices. Next, the collected raw data passes through feature extraction methods to construct compact data that represents a statistical summary of the traffic values for every hour, as well as to construct a lookup metric table. Using both the lookup metric table and the compact table, seven different features are extracted. The imbalanced dataset, which contains 31773 samples with 366 anomalies, is upsampled using the SMOTE algorithm. Finally, the data is modelled using a balanced random forest classifier. The experimental results show that the proposed framework yields 64% precision, 83.3% recall and a 72.4% F-score.


We would like to thank Türk Telekom Research Center for providing data and sharing expert knowledge.


[1] A. Marnerides, S. Malinowski, R. Morla, and H. Kim, “Fault diagnosis in DSL networks using support vector machines,” Computer Communications, vol. 62, pp. 72–84, 2015.

[2] M. Thottan and C. Ji, “Anomaly detection in IP networks,” IEEE Transactions on Signal Processing, vol. 51, pp. 2191–2204, 2003.

[3] A. K. Marnerides, S. Malinowski, R. Morla, M. R. D. Rodrigues, and H. S. Kim, “Towards the improvement of diagnostic metrics fault diagnosis for DSL-based IPTV networks using the Rényi entropy,” in 2012 IEEE Global Communications Conference (GLOBECOM), 2012, pp. 2779–2784.

[4] R. A. Maxion and F. E. Feather, “A case study of Ethernet anomalies in a distributed computing environment,” IEEE Transactions on Reliability, vol. 39, no. 4, pp. 433–443, 1990.

[5] A. K. Marnerides, S. Malinowski, R. Morla, M. R. D. Rodrigues, and H. S. Kim, “On the comprehension of DSL SyncTrap events in IPTV networks,” in 2013 IEEE Symposium on Computers and Communications (ISCC), 2013, pp. 670–675.

[6] M. Ahmed and A. N. Mahmood, “Network traffic analysis based on collective anomaly detection,” in 9th IEEE Conference on Industrial Electronics and Applications, 2014, pp. 1141–1146.

[7] D. C. Wyld, M. Wozniak, N. Chaki, N. Meghanathan, and D. Nagamalai, Advances in Network Security and Applications: 4th International Conference Proceedings, CNSA 2011, 1st ed. Springer Publishing Company, Incorporated, 2011.

[8] T. Shon, Y. Kim, C. Lee, and J. Moon, “A machine learning framework for network anomaly detection using SVM and GA,” in Sixth Annual IEEE SMC Information Assurance Workshop, 2005, pp. 176–183.

[9] K. Limthong and T. Tawsook, “Network traffic anomaly detection using machine learning approaches,” in 2012 IEEE Network Operations and Management Symposium, 2012, pp. 542–545.

[10] S. Chebrolu, A. Abraham, and J. P. Thomas, “Feature deduction and ensemble design of intrusion detection systems,” Computers & Security, vol. 24, no. 4, pp. 295–307, 2005.

[11] G. Prashanth, V. Prashanth, J. Padmanabhan, and N. Srinivasan, “Using random forests for network-based anomaly detection at active routers,” in International Conference on Signal Processing, Communications and Networking, ICSCN ’08, 2008, pp. 93–96.

[12] N. Zerrouki, F. Harrou, Y. Sun, and L. Hocini, “A machine learning-based approach for land cover change detection using remote sensing and radiometric measurements,” IEEE Sensors Journal, vol. 19, no. 14, pp. 5843–5850, 2019.

[13] D. Feng, Z. Deng, T. Wang, Y. Liu, and L. Xu, “Identification of disturbance sources based on random forest model,” in 2018 International Conference on Power System Technology (POWERCON), 2018, pp. 3370–3375.

[14] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: Synthetic minority over-sampling technique,” J. Artif. Int. Res., vol. 16, no. 1, pp. 321–357, 2002.