Anomaly Detection in a Data Center with a Reconstruction Method Using a Multi-Autoencoders Model

Early detection of anomalies in data centers is important to reduce downtimes and the costs of periodic maintenance. However, there is little research on this topic and even fewer on the fusion of sensor data for the detection of abnormal events. The goal of this paper is to propose a method for anomaly detection in data centers by combining sensor data (temperature, humidity, power) and deep learning models. The model described in the paper uses one autoencoder per sensor to reconstruct the inputs. The auto-encoders contain Long-Short Term Memory (LSTM) layers and are trained using the normal samples of the relevant sensors selected by correlation analysis. The difference signal between the input and its reconstruction is then used to classify the samples using feature extraction and a random forest classifier. The data measured by the sensors of a data center between January 2019 and May 2020 are used to train the model, while the data between June 2020 and May 2021 are used to assess it. Performances of the model are assessed a posteriori through F1-score by comparing detected anomalies with the data center’s history. The proposed model outperforms the state-of-the-art reconstruction method, which uses only one autoencoder taking multivariate sequences and detects an anomaly with a threshold on the reconstruction error, with an F1-score of 83.60% compared to 24.16%.

Dimensionality Reduction in Modal Analysis for Structural Health Monitoring

Autonomous structural health monitoring (SHM) of many structures and bridges became a topic of paramount importance for maintenance purposes and safety reasons. This paper proposes a set of machine learning (ML) tools to perform automatic feature selection and detection of anomalies in a bridge from vibrational data and compare different feature extraction schemes to increase the accuracy and reduce the amount of data collected. As a case study, the Z-24 bridge is considered because of the extensive database of accelerometric data in both standard and damaged conditions. The proposed framework starts from the first four fundamental frequencies extracted through operational modal analysis (OMA) and clustering, followed by time-domain filtering (tracking). The fundamental frequencies extracted are then fed to a dimensionality reduction block implemented through two different approaches: feature selection (intelligent multiplexer) that tries to estimate the most reliable frequencies based on the evaluation of some statistical features (i.e., entropy, variance, kurtosis), and feature extraction (auto-associative neural network (ANN)) that combine the fundamental frequencies to extract new damage sensitive features in a low dimensional feature space. Finally, one-class classification (OCC) algorithms perform anomaly detection, trained with standard condition points, and tested with normal and anomaly ones. In particular, principal component analysis (PCA), kernel principal component analysis (KPCA), and autoassociative neural network (ANN) are presented and their performance are compared. It is also shown that, by evaluating the correct features, the anomaly can be detected with accuracy and an F1 score greater than 95%.

Fast and Robust Long-term Tracking with Effective Searching Model

Kernelized Correlation Filter (KCF) based trackers have gained a lot of attention recently because of their accuracy and fast calculation speed. However, this algorithm is not robust in cases where the object is lost by a sudden change of direction, being obscured or going out of view. In order to improve KCF performance in long-term tracking, this paper proposes an anomaly detection method for target loss warning by analyzing the response map of each frame, and a classification algorithm for reliable target re-locating mechanism by using Random fern. Being tested with Visual Tracker Benchmark and Visual Object Tracking datasets, the experimental results indicated that the precision and success rate of the proposed algorithm were 2.92 and 2.61 times higher than that of the original KCF algorithm, respectively. Moreover, the proposed tracker handles occlusion better than many state-of-the-art long-term tracking methods while running at 60 frames per second.

Air Handling Units Power Consumption Using Generalized Additive Model for Anomaly Detection: A Case Study in a Singapore Campus

The emergence of digital twin technology, a digital replica of physical world, has improved the real-time access to data from sensors about the performance of buildings. This digital transformation has opened up many opportunities to improve the management of the building by using the data collected to help monitor consumption patterns and energy leakages. One example is the integration of predictive models for anomaly detection. In this paper, we use the GAM (Generalised Additive Model) for the anomaly detection of Air Handling Units (AHU) power consumption pattern. There is ample research work on the use of GAM for the prediction of power consumption at the office building and nation-wide level. However, there is limited illustration of its anomaly detection capabilities, prescriptive analytics case study, and its integration with the latest development of digital twin technology. In this paper, we applied the general GAM modelling framework on the historical data of the AHU power consumption and cooling load of the building between Jan 2018 to Aug 2019 from an education campus in Singapore to train prediction models that, in turn, yield predicted values and ranges. The historical data are seamlessly extracted from the digital twin for modelling purposes. We enhanced the utility of the GAM model by using it to power a real-time anomaly detection system based on the forward predicted ranges. The magnitude of deviation from the upper and lower bounds of the uncertainty intervals is used to inform and identify anomalous data points, all based on historical data, without explicit intervention from domain experts. Notwithstanding, the domain expert fits in through an optional feedback loop through which iterative data cleansing is performed. After an anomalously high or low level of power consumption detected, a set of rule-based conditions are evaluated in real-time to help determine the next course of action for the facilities manager. The performance of GAM is then compared with other approaches to evaluate its effectiveness. Lastly, we discuss the successfully deployment of this approach for the detection of anomalous power consumption pattern and illustrated with real-world use cases.

Design of an Ensemble Learning Behavior Anomaly Detection Framework

Data assets protection is a crucial issue in the cybersecurity field. Companies use logical access control tools to vault their information assets and protect them against external threats, but they lack solutions to counter insider threats. Nowadays, insider threats are the most significant concern of security analysts. They are mainly individuals with legitimate access to companies information systems, which use their rights with malicious intents. In several fields, behavior anomaly detection is the method used by cyber specialists to counter the threats of user malicious activities effectively. In this paper, we present the step toward the construction of a user and entity behavior analysis framework by proposing a behavior anomaly detection model. This model combines machine learning classification techniques and graph-based methods, relying on linear algebra and parallel computing techniques. We show the utility of an ensemble learning approach in this context. We present some detection methods tests results on an representative access control dataset. The use of some explored classifiers gives results up to 99% of accuracy.

Use of Hierarchical Temporal Memory Algorithm in Heart Attack Detection

In order to reduce the number of deaths due to heart problems, we propose the use of Hierarchical Temporal Memory Algorithm (HTM) which is a real time anomaly detection algorithm. HTM is a cortical learning algorithm based on neocortex used for anomaly detection. In other words, it is based on a conceptual theory of how the human brain can work. It is powerful in predicting unusual patterns, anomaly detection and classification. In this paper, HTM have been implemented and tested on ECG datasets in order to detect cardiac anomalies. Experiments showed good performance in terms of specificity, sensitivity and execution time.

Hybrid Anomaly Detection Using Decision Tree and Support Vector Machine

Intrusion detection systems (IDS) are the main components of network security. These systems analyze the network events for intrusion detection. The design of an IDS is through the training of normal traffic data or attack. The methods of machine learning are the best ways to design IDSs. In the method presented in this article, the pruning algorithm of C5.0 decision tree is being used to reduce the features of traffic data used and training IDS by the least square vector algorithm (LS-SVM). Then, the remaining features are arranged according to the predictor importance criterion. The least important features are eliminated in the order. The remaining features of this stage, which have created the highest level of accuracy in LS-SVM, are selected as the final features. The features obtained, compared to other similar articles which have examined the selected features in the least squared support vector machine model, are better in the accuracy, true positive rate, and false positive. The results are tested by the UNSW-NB15 dataset.

Application of Building Information Modeling in Energy Management of Individual Departments Occupying University Facilities

To assist individual departments within universities in their energy management tasks, this study explores the application of Building Information Modeling in establishing the ‘BIM based Energy Management Support System’ (BIM-EMSS). The BIM-EMSS consists of six components: (1) sensors installed for each occupant and each equipment, (2) electricity sub-meters (constantly logging lighting, HVAC, and socket electricity consumptions of each room), (3) BIM models of all rooms within individual departments’ facilities, (4) data warehouse (for storing occupancy status and logged electricity consumption data), (5) building energy management system that provides energy managers with various energy management functions, and (6) energy simulation tool (such as eQuest) that generates real time 'standard energy consumptions' data against which 'actual energy consumptions' data are compared and energy efficiency evaluated. Through the building energy management system, the energy manager is able to (a) have 3D visualization (BIM model) of each room, in which the occupancy and equipment status detected by the sensors and the electricity consumptions data logged are displayed constantly; (b) perform real time energy consumption analysis to compare the actual and standard energy consumption profiles of a space; (c) obtain energy consumption anomaly detection warnings on certain rooms so that energy management corrective actions can be further taken (data mining technique is employed to analyze the relation between space occupancy pattern with current space equipment setting to indicate an anomaly, such as when appliances turn on without occupancy); and (d) perform historical energy consumption analysis to review monthly and annually energy consumption profiles and compare them against historical energy profiles. The BIM-EMSS was further implemented in a research lab in the Department of Architecture of NTUST in Taiwan and implementation results presented to illustrate how it can be used to assist individual departments within universities in their energy management tasks.

Anomaly Detection with ANN and SVM for Telemedicine Networks

In recent years, a wide variety of applications are developed with Support Vector Machines -SVM- methods and Artificial Neural Networks -ANN-. In general, these methods depend on intrusion knowledge databases such as KDD99, ISCX, and CAIDA among others. New classes of detectors are generated by machine learning techniques, trained and tested over network databases. Thereafter, detectors are employed to detect anomalies in network communication scenarios according to user’s connections behavior. The first detector based on training dataset is deployed in different real-world networks with mobile and non-mobile devices to analyze the performance and accuracy over static detection. The vulnerabilities are based on previous work in telemedicine apps that were developed on the research group. This paper presents the differences on detections results between some network scenarios by applying traditional detectors deployed with artificial neural networks and support vector machines.

Space Telemetry Anomaly Detection Based on Statistical PCA Algorithm

The critical concern of satellite operations is to ensure the health and safety of satellites. The worst case in this perspective is probably the loss of a mission, but the more common interruption of satellite functionality can result in compromised mission objectives. All the data acquiring from the spacecraft are known as Telemetry (TM), which contains the wealth information related to the health of all its subsystems. Each single item of information is contained in a telemetry parameter, which represents a time-variant property (i.e. a status or a measurement) to be checked. As a consequence, there is a continuous improvement of TM monitoring systems to reduce the time required to respond to changes in a satellite's state of health. A fast conception of the current state of the satellite is thus very important to respond to occurring failures. Statistical multivariate latent techniques are one of the vital learning tools that are used to tackle the problem above coherently. Information extraction from such rich data sources using advanced statistical methodologies is a challenging task due to the massive volume of data. To solve this problem, in this paper, we present a proposed unsupervised learning algorithm based on Principle Component Analysis (PCA) technique. The algorithm is particularly applied on an actual remote sensing spacecraft. Data from the Attitude Determination and Control System (ADCS) was acquired under two operation conditions: normal and faulty states. The models were built and tested under these conditions, and the results show that the algorithm could successfully differentiate between these operations conditions. Furthermore, the algorithm provides competent information in prediction as well as adding more insight and physical interpretation to the ADCS operation.

Outdoor Anomaly Detection with a Spectroscopic Line Detector

One of the tasks of optical surveillance is to detect anomalies in large amounts of image data. However, if the size of the anomaly is very small, limited information is available to distinguish it from the surrounding environment. Spectral detection provides a useful source of additional information and may help to detect anomalies with a size of a few pixels or less. Unfortunately, spectral cameras are expensive because of the difficulty of separating two spatial in addition to one spectral dimension. We investigate the possibility of modifying a simple spectral line detector for outdoor detection. This may be especially useful if the area of interest forms a line, such as the horizon. We use a monochrome CCD that also enables detection into the near infrared. A simple camera is attached to the setup to determine which part of the environment is spectrally imaged. Our preliminary results indicate that sensitive detection of very small targets is indeed possible. Spectra could be taken from the various targets by averaging columns in the line image. By imaging a set of lines of various widths we found narrow lines that could not be seen in the color image but remained visible in the spectral line image. A simultaneous analysis of the entire spectra can produce better results than visual inspection of the line spectral image. We are presently developing calibration targets for spatial and spectral focusing and alignment with the spatial camera. This will present improved results and more use in outdoor application.

Evaluating Performance of an Anomaly Detection Module with Artificial Neural Network Implementation

Anomaly detection techniques have been focused on two main components: data extraction and selection and the second one is the analysis performed over the obtained data. The goal of this paper is to analyze the influence that each of these components has over the system performance by evaluating detection over network scenarios with different setups. The independent variables are as follows: the number of system inputs, the way the inputs are codified and the complexity of the analysis techniques. For the analysis, some approaches of artificial neural networks are implemented with different number of layers. The obtained results show the influence that each of these variables has in the system performance.

Network Anomaly Detection using Soft Computing

One main drawback of intrusion detection system is the inability of detecting new attacks which do not have known signatures. In this paper we discuss an intrusion detection method that proposes independent component analysis (ICA) based feature selection heuristics and using rough fuzzy for clustering data. ICA is to separate these independent components (ICs) from the monitored variables. Rough set has to decrease the amount of data and get rid of redundancy and Fuzzy methods allow objects to belong to several clusters simultaneously, with different degrees of membership. Our approach allows us to recognize not only known attacks but also to detect activity that may be the result of a new, unknown attack. The experimental results on Knowledge Discovery and Data Mining- (KDDCup 1999) dataset.

Genetic-based Anomaly Detection in Logs of Process Aware Systems

Nowaday-s, many organizations use systems that support business process as a whole or partially. However, in some application domains, like software development and health care processes, a normative Process Aware System (PAS) is not suitable, because a flexible support is needed to respond rapidly to new process models. On the other hand, a flexible Process Aware System may be vulnerable to undesirable and fraudulent executions, which imposes a tradeoff between flexibility and security. In order to make this tradeoff available, a genetic-based anomaly detection model for logs of Process Aware Systems is presented in this paper. The detection of an anomalous trace is based on discovering an appropriate process model by using genetic process mining and detecting traces that do not fit the appropriate model as anomalous trace; therefore, when used in PAS, this model is an automated solution that can support coexistence of flexibility and security.

A Rule-based Approach for Anomaly Detection in Subscriber Usage Pattern

In this report we present a rule-based approach to detect anomalous telephone calls. The method described here uses subscriber usage CDR (call detail record) data sampled over two observation periods: study period and test period. The study period contains call records of customers- non-anomalous behaviour. Customers are first grouped according to their similar usage behaviour (like, average number of local calls per week, etc). For customers in each group, we develop a probabilistic model to describe their usage. Next, we use maximum likelihood estimation (MLE) to estimate the parameters of the calling behaviour. Then we determine thresholds by calculating acceptable change within a group. MLE is used on the data in the test period to estimate the parameters of the calling behaviour. These parameters are compared against thresholds. Any deviation beyond the threshold is used to raise an alarm. This method has the advantage of identifying local anomalies as compared to techniques which identify global anomalies. The method is tested for 90 days of study data and 10 days of test data of telecom customers. For medium to large deviations in the data in test window, the method is able to identify 90% of anomalous usage with less than 1% false alarm rate.

Anomaly Detection and Characterization to Classify Traffic Anomalies Case Study: TOT Public Company Limited Network

This paper represents four unsupervised clustering algorithms namely sIB, RandomFlatClustering, FarthestFirst, and FilteredClusterer that previously works have not been used for network traffic classification. The methodology, the result, the products of the cluster and evaluation of these algorithms with efficiency of each algorithm from accuracy are shown. Otherwise, the efficiency of these algorithms considering form the time that it use to generate the cluster quickly and correctly. Our work study and test the best algorithm by using classify traffic anomaly in network traffic with different attribute that have not been used before. We analyses the algorithm that have the best efficiency or the best learning and compare it to the previously used (K-Means). Our research will be use to develop anomaly detection system to more efficiency and more require in the future.

Anomaly Detection using Neuro Fuzzy system

As the network based technologies become omnipresent, demands to secure networks/systems against threat increase. One of the effective ways to achieve higher security is through the use of intrusion detection systems (IDS), which are a software tool to detect anomalous in the computer or network. In this paper, an IDS has been developed using an improved machine learning based algorithm, Locally Linear Neuro Fuzzy Model (LLNF) for classification whereas this model is originally used for system identification. A key technical challenge in IDS and LLNF learning is the curse of high dimensionality. Therefore a feature selection phase is proposed which is applicable to any IDS. While investigating the use of three feature selection algorithms, in this model, it is shown that adding feature selection phase reduces computational complexity of our model. Feature selection algorithms require the use of a feature goodness measure. The use of both a linear and a non-linear measure - linear correlation coefficient and mutual information- is investigated respectively

Fuzzy Hyperbolization Image Enhancement and Artificial Neural Network for Anomaly Detection

A prototype of an anomaly detection system was developed to automate process of recognizing an anomaly of roentgen image by utilizing fuzzy histogram hyperbolization image enhancement and back propagation artificial neural network. The system consists of image acquisition, pre-processor, feature extractor, response selector and output. Fuzzy Histogram Hyperbolization is chosen to improve the quality of the roentgen image. The fuzzy histogram hyperbolization steps consist of fuzzyfication, modification of values of membership functions and defuzzyfication. Image features are extracted after the the quality of the image is improved. The extracted image features are input to the artificial neural network for detecting anomaly. The number of nodes in the proposed ANN layers was made small. Experimental results indicate that the fuzzy histogram hyperbolization method can be used to improve the quality of the image. The system is capable to detect the anomaly in the roentgen image.

Health Assessment of Electronic Products using Mahalanobis Distance and Projection Pursuit Analysis

With increasing complexity in electronic systems there is a need for system level anomaly detection and fault isolation. Anomaly detection based on vector similarity to a training set is used in this paper through two approaches, one the preserves the original information, Mahalanobis Distance (MD), and the other that compresses the data into its principal components, Projection Pursuit Analysis. These methods have been used to detect deviations in system performance from normal operation and for critical parameter isolation in multivariate environments. The study evaluates the detection capability of each approach on a set of test data with known faults against a baseline set of data representative of such “healthy" systems.

Autonomously Determining the Parameters for SVDD with RBF Kernel from a One-Class Training Set

The one-class support vector machine “support vector data description” (SVDD) is an ideal approach for anomaly or outlier detection. However, for the applicability of SVDD in real-world applications, the ease of use is crucial. The results of SVDD are massively determined by the choice of the regularisation parameter C and the kernel parameter  of the widely used RBF kernel. While for two-class SVMs the parameters can be tuned using cross-validation based on the confusion matrix, for a one-class SVM this is not possible, because only true positives and false negatives can occur during training. This paper proposes an approach to find the optimal set of parameters for SVDD solely based on a training set from one class and without any user parameterisation. Results on artificial and real data sets are presented, underpinning the usefulness of the approach.