
Techniques for combining multiple anomaly detection techniques

Discuss techniques for combining multiple anomaly detection techniques to improve the identification of anomalous objects. Consider both supervised and unsupervised cases.

Machine learning has four common classes of applications: classification, predicting the next value, anomaly detection, and discovering structure. Among them, anomaly detection finds data points that do not fit well with the rest of the data. It has a wide range of applications such as fraud detection, surveillance, diagnosis, data cleanup, and predictive maintenance.

Although it has been studied in detail in academia, applications of anomaly detection have been limited to niche domains like banking, finance, auditing, and medical diagnosis. However, with the advent of IoT, anomaly detection is likely to play a key role in IoT use cases such as monitoring and predictive maintenance.

This post explores what anomaly detection is, describes different anomaly detection techniques and the key ideas behind them, and wraps up with a discussion on how to make use of the results.

When labeled training data is available, anomaly detection can be treated as a classification problem. However, it is often very hard to find training data, and even when it is available, anomalies are typically 1:1000 to 1:10^6 events, so the classes are heavily imbalanced. Moreover, most data, such as data from IoT use cases, is autocorrelated.

Another aspect is that false positives are a major concern, as we will discuss under acting on decisions. Hence, the trade-off between precision (given that the model predicted an anomaly, how likely it is to be a true anomaly) and recall (what fraction of the anomalies the model catches) is different from normal classification use cases. We will discuss this in detail later.
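As a rough illustration of these two measures, the following sketch computes precision and recall with scikit-learn; the labels are made up purely for illustration (1 = anomaly, 0 = normal).

```python
from sklearn.metrics import precision_score, recall_score

y_true = [0, 0, 0, 1, 0, 1, 0, 0, 1, 0]   # ground truth (made-up labels)
y_pred = [0, 1, 0, 1, 0, 0, 0, 0, 1, 0]   # model output (made-up labels)

# Precision: of the points flagged as anomalies, how many really are anomalies.
print("precision:", precision_score(y_true, y_pred))   # 2/3
# Recall: of the true anomalies, how many the model caught.
print("recall:", recall_score(y_true, y_pred))          # 2/3
```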

Anomalies or outliers come in three types.

Point Anomalies: an individual data instance is anomalous with respect to the rest of the data (e.g. a purchase with a very large transaction value).

Contextual Anomalies: a data instance is anomalous in a specific context, but not otherwise (e.g. a value that is anomalous only at a certain time or in a certain region, such as a large spike in the middle of the night).

Collective Anomalies: a collection of related data instances is anomalous with respect to the entire dataset, even though the individual values are not. They have two variations: events in an unexpected order (ordered, e.g. a broken rhythm in an ECG) and unexpected value combinations (unordered, e.g. buying a large number of expensive items).

In the next section, we will discuss in detail how to handle point and collective anomalies. Contextual anomalies are handled by focusing on segments of the data (e.g. a spatial area, graph, sequence, or customer segment) and applying point or collective anomaly techniques within each segment independently.
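A minimal sketch of this segment-then-detect idea, assuming hypothetical hourly transaction data and a simple per-segment percentile check (the column names, segment choice, and threshold are illustrative assumptions, not a fixed recipe):

```python
import numpy as np
import pandas as pd

# Hypothetical transaction data: one value per hour over ~6 weeks.
df = pd.DataFrame({
    "timestamp": pd.date_range("2023-01-01", periods=1000, freq="H"),
    "value": np.random.lognormal(mean=3, sigma=1, size=1000),
})
df["hour"] = df["timestamp"].dt.hour   # the "context" is the hour of day

# Within each hourly segment, flag values above that segment's 99th percentile.
threshold = df.groupby("hour")["value"].transform(lambda v: v.quantile(0.99))
df["is_anomaly"] = df["value"] > threshold
print(df[df["is_anomaly"]].head())
```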

Anomaly Detection Techniques

Anomaly detection can be approached in many ways depending on the nature of the data and the circumstances. The following is a classification of some of those techniques.

Static Rules Approach

The simplest approach, and maybe the best one to start with, is using static rules. The idea is to identify a list of known anomalies and then write rules to detect them. Rule identification is done by a domain expert, by using pattern mining techniques, or by a combination of both.

Static rules are used with the hypothesis that anomalies follow the 80/20 rule, where most anomalous occurrences belong to a handful of anomaly types. If the hypothesis holds, then we can detect most anomalies by finding a few rules that describe them.

Implementing those rules can be done using one of the following three methods.

If the rules are simple and no inference is needed, you can code them using your favorite programming language (a sketch follows below).

If decisions need inference, you can use a rule-based or expert system (e.g. Drools).

If decisions have temporal conditions, you can use a Complex Event Processing system (e.g. WSO2 CEP, Esper).

Although simple, static rule-based systems tend to be brittle and complex. Furthermore, identifying those rules is often a complex and subjective task. Therefore, statistical or machine learning-based approaches, which learn the general rules automatically, are preferred over static rules.
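For the first option, here is a minimal sketch of hand-coded static rules in Python; the field names and thresholds are invented for illustration and would come from a domain expert in practice.

```python
def check_transaction(txn):
    """Return the names of the static rules this transaction violates."""
    violations = []
    if txn["amount"] > 10_000:                         # unusually large purchase
        violations.append("large_transaction")
    if txn["country"] != txn["card_country"]:          # card used abroad
        violations.append("foreign_country")
    if txn["hour"] < 5 and txn["amount"] > 1_000:      # large late-night purchase
        violations.append("large_late_night_purchase")
    return violations

print(check_transaction(
    {"amount": 12_500, "country": "US", "card_country": "US", "hour": 2}))
# ['large_transaction', 'large_late_night_purchase']
```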

Anomalies are rare under most conditions. Hence, even when training data is available, there will often be only a few dozen anomalies among millions of regular data points. Standard classification methods such as SVM or Random Forest will classify almost all data as normal, because doing so yields a very high accuracy score (e.g. 99.9% accuracy if anomalies are one in a thousand).

Generally, the class imbalance is addressed using an ensemble built by resampling the data many times. The idea is to first create new datasets by taking all anomalous data points and adding a subset of normal data points (e.g. four times as many as the anomalous data points). Then a classifier is built for each dataset using SVM or Random Forest, and those classifiers are combined using ensemble learning. This approach has worked well and produced very good results.
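A hedged sketch of this resampling ensemble, using synthetic data and scikit-learn's RandomForestClassifier; the ten ensemble members, 4x sampling ratio, and decision threshold are illustrative choices rather than prescriptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 5))                   # synthetic features
y = (rng.random(10_000) < 0.001).astype(int)       # ~1:1000 anomalies

anom_idx = np.where(y == 1)[0]
norm_idx = np.where(y == 0)[0]

# Each member sees all anomalies plus a random subset of normal points.
models = []
for seed in range(10):
    sub_norm = rng.choice(norm_idx, size=4 * len(anom_idx), replace=False)
    idx = np.concatenate([anom_idx, sub_norm])
    clf = RandomForestClassifier(n_estimators=100, random_state=seed)
    clf.fit(X[idx], y[idx])
    models.append(clf)

def predict_anomaly(X_new, threshold=0.5):
    # Average the anomaly-class probability across ensemble members.
    probs = np.mean([m.predict_proba(X_new)[:, 1] for m in models], axis=0)
    return (probs >= threshold).astype(int)

print(predict_anomaly(X[:5]))
```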

If the data points are autocorrelated with each other, simple classifiers will not work well. We handle those use cases using time series classification techniques or Recurrent Neural Networks.
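One common way to give a standard classifier some temporal context is to turn the series into overlapping lag-feature windows; the sketch below uses a made-up series, labels, and window length, and is only one of many possible time series classification setups.

```python
import numpy as np

def make_windows(series, labels, window=10):
    """Build (X, y) where each row holds the last `window` values and
    y is the label of the point that follows them."""
    X, y = [], []
    for i in range(window, len(series)):
        X.append(series[i - window:i])
        y.append(labels[i])
    return np.array(X), np.array(y)

# Illustrative noisy periodic series with hypothetical (here all-zero) labels.
series = np.sin(np.linspace(0, 50, 500)) + np.random.normal(0, 0.1, 500)
labels = np.zeros(500, dtype=int)
X, y = make_windows(series, labels)
print(X.shape, y.shape)   # (490, 10) (490,) -- ready for a standard classifier
```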

When you do not have training data, it is still possible to do anomaly detection using unsupervised learning and semi-supervised learning. However, after building the model, you will have no idea how well it is doing, as you have nothing to test it against. Hence, the results of those methods need to be tested in the field before placing them in the critical path.

Point anomalies involve only a single field in the dataset. We use percentiles to detect point anomalies in numeric data and histograms to detect point anomalies in categorical data. In either case, we find rare data ranges or field values in the data and flag them as anomalies when they occur again. For example, if the 99.9th percentile of my transaction values is $800, one can flag any transaction greater than that value as a potential anomaly. When building models, we often use moving averages instead of point values when possible, as they are much more robust to noise.
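A small sketch of the percentile and histogram ideas; the data and the frequency cutoff are illustrative, and the 99.9th percentile mirrors the example above.

```python
import numpy as np
from collections import Counter

# Numeric field: flag values above the 99.9th percentile of historical data.
values = np.random.lognormal(mean=4, sigma=1, size=100_000)   # synthetic history
cutoff = np.percentile(values, 99.9)
def is_point_anomaly(x):
    return x > cutoff

# Categorical field: flag values whose historical frequency is very low.
categories = ["web", "mobile", "web", "pos", "web", "mobile"] * 1000 + ["kiosk"]
counts = Counter(categories)
total = len(categories)
def is_rare_category(c, min_freq=0.001):
    return counts.get(c, 0) / total < min_freq

print(round(cutoff, 2), is_point_anomaly(cutoff * 2), is_rare_category("kiosk"))
```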

Time series data are the best examples of collective outliers in a univariate dataset. In this case, anomalies happen because values occur in an unexpected order. For example, the third heartbeat might be anomalous not because the values are out of range, but because they happen in the wrong order.