Anomaly Detection - Ensemble - Classifiers

Anomaly Detection

Outlier detection is the process of identifying data points that lie significantly far from the average and, depending on the goal, either removing or resolving them so they do not skew the analysis.
Outliers are most commonly caused by
➢ Dummy outliers created to test detection methods
➢ Data manipulation or unintended mutations of the data set
➢ Extracting or mixing data from wrong or varied sources
➢ Genuine novelties in the data (not errors)
➢ Data entry errors (human errors)
Difference Between Anomaly and Outlier
➢ An outlier is an actual data point that lies significantly outside a distribution’s mean or median.
➢ Extreme values in a data series are called outliers. They are questionable but possible: one student can genuinely be much more brilliant than the other students in the same class.
➢ An anomaly is a false data point generated by a different process than the rest of the data.
➢ Anomalies are unquestionably errors. For example, a reading of one million degrees outside, or an air temperature that stays exactly the same for two weeks. As a result, you disregard such data.
Types of Anomaly Detection
1. Point Anomaly
2. Contextual Anomaly
3. Collective Anomaly
Point Anomaly

A tuple within the dataset is called a point anomaly if it lies far away from the rest of the data.
Example: a sudden transaction of a very large amount on a credit card.
Contextual Anomaly
A contextual anomaly is also known as a conditional outlier.
If a particular observation differs from other data points only within a specific context, it is known as a contextual anomaly.
In this type of anomaly, a value that is anomalous in one context may not be anomalous in another context.
Collective Anomaly
Collective anomalies occur when a set of related data points is anomalous with respect to the whole dataset; such values are known as collective outliers.
In this type of anomaly, the individual values are not anomalous on their own, either globally or contextually, but their occurrence together is.
Categories of Anomaly Detection Techniques
Anomaly detection techniques are broadly categorized into two types:
1. Supervised anomaly detection
2. Unsupervised anomaly detection
Supervised Anomaly detection
➢ Supervised learning techniques use real-world input and output data
to detect anomalies.
➢ These types of anomaly detection systems require a data analyst to
label data points as either normal or abnormal to be used as training
data.
➢ A machine learning model trained with labeled data will be able to
detect outliers based on the examples it is given.
➢ This type of machine learning is useful for detecting known outliers but is not capable of discovering unknown anomalies or predicting future issues.
Unsupervised Anomaly detection
➢ Unsupervised learning techniques do not require labeled data and can
handle more complex data sets.
➢ They find patterns in the input data and make assumptions about which data is considered normal.
Anomaly detection techniques
➢ Density-based algorithms determine when an outlier differs from a larger, and hence denser, normal data set, using algorithms like K-nearest neighbors and Isolation Forest.
➢ Cluster-based algorithms evaluate how any point differs from clusters of related data using techniques like K-means cluster analysis.
➢ Bayesian network algorithms develop models for estimating the probability that events will occur based on related data and then identify significant deviations from these predictions.
➢ Neural network algorithms train a neural network to predict an expected time series and then flag deviations.
1. Isolation forest
2. Local outlier factor
3. Robust covariance
4. One-Class support vector machine (SVM)
5. One-Class SVM with stochastic gradient descent
(SGD)
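As a quick illustration of one of these methods, here is a minimal sketch using scikit-learn's IsolationForest; the toy data and the contamination value are illustrative assumptions.

import numpy as np
from sklearn.ensemble import IsolationForest

# toy 2-D data: a dense cluster plus two far-away points (assumed for illustration)
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, size=(100, 2)), [[6, 6], [-7, 5]]])

iso = IsolationForest(contamination=0.02, random_state=0)
labels = iso.fit_predict(X)      # -1 = anomaly, 1 = normal
print(X[labels == -1])           # the points flagged as anomalies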
Interquartile Range (IQR)
• IQR measures variability by dividing the dataset into four equal parts.
• First, the entire data set is sorted in ascending order.
• It is then split into four equal parts separated by the quartiles Q1, Q2, and Q3, and the interquartile range is IQR = Q3 − Q1.
• Points below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR are flagged as outliers.
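A minimal sketch of IQR-based outlier flagging with NumPy; the data array and the conventional 1.5 multiplier are illustrative assumptions.

import numpy as np

# illustrative data; replace with your own 1-D numeric array (assumption)
data = np.array([10, 12, 11, 13, 12, 95, 11, 10, 14, 13])

q1, q3 = np.percentile(data, [25, 75])            # first and third quartiles
iqr = q3 - q1                                     # interquartile range
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr     # outlier fences

outliers = data[(data < lower) | (data > upper)]
print(outliers)                                   # -> [95]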
Z-score Method
• The Z-score of a value is the difference between that value and the mean, divided by the standard deviation.
• Z-scores help identify outliers: a data point is flagged if its Z-score is less than -3 or greater than +3.
• The Z-score can be mathematically expressed as z = (x − μ) / σ, where μ is the mean and σ is the standard deviation.
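A minimal sketch of Z-score outlier flagging with NumPy; the ±3 threshold follows the slide, while the generated data is an illustrative assumption.

import numpy as np

# illustrative data: 50 typical readings plus one extreme value (assumption)
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(12, 1, size=50), [95]])

z = (data - data.mean()) / data.std()   # z = (x - mu) / sigma
outliers = data[np.abs(z) > 3]          # flag points with |z| > 3
print(outliers)                         # the extreme value is flagged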
Local Outlier Factor (LOF)
• Local Outlier Factor is an unsupervised machine learning technique that detects outliers based on the density of a data point's closest neighborhood; it works well when the spread (density) of the dataset is not uniform.
• LOF considers the K-distance (the distance between points) and the K-neighbors (the set of points lying within the circle of radius K-distance).
• LOF takes two major parameters:
(1) n_neighbors: the number of neighbors, which has a default value of 20
(2) contamination: the proportion of outliers in the given dataset, which can be set to ‘auto’ or a float value (e.g., 0.02, 0.005)
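A minimal sketch using scikit-learn's LocalOutlierFactor with the two parameters named above; the toy 2-D data is an illustrative assumption.

import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# toy 2-D data: a dense cluster plus a few far-away points (assumption)
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 0.5, size=(100, 2)), [[4, 4], [5, -4], [-5, 5]]])

lof = LocalOutlierFactor(n_neighbors=20, contamination='auto')
labels = lof.fit_predict(X)          # -1 = outlier, 1 = inlier
print(X[labels == -1])               # the points flagged as outliers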
Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
• Works with multiple numeric features (multivariate data).
• DBSCAN forms clusters from nearby data points based on high- and low-density regions, and it considers two main parameters:
(1) epsilon: the neighborhood radius around a data point, which can be chosen from a k-distance graph
(2) min_samples: the number of data points required within the epsilon radius, which depends on domain knowledge or expert advice
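A minimal sketch using scikit-learn's DBSCAN, where points labelled -1 are treated as noise/outliers; the eps and min_samples values and the toy data are illustrative assumptions.

import numpy as np
from sklearn.cluster import DBSCAN

# toy 2-D data: two dense blobs plus a couple of isolated points (assumption)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, size=(100, 2)),
               rng.normal(5, 0.3, size=(100, 2)),
               [[2.5, 2.5], [8, -3]]])

db = DBSCAN(eps=0.5, min_samples=5).fit(X)   # eps and min_samples are assumed values
outliers = X[db.labels_ == -1]               # label -1 marks noise points
print(outliers)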
IQR is the simplest technique and the easiest to explain mathematically.
It is good for identifying outliers in univariate and bivariate data, as it relies on the median and quartiles, which are robust to extreme values.
It is limited for multivariate datasets with a large number of numeric features.
The Z-score measures how far a raw value is from the mean in units of standard deviation, and it works best on normally distributed data sets.
When the dataset is not symmetric (left- or right-skewed), the Z-score technique may lead to erroneous results.
LOF (Local Outlier Factor) has an advantage when the data spread (density) is not uniform throughout the space, as it identifies outliers based on their proximity to neighboring dense regions, where global methods struggle.
However, explainability is an issue, as it is difficult to say at what threshold a data point can be considered an outlier.
DBSCAN does not require the number of clusters to be specified in advance and can detect anomalies where the data spread is arbitrarily distributed and not linearly separable.
However, it has limitations when working with data of varying density.
Difference between Supervised and Unsupervised Learning
Regression analysis in ML
Introduction
➢ Regression analysis is a statistical method for modeling the relationship between a dependent (target) variable and one or more independent (predictor) variables.
➢ More specifically, regression analysis helps us understand how the value of the dependent variable changes with respect to one independent variable while the other independent variables are held fixed.
➢ It predicts continuous/real values such as temperature, age, salary, price, etc.
Terminologies Related to Regression Analysis
• Dependent Variable
• Independent Variable
• Outliers
• Multicollinearity
• If the independent variables are highly correlated with each other, the condition is called multicollinearity.
• It should not be present in the dataset, because it creates problems when ranking the most influential variables.
• Underfitting and Overfitting
Why do we use Regression Analysis?
➢ Regression estimates the relationship between the target and the independent variables.
➢ It is used to find trends in the data.
➢ It helps to predict real/continuous values.
➢ By performing regression, we can confidently determine the most important factor, the least important factor, and how each factor affects the others. A minimal fitting sketch follows below.
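A minimal sketch of fitting a regression model with scikit-learn's LinearRegression; the experience-vs-salary data and variable names are illustrative assumptions.

import numpy as np
from sklearn.linear_model import LinearRegression

# illustrative data: years of experience (independent) vs salary in thousands (dependent)
X = np.array([[1], [2], [3], [4], [5], [6]])   # predictor
y = np.array([30, 35, 42, 48, 55, 61])         # continuous target

model = LinearRegression().fit(X, y)
print(model.coef_, model.intercept_)           # estimated relationship
print(model.predict([[7]]))                    # predict a continuous value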
Ensemble Learning
Ensemble learning is a machine learning technique that enhances accuracy and resilience in forecasting by merging predictions from multiple models.
It aims to mitigate errors or biases that may exist in individual models by leveraging the collective intelligence of the ensemble.
It combines the outputs of diverse models to create a more precise prediction.
Ensemble learning improves the overall performance of the learning system.
It not only enhances accuracy but also provides resilience against uncertainties in the data.
By effectively merging predictions from multiple models, ensemble learning has proven to be a powerful tool in various domains, offering more robust and reliable forecasts.
Simple Ensemble Techniques
1. Max Voting
2. Averaging
3. Weighted Averaging
Max Voting technique
➢ Used for classification problems.
➢ Multiple models are used to make predictions for each data point.
➢ The prediction made by each model is considered a ‘vote’.
➢ The prediction we get from the majority of the models is used as the final prediction.
Averaging technique
➢ The predictions from all models are averaged, and the average is used as the final prediction.
Weighted Averaging technique
➢ Each model's prediction is multiplied by a weight reflecting its importance, and the weighted average is used as the final prediction.
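A minimal NumPy sketch of the three simple combiners for a single sample, assuming three hypothetical classifiers that output the probability of the class “spam”; the probabilities and weights are illustrative.

import numpy as np

# predicted probability of "spam" from three hypothetical models (assumption)
probs = np.array([0.8, 0.4, 0.3])
votes = (probs >= 0.5).astype(int)           # hard predictions: 1 = spam, 0 = not spam

max_vote = np.bincount(votes).argmax()       # max voting: majority class -> 0 (not spam)
avg = probs.mean()                           # averaging -> 0.5

weights = np.array([0.5, 0.3, 0.2])          # assumed model weights
weighted_avg = np.average(probs, weights=weights)   # weighted averaging -> 0.58

print(max_vote, avg, weighted_avg)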
Advanced Ensemble Techniques
1. Stacking
2. Blending
3. Bagging
4. Boosting


Bagging
• Bagging, or Bootstrap Aggregation, is a parallel ensemble learning technique used to reduce the variance of the final prediction.
• It is similar to averaging; the only difference is that bagging uses random sub-samples of the original dataset to train the same or multiple models and then combines their predictions, whereas in averaging the same dataset is used to train the models.
• It is called Bootstrap Aggregation because it combines Bootstrapping (sampling of the data) and Aggregation to form an ensemble model.
In bagging, every base model is trained on a different subset of the data and all the results are combined, so the final model is less overfitted and variance is reduced.
Example: Random Forest
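A minimal sketch of bagging with scikit-learn's BaggingClassifier, whose default base learner is a decision tree; the synthetic dataset and parameter values are illustrative assumptions.

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

# synthetic classification data (assumption)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# each base learner is trained on a different bootstrap sample of the training data
bag = BaggingClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)
print(bag.score(X_test, y_test))   # accuracy of the aggregated ensemble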


Voting Technique: Hard and Soft
Hard Voting
▪ 3 classifiers out of 5 voted for the email being not spam.
▪ On the other hand, 2 out of 5 voted the email as spam.
▪ The class with the most votes is the final prediction: NOT SPAM.
Soft Voting
Soft voting considers the probability scores of each class predicted by the individual models and averages them to produce a more refined final prediction.
“Not spam” (0.502) > “spam” (0.498)
Final Prediction: “NOT SPAM”
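A minimal sketch of hard vs. soft voting using scikit-learn's VotingClassifier; the three base models and the synthetic data are illustrative assumptions.

from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)   # synthetic data (assumption)
estimators = [('lr', LogisticRegression(max_iter=1000)),
              ('nb', GaussianNB()),
              ('dt', DecisionTreeClassifier(random_state=0))]

hard = VotingClassifier(estimators, voting='hard').fit(X, y)  # majority of class votes
soft = VotingClassifier(estimators, voting='soft').fit(X, y)  # average of class probabilities
print(hard.predict(X[:1]), soft.predict_proba(X[:1]))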


Boosting
• Boosting is a sequential ensemble learning technique that converts weak base learners into strong learners that perform better and are less biased.
• Boosting is an iterative method that adjusts the weight of an observation based on the previous classification.
Types of boosting
➢ Adaptive boosting (AdaBoost)
▪ AdaBoost initially gives the same weight to each data point.
▪ Then, it automatically adjusts the weights of the data points after every decision tree.
▪ It gives more weight to incorrectly classified items so that they are corrected in the next round.
▪ It repeats until the residual error, or the difference between actual and predicted values, falls below an acceptable threshold.
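A minimal sketch of AdaBoost with scikit-learn's AdaBoostClassifier; the synthetic data and the n_estimators value are illustrative assumptions.

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)   # synthetic data (assumption)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# learners are added sequentially; misclassified points get larger weights each round
ada = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
print(ada.score(X_test, y_test))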


Gradient boosting
➢ Gradient boosting (GB) is similar to AdaBoost.
➢ GB does not give incorrectly classified items more weight.
➢ It optimizes a loss function by generating base learners sequentially, so that the current base learner is always more effective than the previous one.
➢ It attempts to generate accurate results from the start, rather than correcting errors throughout the process as AdaBoost does.
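A minimal sketch with scikit-learn's GradientBoostingClassifier; the data and hyperparameter values are illustrative assumptions.

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)   # synthetic data (assumption)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# each new tree is fit to the gradient of the loss of the current ensemble
gb = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                random_state=0).fit(X_train, y_train)
print(gb.score(X_test, y_test))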


Extreme gradient boosting (XGBoost)
• Improves gradient boosting for computational speed and scale in several ways.
• XGBoost uses multiple cores on the CPU so that learning can occur in parallel during training.
• It is a boosting algorithm that can handle extensive datasets, making it attractive for big data applications.
• The key features of XGBoost are parallelization, distributed computing, and cache optimization.
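A minimal sketch with the xgboost package's scikit-learn-style XGBClassifier, assuming xgboost is installed; the data and parameter values are illustrative assumptions.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=500, random_state=0)   # synthetic data (assumption)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# n_jobs=-1 uses all CPU cores so that trees are built in parallel
xgb = XGBClassifier(n_estimators=200, learning_rate=0.1, n_jobs=-1).fit(X_train, y_train)
print(xgb.score(X_test, y_test))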
Benefits of boosting
➢ Ease of implementation
➢ Reduction of bias (presence of uncertainty)
➢ Computational efficiency
Stacking
A technique that combines multiple machine learning algorithms via meta-learning (one algorithm learns from the outputs of other learning algorithms).
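A minimal sketch of stacking with scikit-learn's StackingClassifier, where a meta-learner is trained on the base models' predictions; the choice of base models, meta-learner, and data are illustrative assumptions.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=0)   # synthetic data (assumption)

base_models = [('rf', RandomForestClassifier(random_state=0)),
               ('svc', SVC(probability=True, random_state=0))]

# the meta-learner (logistic regression) learns from the base models' predictions
stack = StackingClassifier(estimators=base_models,
                           final_estimator=LogisticRegression()).fit(X, y)
print(stack.score(X, y))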
Hyperparameter Tuning Using Grid Search and Random Search in Python
• Hyperparameter optimization is a technique that involves searching through a range of hyperparameter values to find the combination that achieves the best performance on a given dataset.
• Two popular techniques used to perform hyperparameter optimization are:
• Grid search
• Random search
Grid Search
➢ We first need to define a parameter space or parameter grid that includes a set of possible hyperparameter values that can be used to build the model.
➢ Grid search places these hyperparameters in a matrix-like structure, and the model is trained on every combination of hyperparameter values.
➢ The model with the best performance is then selected.
➢ All possible combinations of the selected values are evaluated. The main arguments (those of scikit-learn's GridSearchCV) are listed below, followed by a short sketch.
1. estimator – A scikit-learn model.
2. param_grid – A dictionary with parameter names as keys and lists of parameter values.
3. scoring – The performance measure. For example, ‘r2’ for regression models, and ‘precision’ for classification models.
4. cv – An integer that is the number of folds for K-fold cross-validation.
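A minimal GridSearchCV sketch; the random forest estimator, the parameter grid, and the scoring choice are illustrative assumptions.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, random_state=0)   # synthetic data (assumption)

param_grid = {'n_estimators': [100, 200],    # every combination of these values is tried
              'max_depth': [None, 5, 10]}

grid = GridSearchCV(estimator=RandomForestClassifier(random_state=0),
                    param_grid=param_grid,
                    scoring='precision',
                    cv=5)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)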
Random Search
• Random search draws random samples from a grid (or from distributions) of hyperparameters instead of conducting an exhaustive search.
• You can specify the total number of runs the random search should try before returning the best model.
# rf and rs_space are assumed to be a random forest and its search space, for example:
from scipy.stats import randint
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

rf = RandomForestClassifier()
rs_space = {'n_estimators': randint(50, 500), 'max_depth': randint(2, 20)}

rf_random = RandomizedSearchCV(rf, rs_space, n_iter=500, scoring='accuracy',
                               n_jobs=-1, cv=3)
model_random = rf_random.fit(X, y)   # X, y: your feature matrix and labels
