Cluster Based Local Outlier Factor (CBLOF) : Use Squeezer Clustering Algorithm To Perform Clustering

Download as pdf or txt
Download as pdf or txt
You are on page 1of 4

Outlier detection for Large dataset

Cluster based Local Outlier Factor(CBLOF)

Use squeezer clustering algorithm to perform clustering.


Initially, the first object is read into a newcluster.
For the consequent objects, the similarity between the object
and the existing cluster is measured.
If the max-similarity is > S add object to the cluster else add
new cluster.
Outlier detection for Large dataset

Cluster based Local Outlier Factor(CBLOF)

Definition-2
(large and small cluster).
The set of clusters in the sequence that |C1 | > |C2 |.... > |Ck |.
Given two numeric parameters α and β,
define b as the boundary of large and small cluster if one of
the following formulas holds.

(|C1 | + |C2 | + ... + |Cb | > |D| ∗ α (1)

|Cb |/|Cb+1 | > β (2)


set of large cluster LC = {Ci |i 6 b}
set of small cluster SC = {Cj |j > b}.
Outlier detection for Large dataset

Cluster based Local Outlier Factor(CBLOF)

Definition-3
(Cluster-based local outlier factor). For any object O, the
cluster-based local outlier factor of O is defined as:




 |Ci | ∗ min(distance(O, Cj )) where O ∈ Ci , Ci ∈ SC
and Cj ∈ LC




CBLOF (O) = for j = 1 to b

|Ci | ∗ (distance(O, Ci )) where O ∈ Ci





 and Ci ∈ LC
Outlier detection for Large dataset

FindCBLOF

Input: Data set D(A1 , ..., Am ) and parameters α and β


Output: Values of CBLOF for all records.
Steps
Clustering the Data set D, Producing {C = C1 , C2 , ..., Ck } and
|C1 | > |C2 |.... > |Ck |
Calculate CBLOF for object O
Greater the valve of CBLOF more the deviation.

You might also like