ALGORITHM 1 - Bagged and Voted Local Outlier Detection (BV LOF) |
INPUTS: T (number of iterations, i.e., the ensemble size) |
D = {X1, X2, ..., Xn} (the entire dataset), where n is the number of instances |
F = {F1, F2, ..., Fd} (feature set), where d is the dimension of the dataset |
OUTPUT: O = {o1, o2, ..., op} (the set of objects labeled as outliers) |
|
for i = 1 to T do |
Randomly determine subset size R in [d/, d-1] |
for j = 1 to R do |
ft = randomly select a feature without replacement from F |
Si = Si ∪ {ft} |
end for |
Generate Di that includes the features in the subset Si |
for each neighborhood size k in [1, 100] do |
Apply LOF(k) on Di |
Obtain output vectors O(Di, k) |
end for |
for each object o in O(Di, k) do |
// find highest total vote |
$h_i(o) = \operatorname{argmax}_{y \in Y} \sum_{k=1}^{100} v, \quad \text{where } Y = \{1, -1\} \text{ and } v = \begin{cases} 1 & \text{if } h_k(o) = -1 \ (\text{outlier}) \\ 0 & \text{if } h_k(o) = 1 \ (\text{inlier}) \end{cases}$ |
Obtain single output vector O(Di) for dataset Di |
end for |
O(D) = O(D) ∪ O(Di) |
end for |
for each object o in O(D) do |
// find highest total vote |
$h(o) = \operatorname{argmax}_{y \in Y} \sum_{t=1}^{T} v, \quad \text{where } Y = \{1, -1\} \text{ and } v = \begin{cases} 1 & \text{if } h_t(o) = -1 \ (\text{outlier}) \\ 0 & \text{if } h_t(o) = 1 \ (\text{inlier}) \end{cases}$ |
Obtain a single output vector O representing all outliers in the dataset D |
end for |
Return O |
|
END ALGORITHM |
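
The two-level voting scheme in Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names `lof_labels` and `bv_lof`, the `threshold` parameter that turns LOF scores into labels, and the `seed` parameter are all hypothetical. The lower bound of the subset size is truncated in the listing above, so the sketch assumes d/2 (rounded down). The argmax over votes is read as a simple majority, with ties resolved as inlier. LOF is written in plain Python with exactly k neighbors per point, which simplifies the usual handling of distance ties.

```python
import math
import random


def lof_labels(X, k, threshold=1.5):
    """Simplified LOF(k): return -1 (outlier) or 1 (inlier) per row of X.

    Assumption: a point is an outlier when its LOF score exceeds `threshold`.
    """
    n = len(X)
    dist = [[math.dist(a, b) for b in X] for a in X]
    # Exactly k nearest neighbors of each point, excluding the point itself.
    nbrs = [sorted((j for j in range(n) if j != i), key=dist[i].__getitem__)[:k]
            for i in range(n)]
    kdist = [dist[i][nbrs[i][-1]] for i in range(n)]
    # Local reachability density; the epsilon guards against duplicate points.
    lrd = [1.0 / (sum(max(kdist[j], dist[i][j]) for j in nbrs[i]) / k + 1e-12)
           for i in range(n)]
    lof = [sum(lrd[j] for j in nbrs[i]) / (k * lrd[i]) for i in range(n)]
    return [-1 if score > threshold else 1 for score in lof]


def bv_lof(X, T=10, k_values=range(1, 101), threshold=1.5, seed=0):
    """Return indices of rows voted as outliers across T random feature subsets."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    ks = [k for k in k_values if 1 <= k < n]  # k must be smaller than n
    round_votes = [0] * n
    for _ in range(T):
        # Assumed subset-size range [d/2, d-1]; the lower bound is a guess.
        R = rng.randint(max(1, d // 2), max(1, d - 1))
        S = rng.sample(range(d), R)  # features drawn without replacement
        D_i = [[row[f] for f in S] for row in X]
        # Inner vote: count, per object, how many k values flag it.
        k_votes = [0] * n
        for k in ks:
            for o, label in enumerate(lof_labels(D_i, k, threshold)):
                k_votes[o] += (label == -1)
        # Majority over k gives this round's label h_i(o).
        for o in range(n):
            if 2 * k_votes[o] > len(ks):
                round_votes[o] += 1
    # Outer vote: majority over the T rounds gives the final label h(o).
    return [o for o in range(n) if 2 * round_votes[o] > T]
```

Because each round stores only a vote count per object, memory stays linear in n regardless of T or the number of k values. The pairwise distance matrix inside `lof_labels` is quadratic in n, so this sketch suits only small datasets.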