Outliers.Rd
Computes outlierness scores and detects outliers.
Outliers(prox, cls=NULL, data=NULL, threshold=10)
prox: A proximity matrix (a square matrix with 1 on the diagonal and values between 0 and 1 in the off-diagonal positions).
cls: Factor. The classes the rows in the proximity matrix belong to. If NULL (default), all data are assumed to come from the same class.
data: A data frame of variables to describe the outliers (optional).
threshold: Numeric. The value of outlierness above which an observation is considered an outlier. Default is 10.
Within each class, the outlierness score of a case is computed as n / sum(squared proximities), then normalized by subtracting the median and dividing by the MAD (median absolute deviation).
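The computation described above can be sketched in a few lines of base R. This is only an illustration, not the package's code: `outlierness_sketch` is a hypothetical name, the toy proximity matrix is made up, and the sketch assumes that the sum of squared proximities runs over cases of the same class (including the case itself), as in the `outlier` function of randomForest.

```r
# Hypothetical helper, not part of the package: a sketch of the score described above.
outlierness_sketch <- function(prox, cls = NULL) {
  n <- nrow(prox)
  if (is.null(cls)) cls <- rep(1, n)      # no classes given: treat all cases as one class
  cls <- factor(cls)
  scores <- rep(NA_real_, n)
  for (lev in levels(cls)) {
    in_class <- cls == lev
    # Assumption: squared proximities are summed over cases of the same class
    ss <- rowSums(prox[in_class, in_class, drop = FALSE]^2)
    raw <- n / ss
    # Robust standardisation within the class: subtract the median, divide by the MAD
    scores[in_class] <- (raw - median(raw)) / mad(raw)
  }
  scores
}

# Toy proximity matrix (made up): five nearby points and one isolated point
x <- c(0, 0.1, 0.25, 0.45, 0.7, 3)
prox <- exp(-abs(outer(x, x, "-")))       # symmetric, 1 on the diagonal, values in (0, 1]
scores <- outlierness_sketch(prox)
which.max(scores)                         # the isolated sixth point gets the largest score
```

A case with low proximity to the rest of its class has a small sum of squared proximities, hence a large raw score. The median/MAD step then puts the scores of different classes on a comparable scale.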
A list with the following elements:
scores: numeric vector containing the outlierness scores
outliers: numeric vector of indexes of the outliers, or a data frame with the outliers and their characteristics
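To make the role of the threshold concrete, here is a minimal sketch that flags cases from a vector of scores. The five scores are copied from the example output on this page (cases 71, 107, 120, 134 and 135). The strict `>` comparison is an assumption based on the wording "above which", and the variable names are illustrative only.

```r
# Scores copied from the example output for five cases (names are the row indexes)
scores <- c("71" = 12.625934451, "107" = 11.218768139, "120" = 11.059166013,
            "134" = 9.635256495, "135" = 9.975009960)

threshold <- 10                           # the documented default

# Assumption: a case is flagged when its score is strictly above the threshold
flagged <- names(scores)[scores > threshold]
flagged
```

Cases 134 and 135 fall just below the default threshold, which is why they do not appear among the outliers in the example output; a lower `threshold` would include them.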
The code is adapted from the outlier function in the randomForest package.
data(iris)
iris2 = iris
iris2$Species = factor(iris$Species == "versicolor")
iris.cf = party::cforest(Species ~ ., data = iris2,
control = party::cforest_unbiased(mtry = 2, ntree = 50))
prox = party::proximity(iris.cf)
Outliers(prox, iris2$Species, iris2[,1:4])
#> $scores
#> 1 2 3 4 5 6
#> -0.759868093 -0.398265995 -0.658214419 -0.556174807 -0.759868093 -0.716988652
#> 7 8 9 10 11 12
#> -0.760467134 -0.760467134 0.587598728 -0.556174807 -0.750177865 -0.741403997
#> 13 14 15 16 17 18
#> -0.398265995 -0.398265995 -0.697888992 -0.683345714 -0.736670939 -0.759868093
#> 19 20 21 22 23 24
#> -0.676743670 -0.759868093 -0.731385744 -0.746325592 -0.759868093 -0.639749869
#> 25 26 27 28 29 30
#> -0.519403023 -0.347176475 -0.727663492 -0.759868093 -0.760467134 -0.635381846
#> 31 32 33 34 35 36
#> -0.529718944 -0.737300070 -0.759868093 -0.697888992 -0.556174807 -0.647483992
#> 37 38 39 40 41 42
#> -0.697888992 -0.759868093 -0.398265995 -0.760467134 -0.759868093 0.640937667
#> 43 44 45 46 47 48
#> -0.658214419 -0.385906914 -0.500453462 -0.398265995 -0.740740619 -0.658214419
#> 49 50 51 52 53 54
#> -0.759868093 -0.718993200 1.285039808 1.237550369 2.584707454 -0.427017322
#> 55 56 57 58 59 60
#> 1.146391714 -0.528093882 1.730405212 0.012874693 -0.012874693 0.595192629
#> 61 62 63 64 65 66
#> -0.018640221 0.725170319 -0.253536168 0.928625478 -0.687423173 0.828725711
#> 67 68 69 70 71 72
#> 0.909624285 -0.530465561 1.785774383 -0.649699038 12.625934451 -0.661558346
#> 73 74 75 76 77 78
#> 3.833558246 0.120541363 -0.295311655 0.520502736 2.275255242 6.187624062
#> 79 80 81 82 83 84
#> 0.787812223 -0.487024454 -0.539900798 -0.311534381 -0.747018196 4.751290309
#> 85 86 87 88 89 90
#> 1.340574474 2.090237679 1.119544085 -0.076461113 0.030913088 -0.590975246
#> 91 92 93 94 95 96
#> -0.606081596 0.579389213 -0.738733779 -0.018640221 -0.735161803 -0.028874594
#> 97 98 99 100 101 102
#> -0.773286140 -0.555059562 -0.438774404 -0.773286140 0.430930297 0.805486072
#> 103 104 105 106 107 108
#> 0.000000000 1.093704377 0.005521489 0.000000000 11.218768139 0.864708073
#> 109 110 111 112 113 114
#> 0.864708073 0.373211628 0.372606525 0.270344658 0.000000000 1.138257621
#> 115 116 117 118 119 120
#> 0.774764506 0.151526513 0.578954245 0.373211628 0.238577684 11.059166013
#> 121 122 123 124 125 126
#> 0.104333108 1.442714012 0.238577684 1.765007878 0.169892404 0.759062176
#> 127 128 129 130 131 132
#> 2.076862265 2.932056817 0.235692682 5.177557496 0.274431304 0.373211628
#> 133 134 135 136 137 138
#> 0.235692682 9.635256495 9.975009960 0.000000000 0.603417592 0.672237849
#> 139 140 141 142 143 144
#> 3.011239164 0.030009996 0.027196878 0.264392646 0.805486072 0.104333108
#> 145 146 147 148 149 150
#> 0.169892404 0.037850874 0.982433234 0.045961286 0.698449659 1.502789718
#>
#> $outliers
#> rowname Sepal.Length Sepal.Width Petal.Length Petal.Width scores
#> 1 71 5.9 3.2 4.8 1.8 12.62593
#> 2 107 4.9 2.5 4.5 1.7 11.21877
#> 3 120 6.0 2.2 5.0 1.5 11.05917
#>