Prototypes of groups (Prototypes, moreparty package)

Prototypes are 'representative' cases of a group of data points, given the similarity matrix among the points. They are very similar to medoids.

Prototypes(label, x, prox, nProto = 5, nNbr = floor((min(table(label)) - 1)/nProto))

Arguments

label: the response variable. Should be a factor.
x: matrix or data frame of predictor variables.
prox: the proximity (or similarity) matrix, assumed to be symmetric with 1 on the diagonal and values in [0, 1] off the diagonal. The order of its rows and columns must match that of x.
nProto: number of prototypes to compute for each value of the response variable.
nNbr: number of nearest neighbors used to find the prototypes.
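
As a rough illustration of the default for nNbr and of the assumptions on prox, the base R sketch below computes the default from the usage line for a two-class version of iris and checks a proximity matrix. The helper check_prox and the distance-based toy matrix are hypothetical and are not part of moreparty. In practice prox would normally come from a fitted forest.

```r
# Default nNbr as given in the usage line: floor((min(table(label)) - 1) / nProto)
label <- factor(iris$Species == "versicolor")    # 100 FALSE, 50 TRUE
nProto <- 5
nNbr <- floor((min(table(label)) - 1) / nProto)  # floor(49 / 5)

# Hypothetical helper (not part of moreparty): checks the stated assumptions on prox
check_prox <- function(prox, x) {
  stopifnot(
    nrow(prox) == nrow(x), ncol(prox) == nrow(x),                # one row/column per case
    isTRUE(all.equal(prox, t(prox), check.attributes = FALSE)),  # symmetric
    all(diag(prox) == 1),                                        # 1 on the diagonal
    all(prox >= 0 & prox <= 1)                                   # values in [0, 1]
  )
  invisible(TRUE)
}

# Toy similarity matrix built from Euclidean distances, for illustration only
d <- as.matrix(dist(iris[, 1:4]))
toy_prox <- 1 - d / max(d)
check_prox(toy_prox, iris[, 1:4])
```

With the smallest class holding 50 cases and nProto = 5, the default works out to 9 neighbors per prototype.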

Details

For each case in x, the nNbr nearest neighbors are found. Then, for each class, the case that has the most neighbors of that class is identified. The prototype for that class is then the medoid of these neighbors (coordinate-wise medians for numerical variables and modes for categorical variables). The neighbors used are then removed and the first steps are repeated to find a second prototype, and so on.
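
The steps above can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's code: the function name sketch_prototypes is hypothetical, it handles numeric predictors and a single class only, and the way used neighbors are dropped between iterations is an assumption about the description rather than a confirmed detail.

```r
# Hypothetical helper, not part of moreparty: numeric predictors, one class at a time.
sketch_prototypes <- function(label, x, prox, cls, nProto = 2, nNbr = 5) {
  x <- as.matrix(x)
  label <- as.character(label)
  available <- rep(TRUE, nrow(x))   # cases not yet used by a prototype
  protos <- matrix(NA_real_, nProto, ncol(x),
                   dimnames = list(NULL, colnames(x)))
  for (p in seq_len(nProto)) {
    cand <- which(available)
    if (length(cand) == 0) break
    k <- min(nNbr, length(cand))
    # For each available case, its k most similar available cases (itself included)
    nbrs <- lapply(cand, function(i) {
      cand[order(prox[i, cand], decreasing = TRUE)[seq_len(k)]]
    })
    # Count how many of those neighbors belong to class `cls`
    counts <- vapply(nbrs, function(idx) sum(label[idx] == cls), numeric(1))
    best <- nbrs[[which.max(counts)]]
    members <- best[label[best] == cls]
    if (length(members) == 0) break
    # Prototype: coordinate-wise median of the same-class neighbors
    protos[p, ] <- apply(x[members, , drop = FALSE], 2, median)
    available[members] <- FALSE     # assumption: remove the neighbors just used
  }
  protos
}

# Toy data: two well-separated groups of four points each
x <- rbind(cbind(a = c(0, 0, 1, 1), b = c(0, 1, 0, 1)),
           cbind(a = c(10, 10, 11, 11), b = c(10, 11, 10, 11)))
label <- factor(rep(c("A", "B"), each = 4))
d <- as.matrix(dist(x))
prox <- 1 - d / max(d)              # similarity in [0, 1], 1 on the diagonal
protoA <- sketch_prototypes(label, x, prox, cls = "A", nProto = 1, nNbr = 4)
protoB <- sketch_prototypes(label, x, prox, cls = "B", nProto = 1, nNbr = 4)
```

On this toy data each prototype is the coordinate-wise median of the four points in its group. Categorical predictors would need a mode in place of the median, which the sketch leaves out.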

Value

A list of data frames with prototypes. The number of data frames is equal to the number of classes of the response variable.

References

Random Forests, by Leo Breiman and Adele Cutler https://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#prototype

Author

Nicolas Robette

Note

The code is an extension of the classCenter function in the randomForest package.

Examples

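  data(iris)
  iris2 = iris
  # Binary response: versicolor vs. the other two species
  iris2$Species = factor(iris$Species == "versicolor")
  iris.cf = party::cforest(Species ~ ., data = iris2,
            control = party::cforest_unbiased(mtry = 2, ntree = 50))
  # Proximity matrix from the fitted forest (rows/columns in the order of iris2)
  prox = party::proximity(iris.cf)
  Prototypes(iris2$Species, iris2[, 1:4], prox)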
#> $`FALSE`
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width
#> [1,] "6.7"        "2.8"       "5.8"        "1.9"      
#> [2,] "4.8"        "3"         "1.4"        "0.2"      
#> [3,] "5"          "3.4"       "1.4"        "0.2"      
#> [4,] "6.75"       "2.55"      "5.65"       "2.05"     
#> [5,] "4.7"        "3.2"       "1.4"        "0.2"      
#> 
#> $`TRUE`
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width
#> [1,] "5.5"        "2.4"       "3.7"        "1.1"      
#> [2,] "5.6"        "2.7"       "3.9"        "1.4"      
#> [3,] "5.9"        "2.7"       "4.2"        "1.4"      
#> [4,] "6.1"        "2.9"       "4.5"        "1.3"      
#> [5,] "6.3"        "2.9"       "4.6"        "1.4"      
#>