This function displays the results of the variable selection process for each split of a conditional inference tree, i.e. the p-values from permutation tests of independence between each predictor and the dependent variable. This may help to assess the stability of the tree.

GetSplitStats(ct)

Arguments

ct

A tree of class constparty (as returned by ctree from the partykit package).

Value

A list of two elements:

details

a list of data frames (one for each inner node), with one row per candidate variable and four columns: the test statistic and the p-value of the permutation test of independence, the criterion (equal to log(1-p)) and the ratio (criterion/max(criterion)). Variables are sorted by decreasing degree of association with the dependent variable.

summary

a data frame with one row per inner node and 4 variables: the node id (node), the splitting variable (split_var), the best candidate to split among the other variables (best_var), and the ratio of the criterion of the best variable among the others to the criterion of the splitting variable (ratio).
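
As a rough illustration of how the returned list can be navigated, the sketch below flags inner nodes where the runner-up variable is close to the splitting variable. It does not call GetSplitStats: `res` is a hand-built stand-in that holds only the ratio column of two nodes, copied from the Examples output, and the cutoff of 2 is an arbitrary assumption rather than a recommended value.

```r
# Hand-built stand-in for the `details` element returned by GetSplitStats();
# ratio values are copied from nodes 2 and 5 of the Examples output.
res <- list(details = list(
  `2` = data.frame(
    ratio = c(1, 26.46048, 1872.34587, 15288.11407),
    row.names = c("Petal.Width", "Petal.Length", "Sepal.Length", "Sepal.Width")
  ),
  `5` = data.frame(
    ratio = c(1, 1.483682, 7.633328, 9.134987),
    row.names = c("Sepal.Width", "Petal.Length", "Petal.Width", "Sepal.Length")
  )
))

# Rows are sorted by decreasing association, so row 2 is the runner-up.
runner_up <- data.frame(
  node     = names(res$details),
  best_var = sapply(res$details, function(d) rownames(d)[2]),
  ratio    = sapply(res$details, function(d) d$ratio[2]),
  row.names = NULL,
  stringsAsFactors = FALSE
)

# Hypothetical cutoff: treat a ratio below 2 as tight competition.
tight_nodes <- runner_up$node[runner_up$ratio < 2]
print(tight_nodes)
```

With these values only node 5 falls below the cutoff, which agrees with the small ratio shown for that node in the summary.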

Details

The ratio index compares the association test result (the criterion) of a candidate variable with that of the splitting variable. Because the criterion is never positive and is closest to zero for the splitting variable, the ratio is never lower than 1. The closer it is to 1, the tighter the competition for the splitting variable, and therefore the more potentially unstable the node concerned. Conversely, the higher the ratio, the more the splitting variable has dominated the competition, and the more stable the node is likely to be.
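
To make the arithmetic concrete, the following sketch recomputes the criterion and the ratio from the p-values reported for node 5 in the Examples output. It uses base R only. `log1p(-p)` is used as a numerically safer equivalent of log(1-p) for very small p-values; this is a choice made for the sketch and not necessarily what the function does internally.

```r
# p-values for node 5, copied from the Examples output
p <- c(Sepal.Width = 0.01401776, Petal.Length = 0.02072722,
       Petal.Width = 0.10215623, Sepal.Length = 0.12098913)

# criterion = log(1 - p); log1p(-p) is the same quantity, computed more accurately
criterion <- log1p(-p)

# ratio = criterion / max(criterion); the maximum is the value closest to zero
ratio <- criterion / max(criterion)

print(round(cbind(p.value = p, criterion, ratio), 4))
```

The ratio obtained for Petal.Length is about 1.48, the value reported for node 5 in the summary.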

References

Hothorn T, Hornik K, Van De Wiel MA, Zeileis A. "A Lego System for Conditional Inference". The American Statistician, 60:257-263, 2006.

Hothorn T, Hornik K, Zeileis A. "Unbiased Recursive Partitioning: A Conditional Inference Framework". Journal of Computational and Graphical Statistics, 15(3):651-674, 2006.

Author

Nicolas Robette

Note

See also https://stats.stackexchange.com/questions/171301/interpreting-ctree-partykit-output-in-r

See also

ctree

Examples

  data(iris)
  iris2 = iris
  iris2$Species = factor(iris$Species == "versicolor")
  iris.ct = partykit::ctree(Species ~ ., data = iris2)
  GetSplitStats(iris.ct)
#> $details
#> $details$`1`
#>               statistic      p.value     criterion    ratio
#> Sepal.Width  32.5931798 4.544509e-08 -4.544509e-08        1
#> Petal.Length  6.0650240 5.402367e-02 -5.553773e-02  1222084
#> Petal.Width   2.0711194 4.782671e-01 -6.505996e-01 14316167
#> Sepal.Length  0.9392437 8.014468e-01 -1.616698e+00 35574759
#> 
#> $details$`2`
#>              statistic      p.value     criterion       ratio
#> Petal.Width  17.916506 9.232106e-05 -9.232533e-05     1.00000
#> Petal.Length 11.743513 2.439991e-03 -2.442972e-03    26.46048
#> Sepal.Length  4.123331 1.587488e-01 -1.728649e-01  1872.34587
#> Sepal.Width   1.086125 7.562178e-01 -1.411480e+00 15288.11407
#> 
#> $details$`5`
#>              statistic    p.value   criterion    ratio
#> Sepal.Width   8.514636 0.01401776 -0.01411693 1.000000
#> Petal.Length  7.800708 0.02072722 -0.02094505 1.483682
#> Petal.Width   4.917877 0.10215623 -0.10775920 7.633328
#> Sepal.Length  4.613277 0.12098913 -0.12895802 9.134987
#> 
#> 
#> $summary
#>   node   split_var     best_var   ratio
#> 1    1 Sepal.Width Petal.Length 1.2e+06
#> 2    2 Petal.Width Petal.Length   26.46
#> 3    5 Sepal.Width Petal.Length    1.48
#>