Measures the association between a categorical variable and some continuous and/or categorical variables

catdesc(y, x, weights=rep(1,length(y)), min.phi=NULL, 
robust=TRUE, nperm=NULL, distrib="asympt", dec=c(3,3,3,3,1,3))

Arguments

y

the categorical variable to describe (must be a factor)

x

a data frame with continuous and/or categorical variables

weights

an optional numeric vector of weights (by default, a vector of 1s, i.e. uniform weights)

min.phi

for the relationship between y and a categorical variable, only associations greater than or equal to min.phi are displayed. If NULL (default), all associations are displayed.

robust

logical. If TRUE (default), the median and the median absolute deviation (mad) are used. If FALSE, the mean and standard deviation are used instead.

nperm

numeric. Number of permutations for the permutation test of independence. If NULL (default), no permutation test is performed.

distrib

the null distribution of the permutation test of independence can be approximated by its asymptotic distribution ("asympt", default) or via Monte Carlo resampling ("approx").

dec

vector of 6 integers giving the number of decimals. The first value is for association measures, the second for permutation p-values, the third for percents, the fourth for phi coefficients, the fifth for medians and mads, the sixth for point biserial correlations. Default is c(3,3,3,3,1,3).
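
The nperm and distrib="approx" arguments refer to a Monte Carlo permutation test of independence. As a rough illustration of the idea only, and not of how catdesc computes its p-values, the base R sketch below shuffles y and recomputes a chi-squared statistic each time. The helper name perm_pvalue, the choice of statistic and the toy data are assumptions made for this example.

```r
# Hypothetical helper (not part of the package): Monte Carlo permutation
# p-value for independence between two factors, based on the chi-squared
# statistic. Assumes every factor level is observed at least once.
perm_pvalue <- function(y, x, nperm = 1000) {
  chi2 <- function(a, b) {
    tab <- table(a, b)
    expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
    sum((tab - expected)^2 / expected)
  }
  observed <- chi2(y, x)
  # Shuffling y breaks any link with x while keeping both margins fixed
  permuted <- replicate(nperm, chi2(sample(y), x))
  # Add 1 to numerator and denominator so the estimate is never exactly 0
  (sum(permuted >= observed) + 1) / (nperm + 1)
}

set.seed(1)
x     <- factor(rep(c("a", "b"), each = 50))
y_dep <- factor(ifelse(x == "a", "yes", "no"))      # fully determined by x
y_ind <- factor(sample(c("yes", "no"), 100, TRUE))  # drawn independently of x

p_dep <- perm_pvalue(y_dep, x, nperm = 200)  # expected to be very small
p_ind <- perm_pvalue(y_ind, x, nperm = 200)
```

A larger nperm gives a finer-grained p-value at the cost of computing time; the smallest value this sketch can return is 1 / (nperm + 1).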

Value

A list with the following items:

variables

associations between y and the variables in x

bylevel

a list with one element for each level of y

Each element in bylevel has the following items:

categories

a data frame with categorical variables from x and associations measured by phi

continuous.var

a data frame with continuous variables from x and associations measured by point biserial correlation coefficients
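
The phi and cor columns in bylevel can be read as correlations with a 0/1 indicator of one level of y. Below is a minimal base R sketch of that reading, assuming unweighted data and the usual Pearson-based definitions; catdesc itself may compute these values differently, and the toy data and variable names are invented for illustration.

```r
# Toy data, invented for illustration
y      <- factor(c("No", "No", "No", "Yes", "Yes", "Yes", "Yes", "No"))
genre  <- factor(c("Action", "Action", "Drama", "Drama",
                   "Drama", "Action", "Drama", "Action"))
budget <- c(20, 15, 8, 2, 3, 12, 1, 18)

is_yes   <- as.numeric(y == "Yes")        # indicator of one level of y
is_drama <- as.numeric(genre == "Drama")  # indicator of one category of x

# Phi coefficient: Pearson correlation between two 0/1 indicators
phi_yes_drama <- cor(is_yes, is_drama)    # 0.5 for this toy data

# Point biserial correlation: Pearson correlation between an indicator
# and a continuous variable
pb_yes_budget <- cor(is_yes, budget)      # negative: "Yes" rows have lower budgets

# For a two-level y, the other level gets the same values with opposite sign
phi_no_drama <- cor(as.numeric(y == "No"), is_drama)
```

The sign flip between levels is the same pattern that appears in a by-level description of a two-level factor: each coefficient for one level is the negative of the coefficient for the other.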

References

Rakotomalala R., 'Comprendre la taille d'effet (effect size)', http://eric.univ-lyon2.fr/~ricco/cours/slides/effect_size.pdf

Author

Nicolas Robette

Examples

data(Movies)
catdesc(Movies$ArtHouse, Movies[,c("Budget","Genre","Country")])
#> $variables
#>   variable    measure association permutation.pvalue
#> 1    Genre Cramer's V       0.554                 NA
#> 2  Country Cramer's V       0.469                 NA
#> 3   Budget       Eta2       0.181                 NA
#>
#> $bylevel
#> $bylevel$No
#> $bylevel$No$categories
#>           categories pct.ycat.in.xcat pct.xcat.in.ycat pct.xcat.global    phi
#>  1       Country.USA            0.865            0.500           0.297  0.457
#>  5      Genre.Comedy            0.725            0.313           0.222  0.226
#>  6      Genre.Action            0.745            0.239           0.165  0.206
#>  7       Genre.SciFi            0.898            0.086           0.049  0.174
#>  8      Genre.Horror            1.000            0.049           0.025  0.156
#> 10   Genre.Animation            0.826            0.074           0.046  0.137
#> 12       Genre.Other            0.577            0.029           0.026  0.021
#> 13    Country.Europe            0.542            0.076           0.072  0.015
#> 16     Country.Other            0.231            0.012           0.026 -0.093
#> 18     Genre.ComDram            0.336            0.097           0.149 -0.149
#> 23 Genre.Documentary            0.117            0.018           0.077 -0.229
#> 24       Genre.Drama            0.203            0.095           0.241 -0.350
#> 25    Country.France            0.350            0.412           0.605 -0.405
#>
#> $bylevel$No$continuous.var
#>   variables median.x.in.ycat median.x.global mad.x.in.ycat mad.x.global   cor
#> 1    Budget         18252352         7040471      12713051      5920656 0.426
#>
#>
#> $bylevel$Yes
#> $bylevel$Yes$categories
#>           categories pct.ycat.in.xcat pct.xcat.in.ycat pct.xcat.global    phi
#>  2    Country.France            0.650            0.809           0.605  0.405
#>  3       Genre.Drama            0.797            0.395           0.241  0.350
#>  4 Genre.Documentary            0.883            0.140           0.077  0.229
#>  9     Genre.ComDram            0.664            0.204           0.149  0.149
#> 11     Country.Other            0.769            0.041           0.026  0.093
#> 14    Country.Europe            0.458            0.068           0.072 -0.015
#> 15       Genre.Other            0.423            0.023           0.026 -0.021
#> 17   Genre.Animation            0.174            0.016           0.046 -0.137
#> 19      Genre.Horror            0.000            0.000           0.025 -0.156
#> 20       Genre.SciFi            0.102            0.010           0.049 -0.174
#> 21      Genre.Action            0.255            0.086           0.165 -0.206
#> 22      Genre.Comedy            0.275            0.126           0.222 -0.226
#> 26       Country.USA            0.135            0.082           0.297 -0.457
#>
#> $bylevel$Yes$continuous.var
#>   variables median.x.in.ycat median.x.global mad.x.in.ycat mad.x.global    cor
#> 2    Budget          2481788         7040471       1702911      5920656 -0.426
#>
#>
#>
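
The two measures named in the $variables table, Cramer's V and Eta2, can be sketched in base R as follows. This is an illustration of the textbook unweighted definitions on invented toy data, not a reproduction of catdesc's internals, which also accept weights; all object names are made up for the example.

```r
# Toy data, invented for illustration
y      <- factor(c("No", "No", "No", "Yes", "Yes", "Yes", "Yes", "No"))
genre  <- factor(c("Action", "Action", "Drama", "Drama",
                   "Drama", "Action", "Drama", "Action"))
budget <- c(20, 15, 8, 2, 3, 12, 1, 18)

# Cramer's V for two categorical variables:
# sqrt(chi-squared / (n * (min(number of rows, number of columns) - 1)))
tab  <- table(y, genre)
# Warnings about small expected counts are silenced for this tiny example
chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
cramers_v <- sqrt(as.numeric(chi2) / (sum(tab) * (min(dim(tab)) - 1)))

# Eta squared for a continuous variable described by y:
# between-group sum of squares divided by total sum of squares
group_means <- as.numeric(tapply(budget, y, mean))
group_sizes <- as.numeric(table(y))
ss_between  <- sum(group_sizes * (group_means - mean(budget))^2)
ss_total    <- sum((budget - mean(budget))^2)
eta2        <- ss_between / ss_total
```

For a 2 x 2 table, Cramer's V equals the absolute value of the phi coefficient, and for a two-level y, Eta2 equals the squared point biserial correlation.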