Measures the association between a continuous variable and a set of continuous and/or categorical variables.

condesc(y, x, weights = NULL, 
na.rm.cat = FALSE, na.value.cat = "NAs", na.rm.cont = FALSE,
limit = NULL, correlation = "kendall", robust = TRUE, 
nperm = NULL, distrib = "asympt", digits = 2)

Arguments

y

the continuous variable to describe

x

a data frame with continuous and/or categorical variables

weights

numeric vector of weights. If NULL (default), uniform weights (i.e. all equal to 1) are used.

na.rm.cat

logical, indicating whether NA values in the categorical variables should be silently removed before the computation proceeds. If FALSE (default), an additional level is added to the categorical variables (see na.value.cat argument).

na.value.cat

character. Name of the level for NA category. Default is "NAs". Only used if na.rm.cat = FALSE.

na.rm.cont

logical, indicating whether NA values in the continuous variables should be silently removed before the computation proceeds. Default is FALSE.

limit

numeric. For the relationship between y and a category of a categorical variable, only associations (point-biserial correlations) greater than or equal to limit are displayed. If NULL (default), all of them are displayed.
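
As an illustration of what a point-biserial correlation is, the sketch below computes it in base R as the Pearson correlation between y and a 0/1 indicator of category membership, then applies a threshold. The data, the names y, g, pb and the threshold value are hypothetical, and the sketch is unweighted; it also compares signed values, which is an assumption about how limit is applied rather than a statement about condesc's internals.

```r
# Hypothetical toy data, for illustration only
y <- c(2.1, 3.4, 1.8, 5.6, 4.9, 6.2)
g <- factor(c("a", "a", "b", "c", "c", "c"))

# Point-biserial correlation of y with each category:
# Pearson correlation between y and the 0/1 membership indicator
pb <- sapply(levels(g), function(lv) cor(y, as.numeric(g == lv)))
pb

# Keep only the categories whose correlation reaches the threshold
# (assumption: signed values are compared, not absolute values)
limit <- 0.1
pb[pb >= limit]
```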

correlation

character. The type of correlation measure to use between two continuous variables: "pearson", "spearman" or "kendall" (default).
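
To see how the three measures can differ, the following base R sketch applies stats::cor with each method to a monotone but non-linear relationship. The vectors u and v are hypothetical, and the sketch does not use weights, so it only approximates what condesc would report for weighted data.

```r
# Hypothetical data: v is a monotone, non-linear function of u
u <- c(1, 2, 3, 4, 5, 6)
v <- u^2

# Compare the three correlation measures on the same pair of variables
r <- sapply(c("pearson", "spearman", "kendall"),
            function(m) cor(u, v, method = m))
round(r, 3)
```

The rank-based measures (Spearman, Kendall) reach 1 for any strictly increasing relationship, whereas Pearson's coefficient stays below 1 unless the relationship is linear.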

robust

logical. If TRUE (default), median and mad (median absolute deviation) are used instead of mean and standard deviation.
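
The difference between the two pairs of statistics can be seen on a small base R example with a single outlier. The vector z is hypothetical. Note that stats::mad rescales by the constant 1.4826 by default; whether condesc uses the same constant is an assumption here.

```r
# Hypothetical values with one extreme observation
z <- c(10, 12, 11, 13, 12, 250)

# Non-robust summaries: strongly affected by the outlier
c(mean = mean(z), sd = sd(z))

# Robust summaries: barely affected by the outlier
c(median = median(z), mad = mad(z))
```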

nperm

numeric. Number of permutations for the permutation test of independence. If NULL (default), no permutation test is performed.

distrib

character. The null distribution of the permutation test of independence can be approximated by its asymptotic distribution ("asympt", default) or via Monte Carlo resampling ("approx").

digits

numeric. Number of digits for mean, median, standard deviation and mad. Default is 2.

Value

A list with the following items:

variables

associations between y and the variables in x

categories

a data frame with the categories of the categorical variables from x and their associations with y, measured by point-biserial correlation.
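
For categorical variables, the example output labels the association measure Eta2. Under its usual definition, eta-squared is the share of the variance of y explained by the grouping (between-group sum of squares over total sum of squares). The base R sketch below computes it that way on hypothetical, unweighted data; it assumes condesc follows this standard definition and is not taken from the package source.

```r
# Hypothetical toy data, for illustration only
y <- c(2.1, 3.4, 1.8, 5.6, 4.9, 6.2)
g <- factor(c("a", "a", "b", "c", "c", "c"))

# Replace each observation by the mean of its group
group_means <- ave(y, g)

# Eta-squared: between-group sum of squares / total sum of squares
eta2 <- sum((group_means - mean(y))^2) / sum((y - mean(y))^2)
eta2

# Same quantity as the R-squared of a one-way linear model
summary(lm(y ~ g))$r.squared
```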

Note

If nperm is not NULL, permutation tests of independence are computed and the p-values from these tests are provided.
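
The idea behind a Monte Carlo permutation test of independence can be sketched in base R as follows: the observed association is compared with the associations obtained after randomly shuffling one of the two variables. The data and the names u, v, obs, perm and p_value are hypothetical, and this is a generic sketch, not the procedure condesc actually runs.

```r
set.seed(1)  # for reproducibility of the random permutations

# Hypothetical data with a clear positive association
u <- c(1, 2, 3, 4, 5, 6, 7, 8)
v <- c(1.2, 1.9, 3.4, 3.1, 5.8, 5.5, 7.9, 8.3)

nperm <- 999

# Observed association (absolute Kendall tau, for a two-sided test)
obs <- abs(cor(u, v, method = "kendall"))

# Association under independence: shuffle v and recompute, nperm times
perm <- replicate(nperm, abs(cor(u, sample(v), method = "kendall")))

# Monte Carlo p-value, counting the observed value as one permutation
p_value <- (1 + sum(perm >= obs)) / (nperm + 1)
p_value
```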

References

Rakotomalala R., 'Comprendre la taille d'effet (effect size)', http://eric.univ-lyon2.fr/~ricco/cours/slides/effect_size.pdf

Author

Nicolas Robette

Examples

data(Movies)
condesc(Movies$BoxOffice, Movies[,c("Budget","Genre","Country")])
#> $variables
#>   variable     measure association
#> 1    Genre        Eta2       0.173
#> 2  Country        Eta2       0.048
#> 3   Budget Kendall tau       0.518
#> 
#> $categories
#>           categories median.in.category overall.median mad.in.category
#> 1        Genre.SciFi           680900.0       107326.5        607448.0
#> 2    Genre.Animation           668896.0       107326.5        633265.0
#> 3        Country.USA           328559.0       107326.5        274892.0
#> 4       Genre.Action           240080.0       107326.5        202793.0
#> 5     Country.Europe           108121.5       107326.5        104606.5
#> 6       Genre.Comedy           202090.0       107326.5        191819.0
#> 7        Genre.Other           186084.5       107326.5        171739.5
#> 8      Country.Other            55643.0       107326.5         37068.5
#> 9       Genre.Horror           302635.0       107326.5        171803.0
#> 10 Genre.Documentary             9303.0       107326.5          7890.0
#> 11     Genre.ComDram            67341.0       107326.5         64317.0
#> 12       Genre.Drama            37160.0       107326.5         34368.0
#> 13    Country.France            57140.0       107326.5         54425.0
#>    overall.mad correlation
#> 1       104060       0.294
#> 2       104060       0.211
#> 3       104060       0.192
#> 4       104060       0.087
#> 5       104060       0.068
#> 6       104060       0.004
#> 7       104060      -0.005
#> 8       104060      -0.013
#> 9       104060      -0.015
#> 10      104060      -0.108
#> 11      104060      -0.114
#> 12      104060      -0.162
#> 13      104060      -0.211
#>