Measures the association between a categorical variable and a continuous variable, for each category of a group variable

assoc.catcont.by(x, y, by, weights = NULL,
                 na.rm.cat = FALSE, na.value.cat = "NA", na.rm.cont = FALSE,
                 nperm = NULL, distrib = "asympt", digits = 3)

Arguments

x

factor : the categorical variable

y

numeric vector : the continuous variable

by

factor : the group variable

weights

numeric vector of weights. If NULL (default), uniform weights (i.e. all equal to 1) are used.

na.rm.cat

logical, indicating whether NA values in the categorical variable (i.e. x) should be silently removed before the computation proceeds. If FALSE (default), an additional level is added to the categorical variable (see na.value.cat argument).

na.value.cat

character. Name of the level for NA category. Default is "NA". Only used if na.rm.cat = FALSE.

na.rm.cont

logical, indicating whether NA values in the continuous variable (i.e. y) should be silently removed before the computation proceeds. Default is FALSE.

nperm

numeric. Number of permutations for the permutation test of independence. If NULL (default), no permutation test is performed.

distrib

the null distribution of the permutation test of independence can be approximated by its asymptotic distribution ("asympt", default) or via Monte Carlo resampling ("approx").

digits

integer. The number of digits (default is 3).
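To make the na.rm.cat and na.value.cat arguments more concrete, here is a minimal base R sketch of the two ways missing values of a factor can be handled. The helper na_as_level is hypothetical and not part of the package, whose internal handling may differ.

```r
# Toy factor with two missing values
x <- factor(c("a", "b", NA, "a", NA))

# Hypothetical helper (not part of the package): recode NA as an explicit level,
# analogous in spirit to na.rm.cat = FALSE with na.value.cat = "NA"
na_as_level <- function(x, na.value = "NA") {
  x <- as.character(x)
  x[is.na(x)] <- na.value
  factor(x)
}

x_kept    <- na_as_level(x)   # missing values become their own category
x_dropped <- x[!is.na(x)]     # analogous in spirit to na.rm.cat = TRUE

table(x_kept)
length(x_dropped)
```

In this sketch, dropping the missing values of x would also require dropping the matching observations of y, by and weights so that all vectors stay aligned.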

Value

A list with the following elements. Each element is itself a list with one item per category of the group variable (see the output in Examples):

summary

summary statistics (mean, median, etc.) of the continuous variable for each level of the categorical variable

eta.squared

eta-squared between the two variables

permutation.pvalue

p-value from a permutation (i.e. non-parametric) test of independence

cor

point biserial correlation between the two variables, for each level of the categorical variable

cor.perm.pval

permutation p-value of the correlation between the two variables, for each level of the categorical variable

test.values

test-values as proposed by Lebart et al. (1984)

test.values.pval

p-values corresponding to the test-values
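As a rough illustration of what eta.squared and permutation.pvalue measure, the sketch below computes an unweighted eta-squared and a Monte Carlo permutation p-value within each category of a group variable, using base R only. The data are simulated, the helpers eta_squared and perm_pvalue are hypothetical names, and this is not the package's implementation (weights and the asymptotic distribution in particular are ignored).

```r
set.seed(1)

# Toy data: a 3-level categorical variable, a continuous variable, a 2-level group
x  <- factor(rep(c("a", "b", "c"), times = 40))
by <- factor(rep(c("g1", "g2"), each = 60))
y  <- rnorm(120) + as.integer(x)

# Unweighted eta-squared: between-level sum of squares over total sum of squares
eta_squared <- function(x, y) {
  x <- droplevels(x)
  grand_mean  <- mean(y)
  level_means <- tapply(y, x, mean)
  level_sizes <- tapply(y, x, length)
  sum(level_sizes * (level_means - grand_mean)^2) / sum((y - grand_mean)^2)
}

# Monte Carlo permutation p-value: share of shuffles of y whose eta-squared
# is at least as large as the observed one
perm_pvalue <- function(x, y, nperm = 200) {
  observed <- eta_squared(x, y)
  permuted <- replicate(nperm, eta_squared(x, sample(y)))
  mean(permuted >= observed)
}

# Apply both statistics within each category of the group variable
groups <- split(data.frame(x = x, y = y), by)
eta_by_group  <- sapply(groups, function(d) eta_squared(d$x, d$y))
pval_by_group <- sapply(groups, function(d) perm_pvalue(d$x, d$y))

eta_by_group
pval_by_group
```

In this sketch the p-value can only take multiples of 1/nperm, so a small number of permutations gives a coarse estimate.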

References

Rakotomalala R., 'Comprendre la taille d'effet (effect size)', [http://eric.univ-lyon2.fr/~ricco/cours/slides/effect_size.pdf]

Lebart L., Morineau A. and Warwick K., 1984, *Multivariate Descriptive Statistical Analysis*, John Wiley and Sons, New York.

Author

Nicolas Robette

Examples

data(Movies)
with(Movies, assoc.catcont.by(Country, Budget, ArtHouse, nperm = 10))
#> $summary
#> $summary$No
#>            mean       sd     min       q1   median       q3       max      mad
#> Europe 41671103 45507234 1225500 11846500 21242000 53513500 163400000 13072000
#> France 10424753 15243974   48614  3765522  6786520 12123204 181621894  3881661
#> Other  40713833 36884079 6536000 10212500 24510000 65768500 103759000 17157000
#> USA    45751512 41015270  306375 17974000 32680000 61275000 245100000 16340000
#> 
#> $summary$Yes
#>            mean       sd       min      q1   median       q3      max      mad
#> Europe  5672953  7578134   500.000 1225500  3300000  8170000 40850000  2575000
#> France  3482859  4691170 29500.000  931055  2090000  3999990 50052027  1425910
#> Other   6218257 13796434 40850.000 1260405  2532700  4493500 65360000  1429750
#> USA    18295864 16213239     0.817 6127500 11846500 30433250 57190000 10702700
#> 
#> 
#> $eta.squared
#> $eta.squared$No
#> [1] 0.208907
#> 
#> $eta.squared$Yes
#> [1] 0.2438707
#> 
#> 
#> $permutation.pvalue
#> $permutation.pvalue$No
#> [1] 0
#> 
#> $permutation.pvalue$Yes
#> [1] 0
#> 
#> 
#> $cor
#> $cor$No
#> Europe France  Other    USA 
#>  0.083 -0.456  0.029  0.399 
#> 
#> $cor$Yes
#> Europe France  Other    USA 
#>  0.023 -0.370  0.032  0.485 
#> 
#> 
#> $cor.perm.pval
#> $cor.perm.pval$No
#>       Europe       France        Other          USA 
#> 8.409384e-05 6.556805e-19 8.178375e-02 0.000000e+00 
#> 
#> $cor.perm.pval$Yes
#>       Europe       France        Other          USA 
#> 2.092070e-01 3.530154e-23 1.717604e-01 0.000000e+00 
#> 
#> 
#> $test.values
#> $test.values$No
#>      Europe      France       Other         USA 
#>   1.8813621 -10.3287728   0.6506543   9.0330668 
#> 
#> $test.values$Yes
#>     Europe     France      Other        USA 
#>  0.5127832 -8.1477652  0.6960400 10.6896674 
#> 
#> 
#> $test.values.pval
#> $test.values.pval$No
#>     Europe     France      Other        USA 
#> 0.05992267 0.00000000 0.51526969 0.00000000 
#> 
#> $test.values.pval$Yes
#>       Europe       France        Other          USA 
#> 6.081030e-01 4.440892e-16 4.864038e-01 0.000000e+00 
#> 
#>
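
Judging from the output above, the result is indexed first by statistic and then by category of the group variable. The following sketch shows how individual pieces might be extracted. It assumes that the function and the Movies data come from the GDAtools package and that the package is installed; otherwise the block does nothing.

```r
# Assumption: assoc.catcont.by() and Movies are provided by GDAtools
has_gdatools <- requireNamespace("GDAtools", quietly = TRUE)

if (has_gdatools) {
  library(GDAtools)
  data(Movies)
  res <- with(Movies, assoc.catcont.by(Country, Budget, ArtHouse, nperm = 10))

  print(res$eta.squared$Yes)      # eta-squared within the ArtHouse == "Yes" group
  print(unlist(res$eta.squared))  # one value per category of the group variable
  print(res$cor$No["France"])     # correlation for one level within one group
}
```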