Cross-tabulation and measures of association between two categorical variables, for each category of a group variable

assoc.twocat.by(x, y, by, weights = NULL, na.rm = FALSE, na.value = "NAs",
                nperm = NULL, distrib = "asympt")

Arguments

x

factor : the first categorical variable

y

factor : the second categorical variable

by

factor : the group variable

weights

numeric vector of weights. If NULL (default), uniform weights (i.e. all equal to 1) are used.

na.rm

logical, indicating whether NA values should be silently removed before the computation proceeds. If FALSE (default), an additional level is added to the variables (see na.value argument).

na.value

character. Name of the level for NA category. Default is "NAs". Only used if na.rm = FALSE.

nperm

numeric. Number of permutations for the permutation test of independence. If NULL (default), no permutation test is performed.

distrib

character. Null distribution used for the permutation test of independence: either its asymptotic approximation ("asympt", default) or a Monte Carlo resampling approximation ("approx").
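To make the nperm and distrib arguments more concrete, here is a minimal base-R sketch of a Monte Carlo permutation test of independence based on the chi-squared statistic. It is an illustration of the general technique, not the package's own implementation. The counts are copied from the Festival = "Yes" table in the Examples output; the helper chi2() and the seed are assumptions made for this sketch.

```r
# Monte Carlo permutation test of independence (illustrative sketch only,
# not the code used internally by assoc.twocat.by).
set.seed(42)  # arbitrary seed, only for reproducibility

# Counts for Festival = "Yes", copied from the Examples output
freq <- matrix(c(1, 6, 0, 49, 0, 2, 8, 11), ncol = 2, byrow = TRUE,
               dimnames = list(c("Europe", "France", "Other", "USA"),
                               c("No", "Yes")))

# Expand the table into one observation per unit
cells <- as.data.frame(as.table(freq))   # columns Var1, Var2, Freq
idx <- rep(seq_len(nrow(cells)), times = cells$Freq)
x <- cells$Var1[idx]
y <- cells$Var2[idx]

# Hypothetical helper: Pearson chi-squared statistic of two factors
chi2 <- function(x, y) {
  obs <- table(x, y)
  expd <- outer(rowSums(obs), colSums(obs)) / sum(obs)
  sum((obs - expd)^2 / expd)
}

nperm <- 999
observed <- chi2(x, y)
# Shuffling y breaks any association while keeping both margins fixed
permuted <- replicate(nperm, chi2(x, sample(y)))
p_value <- mean(permuted >= observed)
```

The p-value is the share of permuted statistics at least as large as the observed one. Some authors prefer (1 + count) / (1 + nperm), which never returns exactly 0.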

Value

A list with the elements tables, global, local and gather, described below. Each terminal item is itself a list with one entry per category of the group variable (e.g. tables$freq$No in the Examples).

tables list :

freq

cross-tabulation frequencies

prop

percentages

rprop

row percentages

cprop

column percentages

expected

expected values
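The items of the tables list can be reproduced with base R for a single category of the group variable. The sketch below uses the Festival = "No" counts copied from the Examples output. It is meant as an illustration of what each table contains, not as the package's internal code.

```r
# Counts for Festival = "No", copied from the Examples output
freq <- matrix(c(38, 27, 212, 344, 6, 18, 249, 29), ncol = 2, byrow = TRUE,
               dimnames = list(c("Europe", "France", "Other", "USA"),
                               c("No", "Yes")))

prop  <- 100 * prop.table(freq)              # percentages of the grand total
rprop <- 100 * prop.table(freq, margin = 1)  # row percentages
cprop <- 100 * prop.table(freq, margin = 2)  # column percentages

# Expected counts under independence: row total * column total / n
expected <- outer(rowSums(freq), colSums(freq)) / sum(freq)

round(expected, 5)
```

Unlike the printed output, this sketch does not append the "Sum" margins; addmargins() can be used for that on the frequency table.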

global list :

chi.squared

chi-squared value

cramer.v

Cramer's V between the two variables

permutation.pvalue

p-value from a permutation (i.e. non-parametric) test of independence

global.pem

global PEM

GK.tau.xy

Goodman and Kruskal tau (forward association, i.e. x is the predictor and y is the response)

GK.tau.yx

Goodman and Kruskal tau (backward association, i.e. y is the predictor and x is the response)
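Several of the global measures can be checked by hand. The sketch below recomputes the chi-squared statistic, Cramer's V and both Goodman and Kruskal taus for the Festival = "No" table from the Examples output, using the standard textbook formulas. The helper gk.tau() is a hypothetical name introduced for this illustration and is not part of the package.

```r
# Counts for Festival = "No", copied from the Examples output
freq <- matrix(c(38, 27, 212, 344, 6, 18, 249, 29), ncol = 2, byrow = TRUE,
               dimnames = list(c("Europe", "France", "Other", "USA"),
                               c("No", "Yes")))
n <- sum(freq)
expected <- outer(rowSums(freq), colSums(freq)) / n

# Pearson chi-squared and Cramer's V
chi.squared <- sum((freq - expected)^2 / expected)
cramer.v <- sqrt(chi.squared / (n * (min(dim(freq)) - 1)))

# Goodman and Kruskal tau: proportional reduction in prediction error.
# pred.margin = 1: rows (x) predict columns (y); pred.margin = 2: the reverse.
gk.tau <- function(tab, pred.margin) {
  p <- tab / sum(tab)
  p.pred <- apply(p, pred.margin, sum)      # margin of the predictor
  p.resp <- apply(p, 3 - pred.margin, sum)  # margin of the response
  explained <- sum(sweep(p^2, pred.margin, p.pred, "/"))
  (explained - sum(p.resp^2)) / (1 - sum(p.resp^2))
}

GK.tau.xy <- gk.tau(freq, pred.margin = 1)
GK.tau.yx <- gk.tau(freq, pred.margin = 2)

round(c(chi.squared = chi.squared, cramer.v = cramer.v,
        GK.tau.xy = GK.tau.xy, GK.tau.yx = GK.tau.yx), 4)
```

Because y has only two categories here, GK.tau.xy coincides with chi.squared / n, which is why it equals the square of Cramer's V in this example.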

local list :

std.residuals

the table of standardized (i.e. Pearson) residuals.

adj.residuals

the table of adjusted standardized residuals.

adj.res.pval

the table of p-values of adjusted standardized residuals.

odds.ratios

the table of odds ratios.

local.pem

the table of local PEM

phi

the table of the phi coefficients for each pair of levels

phi.perm.pval

the table of permutation p-values for each pair of levels
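The residual-based items of the local list follow directly from the observed and expected counts. The sketch below recomputes them for the Festival = "No" table from the Examples output with the usual formulas (see Agresti, 2007). It is an illustration, not the package's own code; in particular, phi is obtained here through the identity phi = adjusted residual / sqrt(n), which holds for the 2x2 table opposing one level of x and one level of y to all the others.

```r
# Counts for Festival = "No", copied from the Examples output
freq <- matrix(c(38, 27, 212, 344, 6, 18, 249, 29), ncol = 2, byrow = TRUE,
               dimnames = list(c("Europe", "France", "Other", "USA"),
                               c("No", "Yes")))
n <- sum(freq)
row.p <- rowSums(freq) / n
col.p <- colSums(freq) / n
expected <- n * outer(row.p, col.p)

# Pearson residuals: (observed - expected) / sqrt(expected)
std.residuals <- (freq - expected) / sqrt(expected)

# Adjusted residuals: Pearson residuals rescaled to unit variance
adj.residuals <- std.residuals / sqrt(outer(1 - row.p, 1 - col.p))

# Two-sided p-values from the standard normal distribution
adj.res.pval <- 2 * pnorm(-abs(adj.residuals))

# Phi coefficient of each (level of x, level of y) pair
phi <- adj.residuals / sqrt(n)

round(adj.residuals, 4)
```

As a cross-check, chisq.test(freq)$residuals and chisq.test(freq)$stdres should return the same Pearson and adjusted residuals.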

gather : a data frame gathering information, with one row per cell of the cross-tabulation.
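The long format of gather, and two of its less familiar columns, can be sketched in base R. The example below rebuilds a subset of the columns for the Festival = "No" table from the Examples output. The odds ratio is taken on the 2x2 table opposing one level of x and one level of y to all the others, and the local PEM is computed as the percentage of the maximum possible deviation from independence given the margins. Both formulas are assumptions about the definitions used, checked only against the printed output.

```r
# Counts for Festival = "No", copied from the Examples output
freq <- matrix(c(38, 27, 212, 344, 6, 18, 249, 29), ncol = 2, byrow = TRUE,
               dimnames = list(c("Europe", "France", "Other", "USA"),
                               c("No", "Yes")))
n <- sum(freq)

# One row per cell of the cross-tabulation
gather <- as.data.frame(as.table(freq), responseName = "freq")
names(gather)[1:2] <- c("var.x", "var.y")
gather$freq.x <- unname(rowSums(freq)[as.character(gather$var.x)])
gather$freq.y <- unname(colSums(freq)[as.character(gather$var.y)])
gather$expected <- gather$freq.x * gather$freq.y / n

# Odds ratio of the collapsed 2x2 table (this level vs. all others)
n11 <- gather$freq
n12 <- gather$freq.x - n11
n21 <- gather$freq.y - n11
n22 <- n - n11 - n12 - n21
gather$or <- (n11 * n22) / (n12 * n21)

# Local PEM: deviation from independence as a percentage of its maximum
dev <- gather$freq - gather$expected
upper <- pmin(gather$freq.x, gather$freq.y) - gather$expected
lower <- gather$expected - pmax(0, gather$freq.x + gather$freq.y - n)
gather$pem <- 100 * ifelse(dev >= 0, dev / upper, dev / lower)

gather
```

Cells with a zero count in the collapsed table give odds ratios of 0 or Inf, as in the Festival = "Yes" part of the Examples output.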

Note

The adjusted standardized residuals are strictly equivalent to test-values for nominal variables as proposed by Lebart et al. (1984).

References

Agresti, A. (2007). An Introduction to Categorical Data Analysis, 2nd ed. New York: John Wiley & Sons.

Rakotomalala R., Comprendre la taille d'effet (effect size), http://eric.univ-lyon2.fr/~ricco/cours/slides/effect_size.pdf

Lebart, L., Morineau, A. and Warwick, K. (1984). Multivariate Descriptive Statistical Analysis. New York: John Wiley & Sons.

Author

Nicolas Robette

Examples

data(Movies)
assoc.twocat.by(Movies$Country, Movies$ArtHouse, Movies$Festival, nperm=100)
#> $tables
#> $tables$freq
#> $tables$freq$No
#>         No Yes Sum
#> Europe  38  27  65
#> France 212 344 556
#> Other    6  18  24
#> USA    249  29 278
#> Sum    505 418 923
#> 
#> $tables$freq$Yes
#>        No Yes Sum
#> Europe  1   6   7
#> France  0  49  49
#> Other   0   2   2
#> USA     8  11  19
#> Sum     9  68  77
#> 
#> 
#> $tables$prop
#> $tables$prop$No
#>                 No         Yes         Sum
#> Europe   4.1170098   2.9252438   7.0422535
#> France  22.9685807  37.2697725  60.2383532
#> Other    0.6500542   1.9501625   2.6002167
#> USA     26.9772481   3.1419285  30.1191766
#> Sum     54.7128927  45.2871073 100.0000000
#> 
#> $tables$prop$Yes
#>                No        Yes        Sum
#> Europe   1.298701   7.792208   9.090909
#> France   0.000000  63.636364  63.636364
#> Other    0.000000   2.597403   2.597403
#> USA     10.389610  14.285714  24.675325
#> Sum     11.688312  88.311688 100.000000
#> 
#> 
#> $tables$rprop
#> $tables$rprop$No
#>              No      Yes Sum
#> Europe 58.46154 41.53846 100
#> France 38.12950 61.87050 100
#> Other  25.00000 75.00000 100
#> USA    89.56835 10.43165 100
#> Sum    54.71289 45.28711 100
#> 
#> $tables$rprop$Yes
#>              No       Yes Sum
#> Europe 14.28571  85.71429 100
#> France  0.00000 100.00000 100
#> Other   0.00000 100.00000 100
#> USA    42.10526  57.89474 100
#> Sum    11.68831  88.31169 100
#> 
#> 
#> $tables$cprop
#> $tables$cprop$No
#>                No        Yes        Sum
#> Europe   7.524752   6.459330   7.042254
#> France  41.980198  82.296651  60.238353
#> Other    1.188119   4.306220   2.600217
#> USA     49.306931   6.937799  30.119177
#> Sum    100.000000 100.000000 100.000000
#> 
#> $tables$cprop$Yes
#>               No        Yes        Sum
#> Europe  11.11111   8.823529   9.090909
#> France   0.00000  72.058824  63.636364
#> Other    0.00000   2.941176   2.597403
#> USA     88.88889  16.176471  24.675325
#> Sum    100.00000 100.000000 100.000000
#> 
#> 
#> $tables$expected
#> $tables$expected$No
#>               No       Yes
#> Europe  35.56338  29.43662
#> France 304.20368 251.79632
#> Other   13.13109  10.86891
#> USA    152.10184 125.89816
#> 
#> $tables$expected$Yes
#>                No        Yes
#> Europe  0.8181818  6.1818182
#> France  5.7272727 43.2727273
#> Other   0.2337662  1.7662338
#> USA     2.2207792 16.7792208
#> 
#> 
#> 
#> $global
#> $global$chi.squared
#> $global$chi.squared$No
#> [1] 206.9385
#> 
#> $global$chi.squared$Yes
#> [1] 23.82577
#> 
#> 
#> $global$cramer.v
#> $global$cramer.v$No
#> [1] 0.4734998
#> 
#> $global$cramer.v$Yes
#> [1] 0.5562603
#> 
#> 
#> $global$permutation.pvalue
#> $global$permutation.pvalue$No
#> [1] 0
#> 
#> $global$permutation.pvalue$Yes
#> [1] 0
#> 
#> 
#> $global$global.pem
#> $global$global.pem$No
#> [1] 60.11803
#> 
#> $global$global.pem$Yes
#> [1] 96.42857
#> 
#> 
#> $global$GK.tau.xy
#> $global$GK.tau.xy$No
#> [1] 0.2242021
#> 
#> $global$GK.tau.xy$Yes
#> [1] 0.3094255
#> 
#> 
#> $global$GK.tau.yx
#> $global$GK.tau.yx$No
#> [1] 0.1572228
#> 
#> $global$GK.tau.yx$Yes
#> [1] 0.2062297
#> 
#> 
#> 
#> $local
#> $local$std.residuals
#> $local$std.residuals$No
#>                No        Yes
#> Europe  0.4085886 -0.4491008
#> France -5.2864732  5.8106349
#> Other  -1.9679122  2.1630336
#> USA     7.8568468 -8.6358648
#> 
#> $local$std.residuals$Yes
#>                 No         Yes
#> Europe  0.20100756 -0.07312724
#> France -2.39317211  0.87064424
#> Other  -0.48349378  0.17589670
#> USA     3.87807848 -1.41085828
#> 
#> 
#> $local$adj.residuals
#> $local$adj.residuals$No
#>                 No         Yes
#> Europe   0.6297326  -0.6297326
#> France -12.4579495  12.4579495
#> Other   -2.9630531   2.9630531
#> USA     13.9663200 -13.9663200
#> 
#> $local$adj.residuals$Yes
#>                No        Yes
#> Europe  0.2243363 -0.2243363
#> France -4.2230982  4.2230982
#> Other  -0.5213106  0.5213106
#> USA     4.7548724 -4.7548724
#> 
#> 
#> $local$adj.res.pval
#> $local$adj.res.pval$No
#>                No        Yes
#> Europe 0.52886957 0.52886957
#> France 0.00000000 0.00000000
#> Other  0.00304604 0.00304604
#> USA    0.00000000 0.00000000
#> 
#> $local$adj.res.pval$Yes
#>                  No          Yes
#> Europe 8.224956e-01 8.224956e-01
#> France 2.409667e-05 2.409667e-05
#> Other  6.021504e-01 6.021504e-01
#> USA    1.985718e-06 1.985718e-06
#> 
#> 
#> $local$odss.ratios
#> $local$odss.ratios$No
#> NULL
#> 
#> $local$odss.ratios$Yes
#> NULL
#> 
#> 
#> $local$local.pem
#> $local$local.pem$No
#>         y
#> x                No        Yes
#>   Europe   8.277512  -8.277512
#>   France -55.476318  55.476318
#>   Other  -54.306931  54.306931
#>   USA     76.965509 -76.965509
#> 
#> $local$local.pem$Yes
#>         y
#> x                 No         Yes
#>   Europe    2.941176   -2.941176
#>   France -100.000000  100.000000
#>   Other  -100.000000  100.000000
#>   USA      85.249042  -85.249042
#> 
#> 
#> $local$phi
#> $local$phi$No
#>                 No         Yes
#> Europe  0.02072790 -0.02072790
#> France -0.41005840  0.41005840
#> Other  -0.09753008  0.09753008
#> USA     0.45970702 -0.45970702
#> 
#> $local$phi$Yes
#>                 No         Yes
#> Europe  0.02556550 -0.02556550
#> France -0.48126671  0.48126671
#> Other  -0.05940885  0.05940885
#> USA     0.54186800 -0.54186800
#> 
#> 
#> $local$phi.perm.pval
#> $local$phi.perm.pval$No
#>                  No          Yes
#> Europe 2.324112e-01 2.324112e-01
#> France 1.822030e-41 0.000000e+00
#> Other  4.655131e-04 4.655131e-04
#> USA    0.000000e+00 4.078985e-52
#> 
#> $local$phi.perm.pval$Yes
#>                  No          Yes
#> Europe 3.672479e-01 3.672479e-01
#> France 1.562915e-05 1.562915e-05
#> Other  3.300849e-01 3.300849e-01
#> USA    4.481969e-06 4.481969e-06
#> 
#> 
#> 
#> $gather
#> $gather$No
#>   var.y  var.x freq        prop     rprop      cprop  expected std.residuals
#> 1    No Europe   38 0.041170098 0.5846154 0.07524752  35.56338     0.4085886
#> 2    No France  212 0.229685807 0.3812950 0.41980198 304.20368    -5.2864732
#> 3    No  Other    6 0.006500542 0.2500000 0.01188119  13.13109    -1.9679122
#> 4    No    USA  249 0.269772481 0.8956835 0.49306931 152.10184     7.8568468
#> 5   Yes Europe   27 0.029252438 0.4153846 0.06459330  29.43662    -0.4491008
#> 6   Yes France  344 0.372697725 0.6187050 0.82296651 251.79632     5.8106349
#> 7   Yes  Other   18 0.019501625 0.7500000 0.04306220  10.86891     2.1630336
#> 8   Yes    USA   29 0.031419285 0.1043165 0.06937799 125.89816    -8.6358648
#>   adj.residuals          or        pem         phi    perm.pval freq.x freq.y
#> 1     0.6297326  1.17836466   8.277512  0.02072790 2.324112e-01     65    505
#> 2   -12.4579495  0.15564727 -55.476318 -0.41005840 1.822030e-41    556    505
#> 3    -2.9630531  0.26720107 -54.306931 -0.09753008 4.655131e-04     24    505
#> 4    13.9663200 13.04700970  76.965509  0.45970702 0.000000e+00    278    505
#> 5    -0.6297326  0.84863373  -8.277512 -0.02072790 2.324112e-01     65    418
#> 6    12.4579495  6.42478327  55.476318  0.41005840 0.000000e+00    556    418
#> 7     2.9630531  3.74250000  54.306931  0.09753008 4.655131e-04     24    418
#> 8   -13.9663200  0.07664592 -76.965509 -0.45970702 4.078985e-52    278    418
#>       prop.x    prop.y
#> 1 0.07042254 0.5471289
#> 2 0.60238353 0.5471289
#> 3 0.02600217 0.5471289
#> 4 0.30119177 0.5471289
#> 5 0.07042254 0.4528711
#> 6 0.60238353 0.4528711
#> 7 0.02600217 0.4528711
#> 8 0.30119177 0.4528711
#> 
#> $gather$Yes
#>   var.y  var.x freq       prop     rprop      cprop   expected std.residuals
#> 1    No Europe    1 0.01298701 0.1428571 0.11111111  0.8181818    0.20100756
#> 2    No France    0 0.00000000 0.0000000 0.00000000  5.7272727   -2.39317211
#> 3    No  Other    0 0.00000000 0.0000000 0.00000000  0.2337662   -0.48349378
#> 4    No    USA    8 0.10389610 0.4210526 0.88888889  2.2207792    3.87807848
#> 5   Yes Europe    6 0.07792208 0.8571429 0.08823529  6.1818182   -0.07312724
#> 6   Yes France   49 0.63636364 1.0000000 0.72058824 43.2727273    0.87064424
#> 7   Yes  Other    2 0.02597403 1.0000000 0.02941176  1.7662338    0.17589670
#> 8   Yes    USA   11 0.14285714 0.5789474 0.16176471 16.7792208   -1.41085828
#>   adj.residuals          or         pem         phi    perm.pval freq.x freq.y
#> 1     0.2243363  1.29166667    2.941176  0.02556550 3.672479e-01      7      9
#> 2    -4.2230982  0.00000000 -100.000000 -0.48126671 1.562915e-05     49      9
#> 3    -0.5213106  0.00000000 -100.000000 -0.05940885 3.300849e-01      2      9
#> 4     4.7548724 41.45454545   85.249042  0.54186800 4.481969e-06     19      9
#> 5    -0.2243363  0.77419355   -2.941176 -0.02556550 3.672479e-01      7     68
#> 6     4.2230982         Inf  100.000000  0.48126671 1.562915e-05     49     68
#> 7     0.5213106         Inf  100.000000  0.05940885 3.300849e-01      2     68
#> 8    -4.7548724  0.02412281  -85.249042 -0.54186800 4.481969e-06     19     68
#>       prop.x    prop.y
#> 1 0.09090909 0.1168831
#> 2 0.63636364 0.1168831
#> 3 0.02597403 0.1168831
#> 4 0.24675325 0.1168831
#> 5 0.09090909 0.8831169
#> 6 0.63636364 0.8831169
#> 7 0.02597403 0.8831169
#> 8 0.24675325 0.8831169
#> 
#>