assoc.twocat.Rd
Cross-tabulation and measures of association between two categorical variables
assoc.twocat(x, y, weights = NULL, na.rm = FALSE, na.value = "NAs",
nperm = NULL, distrib = "asympt")
x: the first categorical variable (must be a factor)
y: the second categorical variable (must be a factor)
weights: numeric vector of weights. If NULL (default), uniform weights (i.e. all equal to 1) are used.
na.rm: logical, indicating whether NA values should be silently removed before the computation proceeds. If FALSE (default), an additional level is added to the variables (see na.value argument).
na.value: character. Name of the level for the NA category. Default is "NAs". Only used if na.rm = FALSE.
nperm: numeric. Number of permutations for the permutation test of independence. If NULL (default), no permutation test is performed.
distrib: the null distribution of the permutation test of independence can be approximated by its asymptotic distribution ("asympt", default) or via Monte Carlo resampling ("approx").
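To make the idea behind nperm and the Monte Carlo option concrete, here is a minimal base R sketch of a permutation test of independence. It is an illustration only, not the code used by assoc.twocat: the helper chisq_stat, the seed and the number of permutations are assumptions, and the data are rebuilt from the cell counts printed in the example output.

```r
# Rebuild one row per observation from the cell counts of the example output
counts <- matrix(c(39, 212, 6, 257, 33, 393, 20, 40), nrow = 4,
                 dimnames = list(Country = c("Europe", "France", "Other", "USA"),
                                 ArtHouse = c("No", "Yes")))
cells <- as.data.frame(as.table(counts))
x <- rep(cells$Country, times = cells$Freq)
y <- rep(cells$ArtHouse, times = cells$Freq)

# Hypothetical helper: Pearson chi-squared statistic of the x-by-y table
chisq_stat <- function(x, y) {
  tab <- table(x, y)
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  sum((tab - expected)^2 / expected)
}

observed <- chisq_stat(x, y)

# Shuffling y breaks any link with x while keeping both margins fixed
set.seed(1)
n_perm <- 200
perm_stats <- replicate(n_perm, chisq_stat(x, sample(y)))

# Share of permuted statistics at least as large as the observed one
# (the +1 terms keep the estimate strictly above zero)
p_value <- (1 + sum(perm_stats >= observed)) / (1 + n_perm)
```

With few permutations the smallest attainable p-value is bounded by 1 / (n_perm + 1), which is one reason to raise nperm when a finer p-value is needed.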
A list of lists with the following elements:
tables: a list with
  freq: cross-tabulation frequencies
  prop: percentages
  rprop: row percentages
  cprop: column percentages
  expected: expected values
global: a list with
  chi.squared: chi-squared value
  cramer.v: Cramer's V between the two variables
  permutation.pvalue: p-value from a permutation (i.e. non-parametric) test of independence
  global.pem: global PEM
  GK.tau.xy: Goodman and Kruskal tau (forward association, i.e. x is the predictor and y is the response)
  GK.tau.yx: Goodman and Kruskal tau (backward association, i.e. y is the predictor and x is the response)
local: a list with
  std.residuals: the table of standardized (i.e. Pearson) residuals
  adj.residuals: the table of adjusted standardized residuals
  adj.res.pval: the table of p-values of adjusted standardized residuals
  odds.ratios: the table of odds ratios
  local.pem: the table of local PEM
  phi: the table of the phi coefficients for each pair of levels
  phi.perm.pval: the table of permutation p-values for each pair of levels
gather: a data frame gathering information, with one row per cell of the cross-tabulation.
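As a reading aid for the global measures, the sketch below recomputes the chi-squared value, Cramer's V and both Goodman and Kruskal taus in base R from the cell counts printed in the example output. The formulas are the usual textbook definitions and the helper gk_tau is a hypothetical name; this is not necessarily how the function computes them internally.

```r
# Cell counts taken from the example output (rows: x, columns: y)
counts <- matrix(c(39, 212, 6, 257, 33, 393, 20, 40), nrow = 4,
                 dimnames = list(Country = c("Europe", "France", "Other", "USA"),
                                 ArtHouse = c("No", "Yes")))
n <- sum(counts)
expected <- outer(rowSums(counts), colSums(counts)) / n

# Pearson chi-squared statistic
chi_squared <- sum((counts - expected)^2 / expected)

# Cramer's V: chi-squared rescaled by sample size and the smaller table dimension
cramer_v <- sqrt(chi_squared / (n * (min(dim(counts)) - 1)))

# Hypothetical helper: Goodman and Kruskal tau, rows predicting columns
gk_tau <- function(tab) {
  p <- tab / sum(tab)
  row_m <- rowSums(p)
  col_m <- colSums(p)
  # p^2 / row_m divides each cell by its own row margin (column-major recycling)
  (sum(p^2 / row_m) - sum(col_m^2)) / (1 - sum(col_m^2))
}

tau_xy <- gk_tau(counts)     # forward: x predicts y
tau_yx <- gk_tau(t(counts))  # backward: y predicts x
```

The results can be compared with the chi.squared, cramer.v, GK.tau.xy and GK.tau.yx values shown in the example output.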
The adjusted standardized residuals are strictly equivalent to test-values for nominal variables as proposed by Lebart et al. (1984).
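For readers who want to see what these residuals are, the following base R sketch computes Pearson and adjusted standardized residuals from the cell counts printed in the example output, using the standard definitions (see Agresti, 2007). The two-sided normal p-value is an assumption about how adj.res.pval is obtained; it reproduces the printed figures but is not taken from the function's source.

```r
# Cell counts taken from the example output
counts <- matrix(c(39, 212, 6, 257, 33, 393, 20, 40), nrow = 4,
                 dimnames = list(Country = c("Europe", "France", "Other", "USA"),
                                 ArtHouse = c("No", "Yes")))
n <- sum(counts)
row_prop <- rowSums(counts) / n
col_prop <- colSums(counts) / n
expected <- n * outer(row_prop, col_prop)

# Pearson (standardized) residuals: deviation scaled by sqrt(expected)
std_residuals <- (counts - expected) / sqrt(expected)

# Adjusted standardized residuals: also corrected for the row and column margins
adj_residuals <- (counts - expected) /
  sqrt(expected * outer(1 - row_prop, 1 - col_prop))

# Two-sided p-values under a standard normal reference (assumed here)
adj_pval <- 2 * pnorm(-abs(adj_residuals))
```

In base R, chisq.test(counts)$residuals and chisq.test(counts)$stdres return the same two tables, which gives an independent cross-check.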
Agresti, A. (2007). An Introduction to Categorical Data Analysis, 2nd ed. New York: John Wiley & Sons.
Rakotomalala, R. Comprendre la taille d'effet (effect size). http://eric.univ-lyon2.fr/~ricco/cours/slides/effect_size.pdf
Lebart, L., Morineau, A. and Warwick, K. (1984). Multivariate Descriptive Statistical Analysis. New York: John Wiley & Sons.
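library(GDAtools)  # package providing assoc.twocat() and the Movies data set
data(Movies)
# nperm = 100 requests a permutation test of independence with 100 permutations
assoc.twocat(Movies$Country, Movies$ArtHouse, nperm = 100)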
#> $tables
#> $tables$freq
#> No Yes Sum
#> Europe 39 33 72
#> France 212 393 605
#> Other 6 20 26
#> USA 257 40 297
#> Sum 514 486 1000
#>
#> $tables$prop
#> No Yes Sum
#> Europe 3.9 3.3 7.2
#> France 21.2 39.3 60.5
#> Other 0.6 2.0 2.6
#> USA 25.7 4.0 29.7
#> Sum 51.4 48.6 100.0
#>
#> $tables$rprop
#> No Yes Sum
#> Europe 54.16667 45.83333 100
#> France 35.04132 64.95868 100
#> Other 23.07692 76.92308 100
#> USA 86.53199 13.46801 100
#> Sum 51.40000 48.60000 100
#>
#> $tables$cprop
#> No Yes Sum
#> Europe 7.587549 6.790123 7.2
#> France 41.245136 80.864198 60.5
#> Other 1.167315 4.115226 2.6
#> USA 50.000000 8.230453 29.7
#> Sum 100.000000 100.000000 100.0
#>
#> $tables$expected
#> No Yes
#> Europe 37.008 34.992
#> France 310.970 294.030
#> Other 13.364 12.636
#> USA 152.658 144.342
#>
#>
#> $global
#> $global$chi.squared
#> [1] 220.1263
#>
#> $global$cramer.v
#> [1] 0.4691762
#>
#> $global$permutation.pvalue
#> [1] 0
#>
#> $global$global.pem
#> [1] 64.04814
#>
#> $global$GK.tau.xy
#> [1] 0.2201263
#>
#> $global$GK.tau.yx
#> [1] 0.1537807
#>
#>
#> $local
#> $local$std.residuals
#> No Yes
#> Europe 0.3274474 -0.3367479
#> France -5.6123445 5.7717531
#> Other -2.0143992 2.0716146
#> USA 8.4449945 -8.6848595
#>
#> $local$adj.residuals
#> No Yes
#> Europe 0.487584 -0.487584
#> France -12.809366 12.809366
#> Other -2.927844 2.927844
#> USA 14.447862 -14.447862
#>
#> $local$adj.res.pval
#> No Yes
#> Europe 0.625844564 0.625844564
#> France 0.000000000 0.000000000
#> Other 0.003413213 0.003413213
#> USA 0.000000000 0.000000000
#>
#> $local$odds.ratios
#> No Yes
#> Europe 1.1270813 0.8872474
#> France 0.1661190 6.0197809
#> Other 0.2751969 3.6337625
#> USA 11.1500000 0.0896861
#>
#> $local$local.pem
#> y
#> x No Yes
#> Europe 5.69273 -5.69273
#> France -51.55493 51.55493
#> Other -55.10326 55.10326
#> USA 72.28804 -72.28804
#>
#> $local$phi
#> No Yes
#> Europe 0.01541876 -0.01541876
#> France -0.40506773 0.40506773
#> Other -0.09258656 0.09258656
#> USA 0.45688150 -0.45688150
#>
#> $local$phi.perm.pval
#> No Yes
#> Europe 3.298231e-01 3.298231e-01
#> France 2.163444e-41 0.000000e+00
#> Other 4.122879e-04 4.122879e-04
#> USA 0.000000e+00 6.777456e-50
#>
#>
#> $gather
#> var.y var.x freq prop rprop cprop expected std.residuals
#> 1 No Europe 39 0.039 0.5416667 0.07587549 37.008 0.3274474
#> 2 No France 212 0.212 0.3504132 0.41245136 310.970 -5.6123445
#> 3 No Other 6 0.006 0.2307692 0.01167315 13.364 -2.0143992
#> 4 No USA 257 0.257 0.8653199 0.50000000 152.658 8.4449945
#> 5 Yes Europe 33 0.033 0.4583333 0.06790123 34.992 -0.3367479
#> 6 Yes France 393 0.393 0.6495868 0.80864198 294.030 5.7717531
#> 7 Yes Other 20 0.020 0.7692308 0.04115226 12.636 2.0716146
#> 8 Yes USA 40 0.040 0.1346801 0.08230453 144.342 -8.6848595
#> adj.residuals or pem phi perm.pval freq.x freq.y
#> 1 0.487584 1.1270813 5.69273 0.01541876 3.298231e-01 72 514
#> 2 -12.809366 0.1661190 -51.55493 -0.40506773 2.163444e-41 605 514
#> 3 -2.927844 0.2751969 -55.10326 -0.09258656 4.122879e-04 26 514
#> 4 14.447862 11.1500000 72.28804 0.45688150 0.000000e+00 297 514
#> 5 -0.487584 0.8872474 -5.69273 -0.01541876 3.298231e-01 72 486
#> 6 12.809366 6.0197809 51.55493 0.40506773 0.000000e+00 605 486
#> 7 2.927844 3.6337625 55.10326 0.09258656 4.122879e-04 26 486
#> 8 -14.447862 0.0896861 -72.28804 -0.45688150 6.777456e-50 297 486
#> prop.x prop.y
#> 1 0.072 0.514
#> 2 0.605 0.514
#> 3 0.026 0.514
#> 4 0.297 0.514
#> 5 0.072 0.486
#> 6 0.605 0.486
#> 7 0.026 0.486
#> 8 0.297 0.486
#>
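The local measures in the output above can be read as statistics of the 2 x 2 table obtained by isolating one cell (one level of x, one level of y) against everything else. The base R sketch below follows that reading for the odds ratios, the phi coefficients and the local PEM. The formulas reproduce the printed values, but they are an assumption about the definitions and are not taken from the function's source; all variable names are illustrative.

```r
# Cell counts taken from the example output
counts <- matrix(c(39, 212, 6, 257, 33, 393, 20, 40), nrow = 4,
                 dimnames = list(Country = c("Europe", "France", "Other", "USA"),
                                 ArtHouse = c("No", "Yes")))
n <- sum(counts)
expected <- outer(rowSums(counts), colSums(counts)) / n

# Row and column totals laid out cell by cell
R <- matrix(rowSums(counts), nrow(counts), ncol(counts))
C <- matrix(colSums(counts), nrow(counts), ncol(counts), byrow = TRUE)

# For each cell, the other three cells of its collapsed 2 x 2 table
row_only <- R - counts           # same level of x, other levels of y
col_only <- C - counts           # same level of y, other levels of x
neither  <- n - R - C + counts   # other levels of both

odds_ratios <- counts * neither / (row_only * col_only)
phi <- (counts * neither - row_only * col_only) /
  sqrt(R * (n - R) * C * (n - C))

# Local PEM: deviation from independence as a percentage of the largest
# deviation the margins allow in the same direction
deviation <- counts - expected
max_dev <- pmin(R, C) - expected              # when observed > expected
min_dev <- expected - pmax(0, R + C - n)      # when observed < expected
local_pem <- 100 * ifelse(deviation >= 0, deviation / max_dev,
                          deviation / min_dev)
```

Under this reading, phi for a cell is also its adjusted standardized residual divided by the square root of the total count, which ties the local$phi and local$adj.residuals tables together.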