Plots a cross-tabulation with ggplot2: the observed (or expected) frequencies are drawn as rectangles with proportional areas, and the measures of local association between the categories are shown with a diverging colour gradient.

ggassoc_crosstab(data, mapping, size = "freq", max.size = 20,
                 measure = "phi", limits = NULL, sort = "none",
                 na.rm = FALSE, na.value = "NA",
                 colors = NULL, direction = 1, legend = "right")

Arguments

data

dataset to use for the plot

mapping

aesthetics being used. x and y are required; weight can also be specified.

size

character. If "freq" (default), areas are proportional to observed frequencies. If "expected", they are proportional to expected frequencies.

max.size

numeric value, specifying the maximum size of the squares. Default is 20.

measure

character. The measure of association used for filling the rectangles. Can be "phi" for phi coefficient (default), "or" for odds ratios, "std.residuals" for standardized residuals, "adj.residuals" for adjusted standardized residuals or "pem" for local percentages of maximum deviation from independence.

limits

a numeric vector of length two providing limits of the scale. If NULL (default), the limits are automatically adjusted to the data.

sort

character. If "both", rows and columns are sorted according to the first factor of a correspondence analysis of the contingency table. If "x", only rows are sorted. If "y", only columns are sorted. If "none" (default), no sorting is done.

na.rm

logical, indicating whether NA values should be silently removed before the computation proceeds. If FALSE (default), an additional level is added to the variables (see na.value argument).

na.value

character. Name of the level for NA category. Default is "NA". Only used if na.rm = FALSE.

colors

vector of colors that will be interpolated to produce a color gradient. If NULL (default), the "Temps" palette from the rcartocolor package is used.

direction

Sets the order of colours in the scale. If 1, the default, colours are as output by RColorBrewer::brewer.pal(). If -1, the order of colours is reversed.

legend

the position of the legend ("none", "left", "right", "bottom", "top"). If "none", no legend is displayed.
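
The sort argument described above orders categories along the first factor of a correspondence analysis. The base R sketch below illustrates that idea on hypothetical toy counts, using a singular value decomposition of the standardized residuals. It is only an assumption about how such an ordering can be obtained, not the function's own code, and the sign of the first axis (hence the direction of the ordering) is arbitrary.

```r
# Hypothetical toy counts, for illustration only
tab <- matrix(c(30, 10, 20,
                10, 25,  5,
                 5, 20, 25),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("a", "b", "c"), c("u", "v", "w")))

P  <- tab / sum(tab)   # correspondence matrix (relative frequencies)
r  <- rowSums(P)       # row masses
cc <- colSums(P)       # column masses

# Standardized residuals from independence, then their SVD
S   <- (P - outer(r, cc)) / sqrt(outer(r, cc))
dec <- svd(S)

# Principal coordinates of rows and columns on the first axis
row_coord <- dec$u[, 1] * dec$d[1] / sqrt(r)
col_coord <- dec$v[, 1] * dec$d[1] / sqrt(cc)

# Reorder rows and columns along that first axis
sorted <- tab[order(row_coord), order(col_coord)]
sorted
```

Sorting only the rows or only the columns corresponds to applying just one of the two order() calls.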

Details

The measure of local association quantifies how much each combination of categories of x and y is over- or under-represented compared with what would be expected if the two variables were independent.
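
To make the idea of local association concrete, here is a minimal base R sketch that computes three of the measures by hand on a hypothetical toy table. It assumes the usual textbook definitions (Pearson residuals, adjusted standardized residuals, and phi as the correlation between the indicator of a row category and the indicator of a column category); the function's internal computation may differ in its details, and odds ratios and PEM are not covered here.

```r
# Hypothetical toy counts, for illustration only (not from the Movies data)
tab <- matrix(c(30, 10, 20,
                10, 25,  5),
              nrow = 2, byrow = TRUE,
              dimnames = list(x = c("a", "b"), y = c("u", "v", "w")))

n  <- sum(tab)
rs <- rowSums(tab)
cs <- colSums(tab)

# Expected counts under independence
expected <- outer(rs, cs) / n

# Standardized (Pearson) residuals
std_res <- (tab - expected) / sqrt(expected)

# Adjusted standardized residuals
adj_res <- (tab - expected) /
  sqrt(expected * outer(1 - rs / n, 1 - cs / n))

# Local phi coefficients: correlation between "is in row i" and "is in column j"
phi <- (n * tab - outer(rs, cs)) /
  sqrt(outer(rs * (n - rs), cs * (n - cs)))

round(phi, 2)
```

Positive values flag over-represented cells and negative values flag under-represented ones. Under these definitions, phi equals the adjusted residual divided by sqrt(n).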

The areas of the rectangles are proportional to observed or expected frequencies. Their color shading varies according to the measure of association.

This function can be used as a high-level plot with the ggduo and ggpairs functions of the GGally package.
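
A possible way to plug the function into GGally is sketched below. It assumes that the package providing ggassoc_crosstab and the Movies data is already attached and that GGally is installed; the choice of the lower triangle and of the two columns is only illustrative.

```r
# Sketch only: runs when ggassoc_crosstab() is available and GGally is installed
if (exists("ggassoc_crosstab") && requireNamespace("GGally", quietly = TRUE)) {
  data(Movies)
  # Use the association plot for pairs of discrete variables
  # in the lower triangle of the plot matrix
  pm <- GGally::ggpairs(Movies,
                        columns = c("Genre", "Country"),
                        lower = list(discrete = ggassoc_crosstab))
  print(pm)
}
```

To change arguments such as measure or legend inside the matrix, the function can be wrapped with GGally::wrap() before being passed to ggpairs.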

Value

a ggplot object

Author

Nicolas Robette

Examples

data(Movies)
ggassoc_crosstab(data=Movies, mapping=ggplot2::aes(Genre, Country))
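
The arguments documented above can be combined. The variant below is a sketch using only documented options (expected frequencies for the areas, standardized residuals for the fill, sorting on both variables, fixed scale limits, legend at the bottom); the particular limits are arbitrary, and the code assumes the package providing the function and the Movies data is attached.

```r
# Sketch only: runs when ggassoc_crosstab() is available
if (exists("ggassoc_crosstab")) {
  data(Movies)
  p <- ggassoc_crosstab(data = Movies,
                        mapping = ggplot2::aes(Genre, Country),
                        size = "expected",          # areas follow expected frequencies
                        measure = "std.residuals",  # fill follows standardized residuals
                        sort = "both",              # reorder rows and columns
                        limits = c(-5, 5),          # arbitrary, symmetric scale limits
                        legend = "bottom")
  print(p)
}
```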