Computes bivariate statistics between a continuous variable and a set of variables, possibly according to a strata variable.

contab(x, y, strata = NULL, weights = NULL, robust = TRUE,
       digits = c(1,3), na.rm = TRUE, na.value = "NAs")

Arguments

x

data frame. The variables which are described in rows. They can be numerical or factors.

y

numeric vector. The continuous variable which is summarized in columns.

strata

optional categorical variable to stratify the table by column. Default is NULL, which means no strata.

weights

numeric vector of weights. If NULL (default), uniform weights (i.e. all equal to 1) are used.

robust

logical. Whether to use medians (and mads) instead of means (and standard deviations). Default is TRUE.

digits

vector of 2 integers. The first value sets the number of digits for medians, mads, means and standard deviations (categorical variables). The second one sets the number of digits for slopes (continuous variables). Default is c(1,3). If NULL, the results are not rounded.

na.rm

logical, indicating whether NA values should be silently removed before the computation proceeds. Default is TRUE. If FALSE, an additional level is added to the categorical variables with NA values (see na.value argument).

na.value

character. Name of the level for NA category. Default is "NAs". Only used if na.rm = FALSE.

Details

For categorical variables in x, the function computes:

- column 1: the median and the median absolute deviation (mad) of y for each level of the variable, or the mean and the standard deviation if robust = FALSE

- column 2: the global association between the variable and y, measured by the eta-squared
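
The two quantities above can be sketched in base R. This is only an illustration on invented toy data, not the package's internal code: it ignores weights and strata, uses the usual definition of eta-squared (between-group sum of squares divided by total sum of squares), and relies on base R's mad(), which scales by 1.4826 by default. Whether contab applies the same scaling is an assumption.

```r
# Toy data, invented for illustration only
y <- c(10, 12, 11, 20, 22, 21, 30, 35, 31)
g <- factor(rep(c("a", "b", "c"), each = 3))

# Column 1: median and MAD of y within each level of g
med <- tapply(y, g, median)
dev <- tapply(y, g, mad)   # base R default: scaled by 1.4826

# Column 2: eta-squared = between-group SS / total SS (unweighted)
grand_mean <- mean(y)
group_mean <- tapply(y, g, mean)
group_n    <- tapply(y, g, length)
ss_between <- sum(group_n * (group_mean - grand_mean)^2)
ss_total   <- sum((y - grand_mean)^2)
eta2       <- ss_between / ss_total
```

With a single grouping factor, eta2 equals the R-squared of lm(y ~ g), which gives a quick way to check the computation.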

For continuous variables in x, it computes:

- column 1: the slope of the linear regression of y on the variable

- column 2: the global association between the variable and y, measured by Pearson and Spearman correlations
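
For a continuous variable, the same statistics can be approximated with base R. Again, this is a minimal sketch on invented toy data that ignores weights and strata; the names v and y are placeholders, and contab may compute these quantities differently internally.

```r
# Toy data, invented for illustration only
v <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 8.1, 9.8, 12.5)

# Column 1: slope of the linear regression of y on v
slope <- unname(coef(lm(y ~ v))["v"])

# Column 2: Pearson and Spearman correlations between v and y
pearson  <- cor(v, y, method = "pearson")
spearman <- cor(v, y, method = "spearman")
```

The slope is expressed in units of y per unit of the variable, which is why digits provides a separate rounding value for it.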

Value

An object of class gt_tbl.

Author

Nicolas Robette

Examples

data(Movies)
contab(x = Movies[, c("Genre", "ArtHouse", "Budget")],
       y = Movies$BoxOffice)