specific MCA

Performs a specific Multiple Correspondence Analysis, i.e. a variant of MCA that allows to treat undesirable categories as passive categories.

speMCA(data, excl = NULL, ncp = 5, row.w = NULL)

Arguments

data: data frame with n rows (individuals) and p columns (categorical variables)
excl: numeric vector indicating the indexes of the "junk" categories (default is NULL). See getindexcat or use ijunk interactive function to identify these indexes. It may also be a character vector of junk categories, specified in the form "namevariable.namecategory" (for instance "gender.male").
ncp: number of dimensions kept in the results (default is 5)
row.w: an optional numeric vector of row weights. If NULL (default), a vector of 1 for uniform row weights)

Details

Undesirable (i.e. "junk") categories may be of several kinds: infrequent categories (say, <5 percents), heterogeneous categories (e.g. "others") or uninterpretable categories (e.g. "not available"). In these cases, specific MCA may be useful to ignore these categories for the determination of distances between individuals (see references).

If there are NAs in data, these NAs will be automatically considered as junk categories. If one desires more flexibility, data should be recoded to add explicit factor levels for NAs and then excl option may be used to select the junk categories.

Value

Returns an object of class speMCA, i.e. a list including:

eig: a list of vectors containing all the eigenvalues, the percentage of variance, the cumulative percentage of variance, the modified rates and the cumulative modified rates
call: a list with informations about input data
ind: a list of matrices containing the results for the individuals (coordinates, contributions, squared cosines and total distances)
var: a list of matrices containing all the results for the categories and variables (weights, coordinates, squared cosines, categories contributions to axes and cloud, test values (v.test), squared correlation ratio (eta2), variable contributions to axes and cloud, total distances

References

Le Roux B. and Rouanet H., Multiple Correspondence Analysis, SAGE, Series: Quantitative Applications in the Social Sciences, Volume 163, CA:Thousand Oaks (2010).

Le Roux B. and Rouanet H., Geometric Data Analysis: From Correspondence Analysis to Stuctured Data Analysis, Kluwer Academic Publishers, Dordrecht (June 2004).

Author

Nicolas Robette

Examples

# specific MCA of Music example data set
data(Music)
junk <- c("FrenchPop.NA", "Rap.NA", "Rock.NA", "Jazz.NA", "Classical.NA")
mca <- speMCA(Music[,1:5], excl = junk)
# This is equivalent to :
mca <- speMCA(Music[,1:5], excl = c(3,6,9,12,15))