Performs a 'specific' Multiple Correspondence Analysis, i.e. a variant of MCA that allows undesirable categories to be treated as passive categories.

speMCA(data, excl = NULL, ncp = 5, row.w = rep(1, times = nrow(data)))

Arguments

data

data frame with n rows (individuals) and p columns (categorical variables)

excl

numeric vector indicating the indexes of the "junk" categories (default is NULL). See getindexcat to identify these indexes. It may also be a character vector of junk categories, specified in the form "namevariable.namecategory" (for instance "gender.male").

ncp

number of dimensions kept in the results (default is 5)

row.w

an optional numeric vector of row weights (by default, a vector of 1s, i.e. uniform row weights)

Details

Undesirable categories may be of several kinds: infrequent categories (say, less than 5 percent of individuals), heterogeneous categories (e.g. 'others') or uninterpretable categories (e.g. 'not available'). In these cases, 'specific' MCA may be used to ignore these categories when determining the distances between individuals (see Le Roux and Rouanet, 2004 and 2010).
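As an illustration of the first kind, the following base R sketch lists the categories that fall below a 5 percent threshold and names them in the "namevariable.namecategory" form accepted by excl. The data frame df, its variables genre and freq, and the 0.05 cut-off are hypothetical and only serve the example; whether an infrequent category should actually be excluded remains a judgment call.

```r
# Toy data frame of categorical variables (hypothetical data, for illustration only)
df <- data.frame(
  genre = factor(c(rep("rock", 40), rep("jazz", 57), rep("others", 3))),
  freq  = factor(c(rep("often", 60), rep("rarely", 38), rep("unknown", 2)))
)

# Relative frequency of every category, named "variable.category"
freqs <- unlist(lapply(names(df), function(v) {
  p <- prop.table(table(df[[v]]))
  setNames(as.numeric(p), paste(v, names(p), sep = "."))
}))

# Categories below the (assumed) 5 percent threshold are candidates for 'excl'
junk <- names(freqs)[freqs < 0.05]
print(junk)
```

The resulting character vector could then be passed as the excl argument, e.g. speMCA(df, excl = junk).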

Value

Returns an object of class 'speMCA', i.e. a list including:

eig

a list of vectors containing all the eigenvalues, the percentage of variance, the cumulative percentage of variance, the modified rates and the cumulative modified rates

call

a list with information about the input data

ind

a list of matrices containing the results for the individuals (coordinates, contributions)

var

a list of matrices containing all the results for the categories and variables (weights, coordinates, squared cosines, category contributions to axes and cloud, test values (v.test), squared correlation ratios (eta2), variable contributions to axes and cloud)
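The modified rates in eig are not defined on this page. The sketch below assumes they follow the usual Benzécri correction, which keeps only the eigenvalues above 1/Q (Q being the number of active variables) and rescales them. This is an assumption about the computation, not a statement of the package's code. The eigenvalues are the first five rounded values printed in the Examples section; eig_values, threshold and pseudo are illustrative names.

```r
# First five eigenvalues as printed (rounded) in the Examples section
eig_values <- c(0.279, 0.218, 0.203, 0.171, 0.129)

Q <- 5               # number of active variables in the example
threshold <- 1 / Q   # assumed cut-off (Benzécri correction)

# Keep the eigenvalues above the threshold and rescale them
kept   <- eig_values[eig_values > threshold]
pseudo <- (Q / (Q - 1) * (kept - threshold))^2

# Modified rates (in percent) and their cumulative sum
mrate     <- 100 * pseudo / sum(pseudo)
cum.mrate <- cumsum(mrate)
print(round(mrate, 2))
```

Because the input eigenvalues are rounded, the figures obtained this way can differ slightly from the mrate values reported by speMCA.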

References

Le Roux B. and Rouanet H., Multiple Correspondence Analysis, SAGE, Series: Quantitative Applications in the Social Sciences, Volume 163, Thousand Oaks, CA (2010).

Le Roux B. and Rouanet H., Geometric Data Analysis: From Correspondence Analysis to Structured Data Analysis, Kluwer Academic Publishers, Dordrecht (June 2004).

Author

Nicolas Robette

Examples

## Performs a specific MCA on the 'Music' example data set,
## ignoring all the 'NA' (i.e. 'not available') categories.
data(Music)
getindexcat(Music[,1:5])
#>  [1] "FrenchPop.No"  "FrenchPop.Yes" "FrenchPop.NA"  "Rap.No"       
#>  [5] "Rap.Yes"       "Rap.NA"        "Rock.No"       "Rock.Yes"     
#>  [9] "Rock.NA"       "Jazz.No"       "Jazz.Yes"      "Jazz.NA"      
#> [13] "Classical.No"  "Classical.Yes" "Classical.NA" 
mca <- speMCA(Music[, 1:5], excl = c(3, 6, 9, 12, 15))
str(mca)
#> List of 5
#>  $ eig :List of 5
#>   ..$ eigen    : num [1:10] 0.279 0.218 0.203 0.171 0.129 ...
#>   ..$ rate     : num [1:10] 27.5 21.5 20.1 16.9 12.7 ...
#>   ..$ cum.rate : num [1:10] 27.5 49 69.1 86 98.7 ...
#>   ..$ mrate    : num [1:3] 94.86 4.98 0.15
#>   ..$ cum.mrate: num [1:3] 94.9 99.8 100
#>  $ call:List of 8
#>   ..$ X        :'data.frame':	500 obs. of  5 variables:
#>   .. ..$ FrenchPop: Factor w/ 3 levels "No","Yes","NA": 3 1 1 1 2 2 1 2 1 1 ...
#>   .. ..$ Rap      : Factor w/ 3 levels "No","Yes","NA": 1 2 1 1 1 1 1 1 1 2 ...
#>   .. ..$ Rock     : Factor w/ 3 levels "No","Yes","NA": 2 1 2 2 1 2 1 2 1 1 ...
#>   .. ..$ Jazz     : Factor w/ 3 levels "No","Yes","NA": 1 3 1 1 1 1 1 1 1 1 ...
#>   .. ..$ Classical: Factor w/ 3 levels "No","Yes","NA": 1 1 1 1 2 1 1 1 1 3 ...
#>   ..$ marge.col: Named num [1:10] 0.0776 0.1204 0.1668 0.0308 0.144 ...
#>   .. ..- attr(*, "names")= chr [1:10] "FrenchPop.No" "FrenchPop.Yes" "Rap.No" "Rap.Yes" ...
#>   ..$ marge.row: Named num [1:500] 4e-04 4e-04 4e-04 4e-04 4e-04 4e-04 4e-04 4e-04 4e-04 4e-04 ...
#>   .. ..- attr(*, "names")= chr [1:500] "1" "2" "3" "4" ...
#>   ..$ ncp      : num 5
#>   ..$ quali    : int [1:5] 1 2 3 4 5
#>   ..$ excl     : num [1:5] 3 6 9 12 15
#>   ..$ excl.char: chr [1:5] "FrenchPop.NA" "Rap.NA" "Rock.NA" "Jazz.NA" ...
#>   ..$ row.w    : num [1:500] 1 1 1 1 1 1 1 1 1 1 ...
#>  $ ind :List of 2
#>   ..$ coord  : num [1:500, 1:5] 0.0308 0.5645 0.0824 0.0824 -0.2774 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : chr [1:500] "2124" "4485" "4731" "1411" ...
#>   .. .. ..$ : chr [1:5] "dim.1" "dim.2" "dim.3" "dim.4" ...
#>   ..$ contrib: num [1:500, 1:5] 0.000682 0.228793 0.00488 0.00488 0.05525 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : chr [1:500] "2124" "4485" "4731" "1411" ...
#>   .. .. ..$ : chr [1:5] "dim.1" "dim.2" "dim.3" "dim.4" ...
#>  $ var :List of 9
#>   ..$ weight    : Named num [1:10] 194 301 417 77 360 135 395 95 351 142
#>   .. ..- attr(*, "names")= chr [1:10] "FrenchPop.No" "FrenchPop.Yes" "Rap.No" "Rap.Yes" ...
#>   ..$ coord     : num [1:10, 1:5] 0.1362 -0.0839 -0.1178 0.6421 0.2479 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : chr [1:10] "FrenchPop.No" "FrenchPop.Yes" "Rap.No" "Rap.Yes" ...
#>   .. .. ..$ : chr [1:5] "dim.1" "dim.2" "dim.3" "dim.4" ...
#>   ..$ contrib   : num [1:10, 1:5] 0.517 0.304 0.831 4.559 3.177 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : chr [1:10] "FrenchPop.No" "FrenchPop.Yes" "Rap.No" "Rap.Yes" ...
#>   .. .. ..$ : chr [1:5] "dim.1" "dim.2" "dim.3" "dim.4" ...
#>   ..$ ctr.cloud :'data.frame':	10 obs. of  1 variable:
#>   .. ..$ ctr.cloud: num [1:10] 12.24 7.96 3.32 16.92 5.6 ...
#>   ..$ cos2      : num [1:10, 1:5] 0.0118 0.0106 0.0697 0.0751 0.158 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : chr [1:10] "FrenchPop.No" "FrenchPop.Yes" "Rap.No" "Rap.Yes" ...
#>   .. .. ..$ : chr [1:5] "dim.1" "dim.2" "dim.3" "dim.4" ...
#>   ..$ v.test    : num [1:10, 1:5] 2.42 -2.3 -5.9 6.12 8.88 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : chr [1:10] "FrenchPop.No" "FrenchPop.Yes" "Rap.No" "Rap.Yes" ...
#>   .. .. ..$ : chr [1:5] "dim.1" "dim.2" "dim.3" "dim.4" ...
#>   ..$ eta2      : num [1:5, 1:5] 0.012 0.0751 0.1595 0.6108 0.5392 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : chr [1:5] "FrenchPop" "Rap" "Rock" "Jazz" ...
#>   .. .. ..$ : chr [1:5] "dim.1" "dim.2" "dim.3" "dim.4" ...
#>   ..$ v.contrib :'data.frame':	5 obs. of  5 variables:
#>   .. ..$ dim.1: num [1:5] 0.821 5.391 11.407 43.858 38.524
#>   .. ..$ dim.2: num [1:5] 0.111 48.293 41.755 2.799 7.041
#>   .. ..$ dim.3: num [1:5] 88.83 2.27 4.33 3.18 1.38
#>   .. ..$ dim.4: num [1:5] 9.01 43.07 31.99 1.47 14.47
#>   .. ..$ dim.5: num [1:5] 1.214 0.971 10.515 48.708 38.591
#>   ..$ vctr.cloud:'data.frame':	5 obs. of  1 variable:
#>   .. ..$ vctr.cloud: num [1:5] 20.2 20.2 20.2 20.4 20.3
#>  $ svd :List of 3
#>   ..$ vs: num [1:10] 0.528 0.467 0.451 0.414 0.359 ...
#>   ..$ U : num [1:500, 1:10] 0.00261 0.04783 0.00699 0.00699 -0.02351 ...
#>   ..$ V : num [1:10, 1:10] 0.0719 -0.0551 -0.0912 0.2135 0.1782 ...
#>  - attr(*, "class")= chr [1:2] "speMCA" "list"
# This is equivalent to:
mca <- speMCA(Music[, 1:5],
              excl = c("FrenchPop.NA", "Rap.NA", "Rock.NA", "Jazz.NA", "Classical.NA"))
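Rather than typing the indexes by hand, they can be derived from the category names. The following sketch uses only base R and hard-codes the names printed by getindexcat(Music[,1:5]) above, so that it runs without the package; cats, excl_idx and excl_chr are illustrative names.

```r
# Category names as printed by getindexcat(Music[,1:5]) in the example above
cats <- c("FrenchPop.No", "FrenchPop.Yes", "FrenchPop.NA",
          "Rap.No", "Rap.Yes", "Rap.NA",
          "Rock.No", "Rock.Yes", "Rock.NA",
          "Jazz.No", "Jazz.Yes", "Jazz.NA",
          "Classical.No", "Classical.Yes", "Classical.NA")

# Positions of the categories whose name ends in ".NA"
excl_idx <- grep("\\.NA$", cats)
print(excl_idx)
#> [1]  3  6  9 12 15

# The same categories by name, usable as a character 'excl'
excl_chr <- cats[excl_idx]
```

Either excl_idx or excl_chr could then be supplied as the excl argument of speMCA.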