mvdalab

Multivariate Data Analysis Laboratory

An open-source implementation of latent variable methods and multivariate modeling tools. The focus is on exploratory analysis using dimensionality reduction methods, including low-dimensional embeddings, classical multivariate statistical tools, and tools for enhanced interpretation of machine learning methods (i.e., intelligible models that provide important information to end users). Among other things, it contains PCA, biplots, NIPALS PCA, PLS, several VIP methods, missing value handling, Procrustes analysis, multivariate EWMA, and data simulation functions. Target applications include manufacturing process modeling, spectroscopic analysis, and data mining.

CRAN Link:  https://cran.rstudio.com/web/packages/mvdalab/index.html

mvdalab focuses on methods that fall under a multivariate analysis umbrella with multiple layers:

  • 1st layer (core):  includes standard multivariate methods (such as PLS, PCA, etc.)
  • 2nd layer: widely accepted methods that can be divided into the following areas:  Kernel methods, Tree-based methods, Clustering, Batch, Multiways, etc. [as we extend to more complicated methods]
  • 3rd layer:  research layer that incorporates functions linked to recently published work. In the long run, some of these tools could be moved to layers 1 or 2.

Current mvdalab highlights:

  • PLS supported by bootstrapping (percentile and BCa intervals for model outputs), with jackknife-after-bootstrap and bootstrap distribution diagnostic graphs also implemented (1st layer)
      o Bootstrapping speed increased via parallel processing
  • Variable Importance Methods (2nd & 3rd layer)
  • Variable contribution plots for latent spaces (3rd layer)
  • PCA (Kernel PCA coming soon) (1st layer)
  • NIPALS PCA (1st layer)
  • Missing value handling functions (3rd layer)
  • Procrustes analysis (2nd layer)
  • Multivariate EWMA (2nd layer)
  • Multivariate comparison of mean vectors (1st layer)
  • Biplots (1st layer)
  • T2 plots (1st layer)
  • Simulating from Multivariate Normal, Poisson, Exponential, or Skewed Distributions (2nd layer)
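
To give a feel for one of the methods listed above, here is a minimal sketch of the NIPALS iteration for PCA. It is written in plain Python for illustration only and is not the mvdalab R code: the function name nipals_pca, its parameters, and the toy data are all hypothetical, and the sketch assumes complete data (no missing values).

```python
import math


def nipals_pca(X, n_components=2, tol=1e-12, max_iter=500):
    """Illustrative NIPALS PCA (hypothetical helper, not part of mvdalab).

    X is a list of rows with no missing values.
    Returns (scores, loadings), one list per extracted component.
    """
    n, p = len(X), len(X[0])
    # Column-center the data; E is the residual matrix that gets deflated.
    means = [sum(row[j] for row in X) / n for j in range(p)]
    E = [[row[j] - means[j] for j in range(p)] for row in X]
    scores, loadings = [], []
    for _ in range(n_components):
        # Start the score vector from the column with the largest sum of squares.
        j0 = max(range(p), key=lambda j: sum(r[j] ** 2 for r in E))
        t = [r[j0] for r in E]
        for _ in range(max_iter):
            tt = sum(v * v for v in t)
            # Loading vector: project the columns of E onto t, then normalize.
            pvec = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
            norm = math.sqrt(sum(v * v for v in pvec))
            pvec = [v / norm for v in pvec]
            # Updated scores: project the rows of E onto the loading vector.
            t_new = [sum(E[i][j] * pvec[j] for j in range(p)) for i in range(n)]
            diff = sum((a - b) ** 2 for a, b in zip(t_new, t))
            t = t_new
            if diff < tol * sum(v * v for v in t):
                break
        # Deflate: remove the extracted component before the next pass.
        E = [[E[i][j] - t[i] * pvec[j] for j in range(p)] for i in range(n)]
        scores.append(t)
        loadings.append(pvec)
    return scores, loadings


# Toy data made up for illustration: 6 observations, 3 correlated variables.
X = [
    [2.5, 2.4, 1.2],
    [0.5, 0.7, 0.3],
    [2.2, 2.9, 1.1],
    [1.9, 2.2, 0.9],
    [3.1, 3.0, 1.6],
    [2.3, 2.7, 1.0],
]
scores, loadings = nipals_pca(X, n_components=2)
print("loadings of component 1:", [round(v, 3) for v in loadings[0]])
```

Each component is extracted one at a time and then removed from the residual matrix, which is why NIPALS is convenient when only the first few components are needed.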

Invitation for developers to join this project:

mvdalab is a true open-source project in that it solicits input from the data analysis community and looks to incorporate user-contributed functions in order to expand its current toolbox.

Everyone is invited to contribute functions to mvdalab for a future release. The developer receives full credit for the function's development on the function's Help page. Those who make significant contributions over time will also have their names added as Authors of the R package.

Please contact any member of the core team with any questions.

Nelson Lee Afanador:  nelson.afanador@mvdalab.com
Thanh Tran:  thanh.tran@mvdalab.com
Lionel Blanchet:  lionel.blanchet@mvdalab.com