Data Fusion by Matrix Factorization (DFMF). [1][2]
DFMF is a data fusion approach based on penalized matrix tri-factorization that simultaneously factorizes
multiple data matrices to reveal hidden associations.
This method can deal with both block- and single-wise missing data.
Parameters:
n_components (int, default=10) -- Number of components to keep.
max_iter (int, default=100) -- Maximum number of iterations to perform.
init_type (str or list of str, default='random_c') -- The algorithm used to initialize the latent matrix factors. Options are 'random', 'random_c' and 'random_vcol'. It can
also be a list, whose items are used for fit and transform, respectively.
n_run (int, default=1) -- Number of times to run the algorithm.
stopping (tuple (target_matrix, eps), default=None) -- Terminate iteration if the reconstruction error of the target matrix improves by less than eps.
stopping_system (float, default=None) -- Terminate iteration if the reconstruction error of the fused system improves by less than this value. compute_err is
set to True to compute the error of the fused system.
compute_err (bool, default=False) -- Compute the reconstruction error of every relation matrix if True.
callback (callable, default=None) -- An optional user-supplied function to call after each iteration. Called as callback(G, S, cur_iter), where
S and G are current latent estimates.
fill_value (float, default=0) -- Value to use to initially fill missing values.
random_state (int, default=None) -- Determines the randomness. Use an int to make the randomness deterministic.
n_jobs (int, default=None) -- Number of jobs to run in parallel. None means 1 unless in a joblib.parallel_backend context. -1 means
using all processors.
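The stopping, fill_value and callback parameters can be illustrated with a short NumPy sketch. It is not the estimator's implementation: the helper names (fill_missing, reconstruction_error, improved_less_than, record_progress) are hypothetical, and the sketch only assumes the usual tri-factorization form in which a relation matrix R_ij is approximated by G_i S_ij G_j^T.

```python
import numpy as np

def fill_missing(R, fill_value=0.0):
    """Replace NaN entries with fill_value, as an initial guess for missing values."""
    return np.where(np.isnan(R), fill_value, R)

def reconstruction_error(R, G_i, S_ij, G_j):
    """Squared Frobenius error of the tri-factorization R ~ G_i @ S_ij @ G_j.T."""
    return float(np.linalg.norm(R - G_i @ S_ij @ G_j.T) ** 2)

def improved_less_than(prev_err, cur_err, eps):
    """Stopping rule: True when the error improved by less than eps."""
    return (prev_err - cur_err) < eps

# Hypothetical callback with the documented signature callback(G, S, cur_iter).
# It only records the iteration number; G and S are left untouched.
history = []

def record_progress(G, S, cur_iter):
    history.append(cur_iter)
```

A function such as record_progress would be passed as callback, while a check like improved_less_than mirrors the role of eps in stopping and stopping_system.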
JNMF decomposes the matrices into low-dimensional factor matrices.
It can deal with both modality- and feature-wise missing data.
Parameters:
n_components (int, default=10) -- Number of components to keep.
init_W (array-like, default=None) -- The initial values of factor matrix W, which has n_samples rows and n_components columns.
init_V (array-like, default=None) -- A list containing the initial values of multiple factor matrices.
init_H (array-like, default=None) -- A list containing the initial values of multiple factor matrices.
l1_W (float, default=1e-10) -- Parameter for L1 regularization. This also works as a small positive constant to prevent division by zero,
so it should not be set to 0.
l1_V (float, default=1e-10) -- Parameter for L1 regularization. This also works as a small positive constant to prevent division by zero,
so it should not be set to 0.
l1_H (float, default=1e-10) -- Parameter for L1 regularization. This also works as a small positive constant to prevent division by zero,
so it should not be set to 0.
l2_W (float, default=1e-10) -- Parameter for L2 regularization.
l2_V (float, default=1e-10) -- Parameter for L2 regularization.
l2_H (float, default=1e-10) -- Parameter for L2 regularization.
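To show why an L1 parameter can double as a guard against division by zero, here is a minimal sketch of a simplified joint NMF with multiplicative updates. It is an assumption-laden illustration, not the JNMF estimator itself: it uses only a shared factor W and one H per matrix (no V factors, no L2 terms), and the function names joint_nmf and total_error are hypothetical.

```python
import numpy as np

def joint_nmf(views, n_components, l1=1e-10, n_iter=200, seed=0):
    """Simplified joint NMF: each non-negative X_k is approximated by W @ H_k.

    The L1 penalty enters every update as a constant added to the denominator,
    so a strictly positive l1 also keeps the division well defined.
    """
    rng = np.random.default_rng(seed)
    n_samples = views[0].shape[0]
    W = rng.random((n_samples, n_components))
    Hs = [rng.random((n_components, X.shape[1])) for X in views]
    for _ in range(n_iter):
        # Update the shared factor W using all matrices at once.
        numer = sum(X @ H.T for X, H in zip(views, Hs))
        denom = sum(W @ (H @ H.T) for H in Hs) + l1
        W *= numer / denom
        # Update each matrix-specific factor H_k.
        for k, X in enumerate(views):
            Hs[k] *= (W.T @ X) / (W.T @ W @ Hs[k] + l1)
    return W, Hs

def total_error(views, W, Hs):
    """Sum of squared Frobenius reconstruction errors over all matrices."""
    return float(sum(np.linalg.norm(X - W @ H) ** 2 for X, H in zip(views, Hs)))
```

With l1 set to exactly 0, a denominator entry that reaches zero would produce inf or NaN in the factors, which is the failure the small default value avoids.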
MOFA is a factor analysis model that provides a general framework for the unsupervised integration of incomplete
multi-modal datasets (originally, multi-omic datasets). Intuitively, MOFA can be viewed as a
versatile and statistically rigorous generalization of principal component analysis to multi-modal data. Given
several data matrices with measurements of multiple data types on the same or on overlapping sets of
samples, MOFA infers an interpretable low-dimensional representation in terms of a few latent factors.
It can deal with both modality- and feature-wise missing data.
Parameters:
n_components (int, default=10) -- Number of components to keep.
impute (bool, default=True) -- True if missing values should be imputed.
data_options (dict, default=None) -- Data processing options, such as scale_views and scale_groups.
data_matrix (dict, default=None) -- Keys such as likelihoods, view_names, etc.
model_options (dict, default=None) -- Model options, such as ard_factors or ard_weights.
train_options (dict, default=None) -- Keys such as iter, tolerance.
stochastic_options (dict, default=None) -- Stochastic variational inference options, such as learning rate or batch size.
covariates (dict, default=None) -- Slot to store sample covariates for training in MEFISTO. Keys are sample_cov and covariates_names.
smooth_options (dict, default=None) -- Options for smooth inference, such as scale_cov or model_groups.
random_state (int, default=None) -- Determines the randomness. Use an int to make the randomness deterministic.
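The description of MOFA as a generalization of principal component analysis can be made concrete with a small sketch. The code below is plain PCA on column-wise concatenated modalities, not MOFA: it assumes complete data, has no sparsity priors and no variational inference, and the function name shared_factors_pca is hypothetical. It only shows the shape of the result, namely one factor matrix shared by all samples plus one weight matrix per modality.

```python
import numpy as np

def shared_factors_pca(views, n_components):
    """PCA on concatenated, centered views.

    Returns Z (n_samples x n_components) and a list of weight matrices,
    one per view, each of shape (n_features_of_view x n_components).
    """
    centered = [X - X.mean(axis=0) for X in views]
    stacked = np.hstack(centered)
    U, s, Vt = np.linalg.svd(stacked, full_matrices=False)
    Z = U[:, :n_components] * s[:n_components]
    # Split the loadings back into one block per view.
    boundaries = np.cumsum([X.shape[1] for X in views])[:-1]
    weights = np.split(Vt[:n_components].T, boundaries, axis=0)
    return Z, weights
```

Each centered view m is then approximated by Z @ weights[m].T, which is the kind of low-dimensional representation in terms of a few latent factors that the description refers to.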