Utils
Convert dataset format
- class imml.utils.convert_dataset_format(Xs: list, keys: list = None)
Convert the format of a multi-modal dataset. If it is a dict, it is converted to a list, and if it is a list, it is converted to a dict.
- Parameters:
- Returns:
transformed_Xs --
Xs length: n_mods
Xs[key] shape: (n_samples, n_features_i)
- Return type:
dict of array-like objects.
Examples
>>> from imml.utils.convert_dataset_format import convert_dataset_format
>>> import numpy as np
>>> import pandas as pd
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for i in range(3)]
>>> convert_dataset_format(Xs=Xs)
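The list-to-dict direction can be illustrated with a minimal sketch in plain Python. This is not the imml implementation: the helper name `list_to_dict_format` is hypothetical, and falling back to integer positions when `keys` is omitted is an assumption, not documented behaviour.

```python
def list_to_dict_format(Xs, keys=None):
    """Hypothetical helper: map a list of modalities to a dict keyed by `keys`."""
    if keys is None:
        # Assumption: use positional integer keys when none are given.
        keys = list(range(len(Xs)))
    if len(keys) != len(Xs):
        raise ValueError("keys and Xs must have the same length")
    return dict(zip(keys, Xs))


# Two toy modalities with 3 samples each and different feature counts.
Xs = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],  # modality 0: shape (3, 2)
    [[1.0], [2.0], [3.0]],                 # modality 1: shape (3, 1)
]
named = list_to_dict_format(Xs, keys=["rna", "protein"])
```

Each entry of `named` holds one modality, matching the documented `Xs[key]` shape of (n_samples, n_features_i).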
Check Xs
- class imml.utils.check_Xs(Xs, enforce_modalities=None, copy=False, ensure_all_finite='allow-nan', return_dimensions=False)
Checks Xs and ensures that it is a list of 2D matrices. Adapted from mvlearn [1] [2].
- Parameters:
Xs (list of array-like objects) --
Xs length: n_mods
Xs[i] shape: (n_samples, n_features_i)
A list of different modalities.
enforce_modalities (int, (default=not checked)) -- If provided, ensures this number of modalities in Xs. Otherwise not checked.
copy (boolean, (default=False)) -- If True, the returned Xs is a copy of the input Xs, and operations on the output will not affect the input. If False, the returned Xs is a view of the input Xs, and operations on the output will change the input.
ensure_all_finite (bool or 'allow-nan', default='allow-nan') --
Whether to raise an error on np.inf, np.nan, pd.NA in array. The possibilities are:
True: Force all values of array to be finite.
False: accepts np.inf, np.nan, pd.NA in array.
'allow-nan': accepts only np.nan and pd.NA values in array. Values cannot be infinite.
return_dimensions (boolean, (default=False)) -- If True, the function also returns the dimensions of the multi-modal dataset. The dimensions are n_mods, n_samples, n_features, where n_mods and n_samples are respectively the number of modalities and the number of samples, and n_features is a list of length n_mods containing the number of features of each modality.
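The three `ensure_all_finite` modes described above can be sketched for a single 2D matrix as follows. This is a plain-Python illustration under stated assumptions, not the imml or mvlearn code: `check_finite` is a hypothetical name, the matrix is a list of rows of floats, and `pd.NA` is not handled.

```python
import math


def check_finite(X, ensure_all_finite="allow-nan"):
    """Hypothetical check mirroring the three documented modes."""
    if ensure_all_finite is False:
        return  # inf and NaN are both accepted
    for row in X:
        for value in row:
            if math.isinf(value):
                # Rejected by both True and 'allow-nan'.
                raise ValueError("infinite value found")
            if ensure_all_finite is True and math.isnan(value):
                # Rejected only in the strict mode.
                raise ValueError("NaN found")


X = [[1.0, float("nan")], [2.0, 3.0]]
check_finite(X, ensure_all_finite="allow-nan")  # NaN is tolerated here
```

With `ensure_all_finite=True`, the same `X` would raise a `ValueError` in this sketch because of the NaN entry.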
References
- Returns:
Xs_converted (object) -- The converted and validated Xs (list of data arrays).
n_mods (int) -- The number of modalities in the dataset. Returned only if return_dimensions is True.
n_samples (int) -- The number of samples in the dataset. Returned only if return_dimensions is True.
n_features (list) -- List of length n_mods containing the number of features in each modality. Returned only if return_dimensions is True.
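To make the returned dimensions concrete, here is a minimal sketch of how they relate to a list of 2D matrices. The helper `dataset_dimensions` is hypothetical and written in plain Python; the requirement that every modality has the same number of samples is an assumption of this sketch.

```python
def dataset_dimensions(Xs):
    """Hypothetical helper returning (n_mods, n_samples, n_features)."""
    n_mods = len(Xs)
    n_samples = len(Xs[0])
    for X in Xs:
        # Assumption: all modalities describe the same samples.
        if len(X) != n_samples:
            raise ValueError("all modalities must have the same number of samples")
    # One entry per modality: the number of columns of that matrix.
    n_features = [len(X[0]) for X in Xs]
    return n_mods, n_samples, n_features


# Three toy modalities, 20 samples each, with 10, 4 and 7 features.
Xs = [
    [[0.0] * 10 for _ in range(20)],
    [[0.0] * 4 for _ in range(20)],
    [[0.0] * 7 for _ in range(20)],
]
n_mods, n_samples, n_features = dataset_dimensions(Xs)
print(n_mods, n_samples, n_features)  # 3 20 [10, 4, 7]
```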