Explore
Get summary
- class imml.explore.get_summary(Xs: list, mod_names: list = None, one_row: bool = False, compute_pct: bool = True, return_df: bool = False)
Get a summary of an incomplete multi-modal dataset.
- Parameters:
Xs (list of array-like objects) --
Xs length: n_mods
Xs[i] shape: (n_samples, n_features_i)
A list of different modalities.
mod_names (list, default=None) -- Name of each modality. By default, it will be set to the modality index. Only applicable when one_row is False.
one_row (bool, default=False) -- If True, return a one-row summary of the dataset. If False, each row will correspond to a modality.
compute_pct (bool, default=True) -- If True, compute the percentage of each value.
return_df (bool, default=False) -- If True, return a pd.DataFrame; otherwise return a dict.
- Returns:
summary -- Summary of a multi-modal dataset.
- Return type:
dict or pd.DataFrame
Example
>>> import numpy as np
>>> import pandas as pd
>>> from imml.explore import get_summary
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for _ in range(3)]
>>> get_summary(Xs=Xs)
Get number of modalities
- class imml.explore.get_n_mods(Xs: list)
Get the number of modalities of a multi-modal dataset.
- Parameters:
Xs (list of array-like objects) --
Xs length: n_mods
Xs[i] shape: (n_samples, n_features_i)
A list of different modalities.
- Returns:
n_mods -- Number of modalities.
- Return type:
int
Example
>>> import numpy as np
>>> import pandas as pd
>>> from imml.explore import get_n_mods
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for _ in range(3)]
>>> get_n_mods(Xs=Xs)
Get number of samples by modality
- class imml.explore.get_n_samples_by_mod(Xs: list)
Get the number of samples in each modality.
- Parameters:
Xs (list of array-like objects) --
Xs length: n_mods
Xs[i] shape: (n_samples, n_features_i)
A list of different modalities.
- Returns:
n_samples_by_mod -- Number of samples in each modality.
- Return type:
pd.Series
Example
>>> import numpy as np
>>> import pandas as pd
>>> from imml.explore import get_n_samples_by_mod
>>> from imml.ampute import Amputer
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for _ in range(3)]
>>> Xs = Amputer(p=0.2, mechanism="mcar", random_state=42).fit_transform(Xs)
>>> get_n_samples_by_mod(Xs=Xs)
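The per-modality count can be pictured with plain pandas. The sketch below is not imml's implementation: it assumes a sample is absent from a modality when its entire row is NaN, and the toy data and the names mod_a, mod_b, and n_observed are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two modalities that share four sample names.
index = ["s0", "s1", "s2", "s3"]
mod_a = pd.DataFrame(np.arange(8, dtype=float).reshape(4, 2), index=index)
mod_b = pd.DataFrame(np.arange(12, dtype=float).reshape(4, 3), index=index)

# Assumption: an all-NaN row marks a sample that is missing from a modality.
mod_a.loc["s1"] = np.nan
mod_b.loc[["s1", "s3"]] = np.nan
Xs = [mod_a, mod_b]

# Count, per modality, the rows that hold at least one observed value.
n_observed = pd.Series(
    [int(X.notna().any(axis=1).sum()) for X in Xs],
    index=range(len(Xs)),
)
print(n_observed.tolist())  # [3, 2]
```

Indexing the result by modality position mirrors the default modality naming described for get_summary; whether get_n_samples_by_mod labels its pd.Series the same way is not stated on this page.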
Get complete samples
- class imml.explore.get_com_samples(Xs: list)
Get the names (index) of the complete samples in a multi-modal dataset.
- Parameters:
Xs (list of array-like objects) --
Xs length: n_mods
Xs[i] shape: (n_samples, n_features_i)
A list of different modalities.
- Returns:
samples -- Sample names with full data.
- Return type:
pd.Index
Example
>>> import numpy as np
>>> import pandas as pd
>>> from imml.explore import get_com_samples
>>> from imml.ampute import Amputer
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for _ in range(3)]
>>> Xs = Amputer(p=0.2, mechanism="mcar", random_state=42).fit_transform(Xs)
>>> get_com_samples(Xs=Xs)
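To make "complete sample" concrete, here is a minimal pandas-only sketch. It is an illustration under an assumption, not the library's code: a sample is treated as missing from a modality when its whole row is NaN, and it is complete when it is observed in every modality. The toy data and variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two modalities that share four sample names.
index = ["s0", "s1", "s2", "s3"]
mod_a = pd.DataFrame(np.ones((4, 2)), index=index)
mod_b = pd.DataFrame(np.ones((4, 3)), index=index)

# Assumption: an all-NaN row marks a sample that is missing from a modality.
mod_a.loc["s1"] = np.nan
mod_b.loc["s3"] = np.nan
Xs = [mod_a, mod_b]

# One boolean column per modality: True where the sample has observed values.
observed = pd.concat([X.notna().any(axis=1) for X in Xs], axis=1)

# A sample is complete when it is observed in every modality.
is_complete = observed.all(axis=1)
complete_samples = observed.index[is_complete.to_numpy()]
print(list(complete_samples))  # ['s0', 's2']
```

Under the same assumption, the incomplete samples are the complement: observed.index[~is_complete.to_numpy()].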
Get incomplete samples
- class imml.explore.get_incom_samples(Xs: list)
Get the names (index) of the incomplete samples in a multi-modal dataset.
- Parameters:
Xs (list of array-like objects) --
Xs length: n_mods
Xs[i] shape: (n_samples, n_features_i)
A list of different modalities.
- Returns:
samples -- Sample names with incomplete data.
- Return type:
pd.Index
Example
>>> import numpy as np
>>> import pandas as pd
>>> from imml.explore import get_incom_samples
>>> from imml.ampute import Amputer
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for _ in range(3)]
>>> Xs = Amputer(p=0.2, mechanism="mcar", random_state=42).fit_transform(Xs)
>>> get_incom_samples(Xs=Xs)
Get samples
- class imml.explore.get_samples(Xs: list)
Get the names (index) of the samples in a multi-modal dataset.
- Parameters:
Xs (list of array-like objects) --
Xs length: n_mods
Xs[i] shape: (n_samples_i, n_features_i)
A list of different modalities.
- Returns:
samples -- Sample names.
- Return type:
pd.Index (n_samples,)
Example
>>> import numpy as np
>>> import pandas as pd
>>> from imml.explore import get_samples
>>> from imml.ampute import Amputer
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for _ in range(3)]
>>> Xs = Amputer(p=0.2, mechanism="mcar", random_state=42).fit_transform(Xs)
>>> get_samples(Xs=Xs)
Get samples by modality
- class imml.explore.get_samples_by_mod(Xs: list, return_as_list: bool = True)
Get the samples for each modality in a multi-modal dataset.
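- Parameters:
Xs (list of array-like objects) --
Xs length: n_mods
Xs[i] shape: (n_samples, n_features_i)
A list of different modalities.
return_as_list (bool, default=True) -- If True, return a list; if False, return a dict.
- Returns:
samples -- If a list, each element holds the sample names of one modality. If a dict, the keys are the modalities and the values are the sample names.
- Return type:
list or dict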
Example
>>> import numpy as np
>>> import pandas as pd
>>> from imml.explore import get_samples_by_mod
>>> from imml.ampute import Amputer
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for _ in range(3)]
>>> Xs = Amputer(p=0.2, mechanism="mcar", random_state=42).fit_transform(Xs)
>>> get_samples_by_mod(Xs=Xs)
Get missing samples by modality
- class imml.explore.get_missing_samples_by_mod(Xs: list, return_as_list: bool = True)
Get the samples not present in each modality in a multi-modal dataset.
- Parameters:
Xs (list of array-like objects) --
Xs length: n_mods
Xs[i] shape: (n_samples, n_features_i)
A list of different modalities.
return_as_list (bool, default=True) -- If True, return a list in which each element holds the sample names for one modality. If False, return a dict whose keys are the modalities and whose values are the sample names.
- Returns:
samples -- Dictionary or list of missing samples for each modality.
- Return type:
list or dict
Example
>>> import numpy as np
>>> import pandas as pd
>>> from imml.explore import get_missing_samples_by_mod
>>> from imml.ampute import Amputer
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for _ in range(3)]
>>> Xs = Amputer(p=0.2, mechanism="mcar", random_state=42).fit_transform(Xs)
>>> get_missing_samples_by_mod(Xs=Xs)
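The list and dict shapes described for return_as_list can be illustrated without imml. This is a sketch under assumptions rather than the library's implementation: a sample counts as missing from a modality when its whole row is NaN, and the dict is keyed by modality position. The toy data and the names missing_as_list and missing_as_dict are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two modalities that share four sample names.
index = ["s0", "s1", "s2", "s3"]
mod_a = pd.DataFrame(np.ones((4, 2)), index=index)
mod_b = pd.DataFrame(np.ones((4, 3)), index=index)

# Assumption: an all-NaN row marks a sample that is missing from a modality.
mod_a.loc["s1"] = np.nan
mod_b.loc[["s1", "s3"]] = np.nan
Xs = [mod_a, mod_b]

# List form: one index of missing sample names per modality, in order.
missing_as_list = [X.index[X.isna().all(axis=1).to_numpy()] for X in Xs]

# Dict form: the same indexes, keyed by modality position (an assumption).
missing_as_dict = dict(enumerate(missing_as_list))

print([list(m) for m in missing_as_list])  # [['s1'], ['s1', 's3']]
print(sorted(missing_as_dict))             # [0, 1]
```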
Get number of complete samples
- class imml.explore.get_n_com_samples(Xs: list)
Get the number of complete samples in a multi-modal dataset.
- Parameters:
Xs (list of array-like objects) --
Xs length: n_mods
Xs[i] shape: (n_samples, n_features_i)
A list of different modalities.
- Returns:
n_samples -- Number of complete samples.
- Return type:
int
Example
>>> import numpy as np
>>> import pandas as pd
>>> from imml.explore import get_n_com_samples
>>> from imml.ampute import Amputer
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for _ in range(3)]
>>> Xs = Amputer(p=0.2, mechanism="mcar", random_state=42).fit_transform(Xs)
>>> get_n_com_samples(Xs=Xs)
Get number of incomplete samples
- class imml.explore.get_n_incom_samples(Xs: list)
Get the number of incomplete samples in a multi-modal dataset.
- Parameters:
Xs (list of array-like objects) --
Xs length: n_mods
Xs[i] shape: (n_samples, n_features_i)
A list of different modalities.
- Returns:
n_samples -- Number of incomplete samples.
- Return type:
int
Example
>>> import numpy as np
>>> import pandas as pd
>>> from imml.explore import get_n_incom_samples
>>> from imml.ampute import Amputer
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for _ in range(3)]
>>> Xs = Amputer(p=0.2, mechanism="mcar", random_state=42).fit_transform(Xs)
>>> get_n_incom_samples(Xs=Xs)
Get percentage of complete samples
- class imml.explore.get_pct_com_samples(Xs: list)
Get the percentage of complete samples in a multi-modal dataset.
- Parameters:
Xs (list of array-like objects) --
Xs length: n_mods
Xs[i] shape: (n_samples, n_features_i)
A list of different modalities.
- Returns:
percentage_samples -- Percentage of complete samples.
- Return type:
float
Example
>>> import numpy as np
>>> import pandas as pd
>>> from imml.explore import get_pct_com_samples
>>> from imml.ampute import Amputer
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for _ in range(3)]
>>> Xs = Amputer(p=0.2, mechanism="mcar", random_state=42).fit_transform(Xs)
>>> get_pct_com_samples(Xs=Xs)
Get percentage of incomplete samples
- class imml.explore.get_pct_incom_samples(Xs: list)
Get the percentage of incomplete samples in a multi-modal dataset.
- Parameters:
Xs (list of array-like objects) --
Xs length: n_mods
Xs[i] shape: (n_samples, n_features_i)
A list of different modalities.
- Returns:
percentage_samples -- Percentage of incomplete samples.
- Return type:
float
Example
>>> import numpy as np
>>> import pandas as pd
>>> from imml.explore import get_pct_incom_samples
>>> from imml.ampute import Amputer
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for _ in range(3)]
>>> Xs = Amputer(p=0.2, mechanism="mcar", random_state=42).fit_transform(Xs)
>>> get_pct_incom_samples(Xs=Xs)
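The arithmetic that links the count and percentage functions can be sketched with plain pandas. Two assumptions are made here and neither is confirmed by this page: a sample is missing from a modality when its whole row is NaN, and percentages are expressed on a 0 to 100 scale. The toy data and variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two modalities that share five sample names.
index = ["s0", "s1", "s2", "s3", "s4"]
mod_a = pd.DataFrame(np.ones((5, 2)), index=index)
mod_b = pd.DataFrame(np.ones((5, 3)), index=index)

# Assumption: an all-NaN row marks a sample that is missing from a modality.
mod_a.loc["s1"] = np.nan
mod_b.loc["s3"] = np.nan
Xs = [mod_a, mod_b]

# True where a sample has at least one observed value in a given modality.
observed = pd.concat([X.notna().any(axis=1) for X in Xs], axis=1)

n_samples = len(observed)
n_complete = int(observed.all(axis=1).sum())  # observed in every modality
n_incomplete = n_samples - n_complete

# Assumption: percentages are reported on a 0-100 scale.
pct_complete = 100 * n_complete / n_samples
pct_incomplete = 100 * n_incomplete / n_samples
print(n_complete, n_incomplete, pct_complete, pct_incomplete)  # 3 2 60.0 40.0
```

By construction the two counts sum to the number of samples and the two percentages sum to 100.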