Ampute

Amputer

class imml.ampute.Amputer(p: float = 0.1, mechanism: str = 'mem', weights: list = None, random_state: int = None)[source]

Bases: BaseEstimator, TransformerMixin

Simulate an incomplete multi-modal dataset with block-wise missing data from a fully observed multi-modal dataset.

Parameters:
  • p (float, default=0.1) -- Proportion of incomplete samples, given as a fraction (for example, 0.1 means 10% of the samples).

  • mechanism (str, default="mem") -- One of ["mem", "mcar", "mnar", "pm"], corresponding to mutually exclusive missing, missing completely at random, missing not at random, and partial missing, respectively.

  • weights (list, default=None) -- The probabilities associated with each number of missing modalities. If not given, a uniform distribution is assumed. Only used if mechanism = "mnar" or mechanism = "mem".

  • random_state (int, default=None) -- If int, random_state is the seed used by the random number generator.
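
To make the roles of p and weights concrete, here is a minimal NumPy sketch of how a block-wise missingness pattern could be drawn. It is not the imml implementation: the helper name sketch_observed_indicator, the rounding of p * n_samples, and the rule that every sample keeps at least one modality are all assumptions made for illustration.

```python
import numpy as np

def sketch_observed_indicator(n_samples, n_mods, p=0.1, weights=None, seed=None):
    """Hypothetical helper (not part of imml): boolean mask of shape (n_samples, n_mods)."""
    rng = np.random.default_rng(seed)
    observed = np.ones((n_samples, n_mods), dtype=bool)

    # Assumption: a fraction p of the samples becomes incomplete.
    n_incomplete = int(round(p * n_samples))
    incomplete_rows = rng.choice(n_samples, size=n_incomplete, replace=False)

    # Assumption: an incomplete sample loses between 1 and n_mods - 1 modalities,
    # so at least one modality always stays observed (requires n_mods >= 2).
    counts = np.arange(1, n_mods)
    probs = None
    if weights is not None:
        probs = np.asarray(weights, dtype=float)
        probs = probs / probs.sum()  # normalise so the weights sum to 1

    for row in incomplete_rows:
        n_missing = rng.choice(counts, p=probs)  # uniform when probs is None
        missing_mods = rng.choice(n_mods, size=n_missing, replace=False)
        observed[row, missing_mods] = False
    return observed

mask = sketch_observed_indicator(n_samples=20, n_mods=3, p=0.2, seed=42)
print(mask.sum(axis=1))  # number of observed modalities per sample
```

In this sketch, weights has one entry per possible number of missing modalities (here two entries for three modalities); the actual length and meaning expected by Amputer should be checked against the library.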

Example

>>> import numpy as np
>>> import pandas as pd
>>> from imml.ampute import Amputer
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for i in range(3)]
>>> transformer = Amputer(p=0.2, random_state=42)
>>> transformer.fit_transform(Xs)

fit(Xs: list, y=None)[source]

Fit the transformer to the input data.

Parameters:
  • Xs (list of array-like objects) --

    • Xs length: n_mods

    • Xs[i] shape: (n_samples, n_features_i)

    A list of different modalities.

  • y (Ignored) -- Not used, present here for API consistency by convention.

Returns:

self -- Returns an instance of self.

Return type:

Amputer

transform(Xs: list)[source]

Ampute a fully observed multi-modal dataset.

Parameters:

Xs (list of array-like objects) --

  • Xs length: n_mods

  • Xs[i] shape: (n_samples, n_features_i)

A list of different modalities.

Returns:

transformed_Xs -- The amputed multi-modal dataset.

Return type:

list of array-like objects, each of shape (n_samples, n_features_i), length n_mods
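
When inspecting an amputed dataset, it can help to recover which modalities are observed for each sample. The sketch below assumes that a missing modality appears as a row filled entirely with NaN; the source does not state the fill value, so treat this as an assumption. The helper observed_indicator is hypothetical and not part of imml.

```python
import numpy as np

def observed_indicator(Xs):
    """Hypothetical helper (not part of imml).

    Returns a boolean array of shape (n_samples, n_mods) that is True where
    the sample has at least one non-NaN value in that modality.
    """
    columns = [~np.isnan(np.asarray(X, dtype=float)).all(axis=1) for X in Xs]
    return np.column_stack(columns)

# Toy amputed dataset: 4 samples, 2 modalities; all-NaN rows mark missing blocks (assumption).
X0 = np.array([[1.0, 2.0], [np.nan, np.nan], [3.0, 4.0], [5.0, 6.0]])
X1 = np.array([[0.1], [0.2], [np.nan], [0.4]])

indicator = observed_indicator([X0, X1])
incomplete_fraction = 1.0 - indicator.all(axis=1).mean()
print(indicator)
print(incomplete_fraction)  # 0.5: samples 1 and 2 each miss one modality
```

Comparing incomplete_fraction with the requested p is a quick sanity check on the amputation.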

set_fit_request(*, Xs: bool | None | str = '$UNCHANGED$') → Amputer

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

Xs (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) -- Metadata routing for Xs parameter in fit.

Returns:

self -- The updated object.

Return type:

object

set_transform_request(*, Xs: bool | None | str = '$UNCHANGED$') → Amputer

Configure whether metadata should be requested to be passed to the transform method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to transform.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

Xs (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) -- Metadata routing for Xs parameter in transform.

Returns:

self -- The updated object.

Return type:

object

Remove Modalities

class imml.ampute.RemoveMods(observed_mod_indicator)[source]

Bases: FunctionTransformer

A transformer that generates block-wise missingness patterns in complete multi-modal datasets. It applies FunctionTransformer (from scikit-learn) with remove_mods as the function.

Parameters:

observed_mod_indicator (array-like of shape (n_samples, n_mods)) -- Boolean array-like indicating observed modalities for each sample.

See also

Amputer, remove_mods

Example

>>> import numpy as np
>>> import pandas as pd
>>> from imml.ampute import RemoveMods
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for i in range(3)]
>>> observed_mod_indicator = np.random.default_rng(42).choice(2, size=(len(Xs[0]), len(Xs)))
>>> transformer = RemoveMods(observed_mod_indicator=observed_mod_indicator)
>>> transformer.fit_transform(Xs)

imml.ampute.remove_mods(Xs: list, observed_mod_indicator)[source]

A function that generates block-wise missingness patterns in complete multi-modal datasets.

Parameters:

Xs (list of array-like objects) --

  • Xs length: n_mods

  • Xs[i] shape: (n_samples, n_features_i)

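A list of different modalities.

observed_mod_indicator (array-like of shape (n_samples, n_mods)) -- Boolean array-like indicating observed modalities for each sample.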

Returns:

transformed_X -- The transformed multi-modal dataset.

Return type:

list of array-like objects, each of shape (n_samples, n_features_i)

See also

Amputer, RemoveMods

Example

>>> import numpy as np
>>> import pandas as pd
>>> from imml.ampute import remove_mods
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for i in range(3)]
>>> observed_mod_indicator = np.random.default_rng(42).choice(2, size=(len(Xs[0]), len(Xs)))
>>> remove_mods(Xs=Xs, observed_mod_indicator=observed_mod_indicator)
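
Conceptually, removing modalities amounts to masking whole rows of each modality according to the indicator. The following NumPy sketch illustrates that idea under the assumption that missing blocks are filled with NaN. The function mask_modalities is a hypothetical stand-in and not the imml implementation.

```python
import numpy as np

def mask_modalities(Xs, observed_mod_indicator):
    """Hypothetical stand-in for illustration; not the imml implementation."""
    indicator = np.asarray(observed_mod_indicator, dtype=bool)
    masked = []
    for mod_idx, X in enumerate(Xs):
        X_masked = np.array(X, dtype=float)  # copy, so the input stays untouched
        # Blank out every sample whose modality `mod_idx` is flagged as unobserved.
        X_masked[~indicator[:, mod_idx], :] = np.nan
        masked.append(X_masked)
    return masked

rng = np.random.default_rng(0)
Xs = [rng.random((4, 3)), rng.random((4, 2))]

# Rows are samples, columns are modalities; 1 means observed, 0 means missing.
indicator = np.array([[1, 1], [0, 1], [1, 0], [1, 1]])
masked_Xs = mask_modalities(Xs, indicator)
print(masked_Xs[0])
print(masked_Xs[1])
```

The integer indicator is cast to boolean here, which mirrors the 0/1 arrays produced by choice(2, ...) in the examples above.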