Feature selection
JNMF Feature Selector
- class imml.feature_selection.JNMFFeatureSelector(select_by: str = 'component', f_per_component: int = 1, **kwargs)
Bases: JNMF
Feature selection for multi-modal datasets using the Joint Non-negative Matrix Factorization (JNMF) method. [1] [2] [3] [4] [5] [6] [7] [8]
This class extends the JNMF method to perform feature selection across multiple modalities (blocks) of data. The selected features are those with the highest contributions to the components derived by JNMF. Selection can be based on the largest contribution for each component, the maximum overall contribution, or the average contribution across all components.
- Parameters:
select_by (str, default="component") --
Criterion used to select features. Must be one of ["component", "max", "average"]:
"component": Selects the features with the largest contribution for each component.
"max": Selects the features with the largest overall contribution.
"average": Selects the features with the highest average contribution across all components.
f_per_component (int, default=1) --
Number of features to select per component.
If select_by="component", this controls how many features are selected for each component.
If select_by="max", the top n_components * f_per_component features across all components are selected.
If select_by="average", the n_components * f_per_component features with the highest average contribution across all components are selected.
kwargs (dict) -- Arguments passed to the JNMF method.
- selected_features_
List of selected features.
- weights_
The importance (contribution) scores of the selected features, in absolute value. These scores reflect how strongly each feature contributes to the components derived from JNMF.
References
Example
>>> import numpy as np
>>> import pandas as pd
>>> from imml.feature_selection import JNMFFeatureSelector
>>> Xs = [pd.DataFrame(np.random.default_rng(42).uniform(size=(20, 10))) for i in range(3)]
>>> transformer = JNMFFeatureSelector(n_components=5)
>>> transformed_Xs = transformer.fit_transform(Xs)
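The three select_by criteria described above can be illustrated on a plain weight matrix. The following is a minimal sketch, not imml's implementation: the function select_features and the matrix H (one row per component, one column per feature) are hypothetical names introduced only for this illustration, and the tie-breaking and duplicate handling are assumptions.

```python
import numpy as np


def select_features(H, select_by="component", f_per_component=1):
    """Illustrative only: pick feature indices from a hypothetical
    (n_components, n_features) weight matrix H."""
    H = np.abs(np.asarray(H, dtype=float))  # contributions in absolute value
    n_components = H.shape[0]
    k = n_components * f_per_component
    if select_by == "component":
        # Top f_per_component features of every component (row).
        # The same feature may appear for more than one component.
        order = np.argsort(-H, axis=1, kind="stable")
        return order[:, :f_per_component].ravel()
    if select_by == "max":
        # Rank features by their largest contribution to any component.
        return np.argsort(-H.max(axis=0), kind="stable")[:k]
    if select_by == "average":
        # Rank features by their mean contribution over all components.
        return np.argsort(-H.mean(axis=0), kind="stable")[:k]
    raise ValueError("select_by must be 'component', 'max' or 'average'")


# Two components, four features (made-up numbers).
H = np.array([[0.9, 0.1, 0.5, 0.0],
              [0.1, 0.8, 0.6, 0.0]])
by_component = select_features(H, "component").tolist()  # -> [0, 1]
by_max = select_features(H, "max").tolist()              # -> [0, 1]
by_average = select_features(H, "average").tolist()      # -> [2, 0]
```

In this toy matrix, feature 2 is never the strongest feature of a component, yet it has the highest mean weight, so only select_by="average" picks it.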
- fit(Xs, y=None)
Fit the transformer to the input data.
- Parameters:
Xs (list of array-like objects) --
Xs length: n_mods
Xs[i] shape: (n_samples, n_features_i)
A list of different modalities.
y (Ignored) -- Not used, present here for API consistency by convention.
- Returns:
self -- The fitted instance.
- Return type:
JNMFFeatureSelector
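As an illustration of the input layout documented for fit, the sketch below builds a list of modalities that share the same samples but differ in their number of features. The helper check_modalities is hypothetical (it is not part of imml) and only encodes the shape convention (n_samples, n_features_i) stated above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 20

# Three made-up modalities with 10, 15 and 8 features respectively.
Xs = [rng.uniform(size=(n_samples, n_features)) for n_features in (10, 15, 8)]


def check_modalities(Xs):
    """Hypothetical helper: every modality must be 2-D and share n_samples."""
    shapes = [np.asarray(X).shape for X in Xs]
    if any(len(shape) != 2 for shape in shapes):
        raise ValueError("each modality must have shape (n_samples, n_features_i)")
    if len({shape[0] for shape in shapes}) != 1:
        raise ValueError(f"modalities disagree on n_samples: {shapes}")
    return shapes


shapes = check_modalities(Xs)  # one (n_samples, n_features_i) tuple per modality
```

A list built this way has length n_mods = 3 and could be passed as Xs.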
- fit_transform(Xs, y=None, **fit_params)
Fit to data, then transform it.
- Parameters:
Xs (list of array-like objects) --
Xs length: n_mods
Xs[i] shape: (n_samples, n_features_i)
A list of different modalities.
y (Ignored) -- Not used, present here for API consistency by convention.
fit_params (Ignored) -- Not used, present here for API consistency by convention.
- Returns:
transformed_X -- The projected data.
- Return type:
array-like of shape (n_samples, n_components)
- set_fit_request(*, Xs: bool | None | str = '$UNCHANGED$') -> JNMFFeatureSelector
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- set_transform_request(*, Xs: bool | None | str = '$UNCHANGED$') -> JNMFFeatureSelector
Configure whether metadata should be requested to be passed to the transform method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.