Model selection
Multi-Modal Splitter
- imml.model_selection.MMSplitter(splitter, return_type: str = 'all')
Generic bridge between scikit-learn splitters and multi-modal inputs.
This helper wraps any scikit-learn splitter (such as StratifiedKFold) and yields multi-modal splits. The splitter computes a single set of train/test indices per split, and those indices are applied to every modality, so the partitions stay aligned across all modalities.
- Parameters:
splitter (object) -- Any object implementing scikit-learn's splitter interface, for example KFold, StratifiedKFold, GroupKFold or ShuffleSplit.
return_type (str, default="split") -- Controls what each yielded item contains: "split" returns the actual partition sets, while "indices" returns the indices of the partition sets.
Example
>>> import numpy as np
>>> from sklearn.model_selection import StratifiedKFold
>>> from imml.model_selection import MMSplitter
>>> Xs = [np.random.rand(100, 10), np.random.rand(100, 20)]
>>> y = np.random.randint(0, 2, 100)
>>> splitter = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
>>> for Xs_train, Xs_test, y_train, y_test in MMSplitter(splitter=splitter).split(Xs, y):
...     pass
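To make the alignment guarantee concrete, the sketch below reproduces the idea by hand with scikit-learn and NumPy only. It is an illustration of the principle described above (one set of indices per split, reused for every modality), not imml's actual implementation; the `folds` list and the choice of the first modality as the reference array are assumptions made for this example.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
# Two modalities with the same 100 samples but different feature counts.
Xs = [rng.random((100, 10)), rng.random((100, 20))]
y = rng.integers(0, 2, 100)

splitter = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

folds = []  # hypothetical container, used here only to inspect the result
# The splitter only needs an array with the right number of rows, so the
# first modality serves as the reference; y drives the stratification.
for train_idx, test_idx in splitter.split(Xs[0], y):
    # Reuse the same indices for every modality to keep rows aligned.
    Xs_train = [X[train_idx] for X in Xs]
    Xs_test = [X[test_idx] for X in Xs]
    y_train, y_test = y[train_idx], y[test_idx]
    folds.append((Xs_train, Xs_test, y_train, y_test))
```

Because the indices are computed once per fold, every modality in `Xs_train` has the same number of rows as `y_train`, and the same holds for the test side.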
Train-Test Multi-Modal Split
- imml.model_selection.train_test_mm_split(Xs, y=None, **kwargs)
Split multi-modal datasets and labels into train and test sets.
Similar to scikit-learn's train_test_split, but works with lists of arrays/data (Xs) and single arrays (y). Every X in Xs is split with the same train/test indices.
- Parameters:
- Returns:
Splitting results, in the same order as the inputs:
- For each list input (Xs): (list_train, list_test)
- For each array input (y): (array_train, array_test)
- Return type:
Example
>>> import numpy as np
>>> from imml.model_selection import train_test_mm_split
>>> Xs = [np.random.rand(100, 10), np.random.rand(100, 20)]
>>> y = np.random.randint(0, 2, 100)
>>> Xs_train, Xs_test, y_train, y_test = train_test_mm_split(Xs, y, train_size=0.7, random_state=42)
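For comparison, a similar aligned split can be obtained with plain scikit-learn by passing every modality and y to a single train_test_split call, which applies the same indices to all arrays it receives. The sketch below shows that approach and regroups the flat output into the list-based layout described above; it is not how train_test_mm_split is implemented, and the `parts` variable is a name chosen for this example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
Xs = [rng.random((100, 10)), rng.random((100, 20))]
y = rng.integers(0, 2, 100)

# One call for all arrays: scikit-learn draws the indices once and
# applies them to every positional argument.
parts = train_test_split(*Xs, y, train_size=0.7, random_state=42)

# The result is flat: [X0_train, X0_test, X1_train, X1_test, y_train, y_test].
# Regroup the modality entries into train and test lists.
Xs_train = parts[0:2 * len(Xs):2]
Xs_test = parts[1:2 * len(Xs):2]
y_train, y_test = parts[-2], parts[-1]
```

The list-based interface saves this regrouping step, which becomes tedious once more than two or three modalities are involved.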