iMML

Model selection

Multi-Modal Splitter

class imml.model_selection.MMSplitter(splitter, return_type: str = 'sets')

Generic bridge between scikit-learn splitters and multi-modal inputs.

This helper receives any scikit-learn splitter (such as StratifiedKFold) and yields splits. A single set of train/test indices is computed by the splitter and applied to every modality, guaranteeing aligned partitions across all modalities.

Parameters:

splitter (object) -- Any object implementing scikit-learn's splitter interface, for example KFold, StratifiedKFold, GroupKFold or ShuffleSplit.
return_type (str, default="sets") -- Controls what each yielded item contains: "sets" returns the actual partition sets, while "indices" returns the indices of the partition sets.

Example

>>> import numpy as np
>>> from sklearn.model_selection import StratifiedKFold
>>> from imml.model_selection import MMSplitter
>>> Xs = [np.random.rand(100, 10), np.random.rand(100, 20)]
>>> y = np.random.randint(0, 2, 100)
>>> splitter = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
>>> for Xs_train, Xs_test, y_train, y_test in MMSplitter(splitter=splitter).split(Xs, y):
...     pass
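
The alignment guarantee described above can be illustrated with scikit-learn and NumPy alone. The following is a minimal sketch of the idea, not iMML's actual implementation: it assumes that one set of train/test indices is computed once per fold and then used to slice every modality. The variable names (`folds`, `train_idx`, `test_idx`) are illustrative only.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
# Two modalities sharing the same 100 samples but with different feature counts.
Xs = [rng.random((100, 10)), rng.random((100, 20))]
y = np.tile([0, 1], 50)  # balanced binary labels

splitter = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

folds = []
# The splitter only needs the number of samples and y,
# so the first modality stands in for all of them.
for train_idx, test_idx in splitter.split(Xs[0], y):
    # Apply the same indices to every modality to keep rows aligned.
    Xs_train = [X[train_idx] for X in Xs]
    Xs_test = [X[test_idx] for X in Xs]
    folds.append((Xs_train, Xs_test, y[train_idx], y[test_idx]))
```

Under this assumption, the index arrays correspond to what return_type="indices" exposes, and the sliced lists correspond to return_type="sets".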

get_n_splits(X=None, y=None, groups=None)

Return the number of splitting iterations, as set by the n_splits parameter.

Parameters:

X (Always ignored, exists for API compatibility.)
y (Always ignored, exists for API compatibility.)
groups (Always ignored, exists for API compatibility.)

Returns:

n_splits -- The number of splitting iterations.

Return type:

int
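
For reference, this is how the count looks on plain scikit-learn splitters. The sketch assumes that MMSplitter reports the value of the splitter it wraps; the source only states that the number is set by n_splits, so treat the delegation as an assumption.

```python
from sklearn.model_selection import KFold, StratifiedKFold

# Assumption: MMSplitter.get_n_splits mirrors the wrapped splitter's value.
kfold = KFold(n_splits=3)
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# The arguments are optional because these splitters ignore them.
n_kfold = kfold.get_n_splits()
n_skf = skf.get_n_splits()
print(n_kfold, n_skf)
```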

split(Xs, y=None, groups=None)

Generate train/test splits of the data, as partition sets or as indices depending on return_type.

Parameters:

Xs (list of array-like) -- A list of different modalities.
- Xs length: n_mods
- Xs[i] shape: (n_samples, n_features_i)
y (array-like of shape (n_samples,), optional) -- Target vector relative to Xs.
groups (array-like, optional) -- Group labels passed to splitter.split(...).

Returns:

One tuple per split according to return_type.

Return type:

tuple
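
The role of groups can be sketched with a group-aware scikit-learn splitter. This is an illustration of the concept under the assumption that the group labels are forwarded unchanged to the wrapped splitter, as the parameter description suggests; it does not call iMML, and the names `group_splits`, `train_idx` and `test_idx` are hypothetical.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)
Xs = [rng.random((12, 3)), rng.random((12, 5))]
y = np.tile([0, 1], 6)
groups = np.repeat(np.arange(4), 3)  # 4 groups of 3 samples each

group_splits = []
for train_idx, test_idx in GroupKFold(n_splits=2).split(Xs[0], y, groups=groups):
    # The same indices slice every modality, so groups stay intact in all of them.
    Xs_train = [X[train_idx] for X in Xs]
    Xs_test = [X[test_idx] for X in Xs]
    group_splits.append(
        (set(groups[train_idx]), set(groups[test_idx]), Xs_train, Xs_test)
    )
```

No group appears on both sides of a split, and that property carries over to every modality because they share the index arrays.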

Train-Test Multi-Modal Split

imml.model_selection.train_test_mm_split(Xs, y=None, **kwargs)

Split multi-modal datasets and labels into train and test sets.

Similar to sklearn's train_test_split, but works with lists of arrays/data (Xs) and single arrays (y). Ensures that every modality in Xs receives the same train/test split indices.

Parameters:

Xs (list of array-like) -- A list of different modalities.
- Xs length: n_mods
- Xs[i] shape: (n_samples, n_features_i)
y (array-like of shape (n_samples,), optional) -- Target vector relative to Xs.
**kwargs (dict) -- Additional keyword arguments to pass to sklearn's train_test_split.

Returns:

Splitting results in the same order as the inputs:
- For each list input (Xs): (list_train, list_test)
- For each array input (y): (array_train, array_test)

Return type:

tuple

Example

>>> import numpy as np
>>> from imml.model_selection import train_test_mm_split
>>> Xs = [np.random.rand(100, 10), np.random.rand(100, 20)]
>>> y = np.random.randint(0, 2, 100)
>>> Xs_train, Xs_test, y_train, y_test = train_test_mm_split(Xs, y, train_size=0.7, random_state=42)
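
The shared-index behavior can be reproduced with scikit-learn alone by splitting an index array and then slicing each modality. This is a rough equivalent written under that assumption, not the library's actual code; `indices`, `train_idx` and `test_idx` are illustrative names.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
Xs = [rng.random((100, 10)), rng.random((100, 20))]
y = np.tile([0, 1], 50)

# Split the sample indices once, then reuse them for every modality and for y.
indices = np.arange(len(y))
train_idx, test_idx = train_test_split(indices, train_size=0.7, random_state=42)

Xs_train = [X[train_idx] for X in Xs]
Xs_test = [X[test_idx] for X in Xs]
y_train, y_test = y[train_idx], y[test_idx]
```

Splitting indices rather than the arrays themselves keeps row i of every modality on the same side of the split.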
