Visualize

Plot combinations

class imml.visualize.plot_combinations(Xs: list, mod_names: list = None, figsize: tuple = None, max_combs: int = 10)

Plot the number of samples per modality combination.

This function summarizes how many samples are present in each intersection of modalities (i.e., samples that are available simultaneously across two or more modalities). The resulting figure is similar to an UpSet plot, but it displays the exact counts of intersections.

Parameters:
  • Xs (list of array-like objects) --

    • Xs length: n_mods

    • Xs[i] shape: (n_samples, n_features_i)

    A list of different modalities.

  • mod_names (list of str, default=None) -- Names of each modality (length must match len(Xs)). If None, modality names default to indices: ["0", "1", ...].

  • figsize (tuple, default=None) -- Figure size in inches passed to matplotlib.pyplot.subplots.

  • max_combs (int, default=10) -- Maximum number of intersections to display. If fewer intersections are available, all will be shown.

Returns:

  • fig (matplotlib.figure.Figure) -- The created matplotlib Figure.

  • axes (numpy.ndarray of matplotlib.axes.Axes) -- 2 x 2 array of the Axes that make up the figure.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from imml.visualize import plot_combinations
>>> from imml.ampute import Amputer
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for _ in range(3)]
>>> Xs = Amputer(p=0.3, random_state=42).fit_transform(Xs)
>>> fig, axes = plot_combinations(Xs=Xs, mod_names=['RNA', 'Protein', 'Metabolite'], max_combs=8)
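
The counting described above can be sketched with NumPy alone. This is an illustration, not the library's implementation: it assumes that a sample missing from a modality is stored as a row made entirely of NaN, and it counts each sample once, under the exact set of modalities it has. The names observed and combo_counts are hypothetical.

```python
from collections import Counter

import numpy as np

# Toy data: 6 samples, 3 modalities.
# Assumption: a missing modality is an all-NaN row for that sample.
rng = np.random.default_rng(0)
Xs = [rng.random((6, 4)) for _ in range(3)]
Xs[0][[0, 1]] = np.nan  # samples 0 and 1 lack modality 0
Xs[2][[1, 5]] = np.nan  # samples 1 and 5 lack modality 2
mod_names = ["RNA", "Protein", "Metabolite"]

# observed[i, j] is True when sample i has at least one non-NaN value in modality j.
observed = np.column_stack([~np.isnan(X).all(axis=1) for X in Xs])

# Count samples per combination of available modalities.
combo_counts = Counter(
    tuple(name for name, present in zip(mod_names, row) if present)
    for row in observed
)

# Print the combinations from most to least frequent.
for combo, count in combo_counts.most_common():
    print(" & ".join(combo), count)
```

Truncating the output of most_common to the first few entries mirrors the role of max_combs.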

Plot missing modality

class imml.visualize.plot_missing_modality(Xs, ax: Axes = None, figsize: tuple = None, sort: bool = True)

Plot missing modalities. Missing modalities appear in white, while available modalities appear in black.

Parameters:
  • Xs (list of array-like objects) --

    • Xs length: n_mods

    • Xs[i] shape: (n_samples, n_features_i)

    A list of different modalities.

  • ax (matplotlib.axes.Axes, default=None) -- Axes where to draw the figure.

  • figsize (tuple, default=None) -- Figure size (tuple) in inches.

  • sort (bool, default=True) -- If True, samples will be sorted based on their available modalities.

Returns:

  • fig (matplotlib.figure.Figure) -- Figure object.

  • ax (matplotlib.axes.Axes) -- Axes object.

Example

>>> import numpy as np
>>> import pandas as pd
>>> from imml.ampute import Amputer
>>> from imml.visualize import plot_missing_modality
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for i in range(3)]
>>> transformer = Amputer(p=0.2, random_state=42)
>>> Xs = transformer.fit_transform(Xs)
>>> plot_missing_modality(Xs=Xs)
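
The matrix behind such a figure can be sketched as follows. This is a minimal illustration under the assumption that a missing modality is an all-NaN row; the sort key used here (lexicographic by modality) is one plausible choice and may differ from what sort=True does in the library. The names available, order and sorted_available are hypothetical.

```python
import numpy as np

# Toy data: 5 samples, 2 modalities.
# Assumption: a missing modality is an all-NaN row for that sample.
rng = np.random.default_rng(0)
Xs = [rng.random((5, 3)) for _ in range(2)]
Xs[0][[1, 3]] = np.nan  # samples 1 and 3 lack modality 0
Xs[1][[4]] = np.nan     # sample 4 lacks modality 1

# available[i, j] is True when sample i is observed in modality j.
available = np.column_stack([~np.isnan(X).all(axis=1) for X in Xs])

# Group samples that share an availability pattern.
# np.lexsort treats the LAST key as the primary one, so the modality rows
# are reversed to make modality 0 the primary sort key.
order = np.lexsort(available.astype(int).T[::-1])
sorted_available = available[order]

print(order)
print(sorted_available)
```

Each row of sorted_available corresponds to one sample and each column to one modality, which is the information the plot encodes as black and white cells.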

Plot PID

class imml.visualize.plot_pid(rus=None, Xs=None, y=None, mod_names: list = ['Modality A', 'Modality B'], colors: list = ['#780000', '#669BBC', '#FDF0D5'], abb: bool = True, figsize: tuple = None, **kwargs)

Plot PID statistics (redundancy, uniqueness and synergy) of a multi-modal dataset as a Venn diagram.

Parameters:
  • rus (list or dict, default=None) -- The output of the pid function.

  • Xs (list of array-like objects, default=None) --

    • Xs length: n_mods

    • Xs[i] shape: (n_samples, n_features_i)

    A list of different modalities. If rus is provided, it will not be used.

  • y (array-like of shape (n_samples,), default=None) -- Target vector relative to Xs. If rus is provided, it will not be used.

  • mod_names (list, default=["Modality A", "Modality B"]) -- Name of each modality.

  • colors (list, default=["#780000", "#669BBC", "#FDF0D5"]) -- Colors used for the regions.

  • abb (bool, default=True) -- Whether to use abbreviations (S, U1, U2 and R) for "Synergy", "Uniqueness1", "Uniqueness2" and "Redundancy", respectively.

  • figsize (tuple, default=None) -- Figure size (tuple) in inches.

  • **kwargs (dict, default=None) -- Additional keyword arguments are passed to the pid function.

Returns:

  • fig (matplotlib.figure.Figure) -- Figure object.

  • ax (matplotlib.axes.Axes) -- Axes object.

See also

pid

Example

>>> import numpy as np
>>> import pandas as pd
>>> from imml.visualize import plot_pid
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for i in range(3)]
>>> y = pd.Series(np.random.default_rng(42).uniform(low=0, high=2, size=len(Xs[0])))
>>> plot_pid(Xs=Xs, y=y, **{"random_state": 42})

Plot summary

class imml.visualize.plot_summary(Xs: list = None, summary: DataFrame = None, mod_names: list = None, figsize: tuple = None, title: str = None, xlabel: str = None, ylabel: str = 'Count')

Plot a bar chart summarizing completeness across modalities in a multi-modal dataset.

Parameters:
  • Xs (list of array-like objects, default=None) --

    • Xs length: n_mods

    • Xs[i] shape: (n_samples, n_features_i)

    A list of different modalities. Only used when summary is not provided.

  • summary (pd.DataFrame, default=None) -- A summary dataframe as returned by imml.explore.get_summary. If provided, it will be plotted directly. If None, the summary will be computed from Xs.

  • mod_names (list, default=None) -- Names of each modality to use when computing the summary from Xs. If None, it will default to the modality index.

  • figsize (tuple, default=None) -- Figure size in inches passed to pd.DataFrame.plot.

  • title (str, default="Summary of the multi-modal dataset") -- Title of the plot.

  • xlabel (str, default="Samples") -- Label for the x-axis.

  • ylabel (str, default="Count") -- Label for the y-axis.

Returns:

The matplotlib Axes containing the bar plot.

Return type:

matplotlib.axes.Axes

Example

>>> import numpy as np
>>> import pandas as pd
>>> from imml.visualize import plot_summary
>>> from imml.ampute import Amputer
>>> Xs = [pd.DataFrame(np.random.default_rng(42).random((20, 10))) for i in range(3)]
>>> Xs = Amputer(p=0.3, random_state=42).fit_transform(Xs)
>>> plot_summary(Xs=Xs)