piaso.plotting package

piaso.plotting.plot_embeddings_split(adata, color, splitby, ncol: int = None, dpi: int = 80, col_size: int = 5, row_size: int = 5, vmax: float = None, vmin: float = None, show_figure: bool = True, save: bool = None, layer: str = None, basis: str = 'X_umap', fix_coordinate_ratio: bool = True, show_axis_ticks: bool = False, margin_ratio: float = 0.05, legend_fontsize: int = 10, legend_fontoutline: int = 2, legend_loc: str = 'right margin', legend_marker_size: float = 1.6, x_min=None, x_max=None, y_min=None, y_max=None, **kwargs)

Plot cell embeddings side by side based on a categorical variable.

The plots are split by a specified categorical variable, with each unique category producing a separate subplot. Data points in each subplot are colored according to the color variable.

Parameters:
  • adata (AnnData) – An AnnData object.

  • color (str) – Used to specify a gene name to plot, or a key in adata.obs used to assign colors to the cells in the embedding plot.

  • splitby (str) – Key in adata.obs used to split the dataset into multiple panels. Each unique value under this key will result in a separate subplot.

  • ncol (int or None, optional (default: None)) – If specified, defines the number of columns per row. If None, the number of columns is computed as the ceiling of n divided by the integer square root of n, where n is the number of unique categories in splitby.

  • dpi (int, optional (default: 80)) – Dots per inch (DPI) setting for the figure.

  • col_size (int, optional (default=5)) – Width (in inches) of each subplot column.

  • row_size (int, optional (default=5)) – Height (in inches) of each subplot row.

  • vmax (float or None, optional (default=None)) – Maximum value for the color scale. If not provided, the upper limit is determined automatically.

  • vmin (float or None, optional (default=None)) – Minimum value for the color scale. If not provided, the lower limit is determined automatically.

  • show_figure (bool, optional (default=True)) – Whether to display the figure after plotting.

  • save (str or None, optional (default=None)) – File path to save the resulting figure. If None, the figure will not be saved.

  • layer (str or None, optional (default=None)) – If specified, the name of the layer in adata.layers from which to obtain the gene expression values.

  • basis (str, optional (default='X_umap')) – Key in adata.obsm that contains the embedding coordinates (e.g., X_umap or X_pca).

  • fix_coordinate_ratio (bool, optional (default=True)) – If True, the aspect ratio of each subplot is fixed so that the x- and y-axes are scaled equally.

  • show_axis_ticks (bool, optional (default=False)) – Whether to display axis ticks and tick labels on the plots.

  • margin_ratio (float, optional (default=0.05)) – Margin ratio for both the x-axis and y-axis limits, relative to the range of the data. This provides additional spacing around the plotted points.

  • legend_fontsize (int, optional (default=10)) – Font size of the legend text in pt.

  • legend_fontoutline (int, optional (default=2)) – Line width of the legend font outline in pt.

  • legend_loc (str, optional (default='right margin')) – Location of the legend.

  • legend_marker_size (float, optional (default=1.6)) – Scaling factor for legend markers (dot size).

  • x_min (float or None, optional (default=None)) – Minimum limit for the x-axis. If None, the limit is computed automatically based on the data.

  • x_max (float or None, optional (default=None)) – Maximum limit for the x-axis. If None, the limit is computed automatically based on the data.

  • y_min (float or None, optional (default=None)) – Minimum limit for the y-axis. If None, the limit is computed automatically based on the data.

  • y_max (float or None, optional (default=None)) – Maximum limit for the y-axis. If None, the limit is computed automatically based on the data.

  • **kwargs (dict) – Additional keyword arguments passed to the scanpy.pl.embedding function.

Return type:

None.
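
The default column count described for ncol can be worked through with plain integer arithmetic. The following is a minimal sketch, assuming n is the number of panels; the helper names default_ncol and default_nrow are hypothetical, and the row count is an assumption derived from the column count rather than something stated in the docstring.

```python
import math


def default_ncol(n_panels: int) -> int:
    """Columns per row when ncol is None: ceil(n / isqrt(n))."""
    if n_panels < 1:
        raise ValueError("n_panels must be at least 1")
    return math.ceil(n_panels / math.isqrt(n_panels))


def default_nrow(n_panels: int, ncol: int) -> int:
    """Rows needed to hold all panels (assumed, not taken from piaso)."""
    return math.ceil(n_panels / ncol)


for n in (1, 3, 4, 7, 10):
    ncol = default_ncol(n)
    print(f"n={n}: ncol={ncol}, nrow={default_nrow(n, ncol)}")
```

With seven categories, for example, the integer square root is 2, so the grid has ceil(7 / 2) = 4 columns.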

Examples

>>> import scanpy as sc
>>> import piaso
>>> adata = sc.datasets.pbmc3k_processed()  # Example dataset with 'louvain' labels and a UMAP embedding
>>> # Plot embeddings colored by a gene expression value and split by clusters
>>> piaso.pl.plot_embeddings_split(adata, color='CST3', splitby='louvain', col_size=6, row_size=6)
>>> # Save the figure to a file
>>> piaso.pl.plot_embeddings_split(adata, color='CST3', splitby='louvain', save='./CST3_embeddingsSplit.pdf')
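
The interaction between margin_ratio and the explicit x_min/x_max/y_min/y_max limits can be illustrated as follows. This is a sketch under assumptions, not the library's code: it assumes the padding on each side equals margin_ratio times the data range and that an explicit limit replaces the computed one. The helper name padded_limits is hypothetical.

```python
def padded_limits(values, margin_ratio=0.05, lower=None, upper=None):
    """Return (low, high) axis limits padded by a fraction of the data range.

    Explicit lower/upper values, when given, take precedence over the
    automatically computed ones (assumed behavior).
    """
    lo, hi = min(values), max(values)
    pad = (hi - lo) * margin_ratio
    low = lo - pad if lower is None else lower
    high = hi + pad if upper is None else upper
    return low, high


x = [-4.0, 0.5, 6.0]                   # stand-in for one embedding coordinate
print(padded_limits(x))                # automatic limits with a 5% margin
print(padded_limits(x, lower=-10.0))   # explicit lower limit, automatic upper
```

Using the same limits in every panel keeps the subplots directly comparable, which is presumably why the limits can be pinned by hand.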

piaso.plotting.plotEmbeddingsSplit(adata, color, splitby, ncol: int = None, dpi: int = 80, col_size: int = 5, row_size: int = 5, vmax: float = None, vmin: float = None, show_figure: bool = True, save: bool = None, layer: str = None, basis: str = 'X_umap', fix_coordinate_ratio: bool = True, show_axis_ticks: bool = False, margin_ratio: float = 0.05, legend_fontsize: int = 10, legend_fontoutline: int = 2, legend_loc: str = 'right margin', legend_marker_size: float = 1.6, x_min=None, x_max=None, y_min=None, y_max=None, **kwargs)

Plot cell embeddings side by side based on a categorical variable.

The plots are split by a specified categorical variable, with each unique category producing a separate subplot. Data points in each subplot are colored according to the color variable.

Parameters:
  • adata (AnnData) – An AnnData object.

  • color (str) – Used to specify a gene name to plot, or a key in adata.obs used to assign colors to the cells in the embedding plot.

  • splitby (str) – Key in adata.obs used to split the dataset into multiple panels. Each unique value under this key will result in a separate subplot.

  • ncol (int or None, optional (default: None)) – If specified, defines the number of columns per row. If None, the number of columns is computed as the ceiling of n divided by the integer square root of n, where n is the number of unique categories in splitby.

  • dpi (int, optional (default: 80)) – Dots per inch (DPI) setting for the figure.

  • col_size (int, optional (default=5)) – Width (in inches) of each subplot column.

  • row_size (int, optional (default=5)) – Height (in inches) of each subplot row.

  • vmax (float or None, optional (default=None)) – Maximum value for the color scale. If not provided, the upper limit is determined automatically.

  • vmin (float or None, optional (default=None)) – Minimum value for the color scale. If not provided, the lower limit is determined automatically.

  • show_figure (bool, optional (default=True)) – Whether to display the figure after plotting.

  • save (str or None, optional (default=None)) – File path to save the resulting figure. If None, the figure will not be saved.

  • layer (str or None, optional (default=None)) – If specified, the name of the layer in adata.layers from which to obtain the gene expression values.

  • basis (str, optional (default='X_umap')) – Key in adata.obsm that contains the embedding coordinates (e.g., X_umap or X_pca).

  • fix_coordinate_ratio (bool, optional (default=True)) – If True, the aspect ratio of each subplot is fixed so that the x- and y-axes are scaled equally.

  • show_axis_ticks (bool, optional (default=False)) – Whether to display axis ticks and tick labels on the plots.

  • margin_ratio (float, optional (default=0.05)) – Margin ratio for both the x-axis and y-axis limits, relative to the range of the data. This provides additional spacing around the plotted points.

  • legend_fontsize (int, optional (default=10)) – Font size of the legend text in pt.

  • legend_fontoutline (int, optional (default=2)) – Line width of the legend font outline in pt.

  • legend_loc (str, optional (default='right margin')) – Location of the legend.

  • legend_marker_size (float, optional (default=1.6)) – Scaling factor for legend markers (dot size).

  • x_min (float or None, optional (default=None)) – Minimum limit for the x-axis. If None, the limit is computed automatically based on the data.

  • x_max (float or None, optional (default=None)) – Maximum limit for the x-axis. If None, the limit is computed automatically based on the data.

  • y_min (float or None, optional (default=None)) – Minimum limit for the y-axis. If None, the limit is computed automatically based on the data.

  • y_max (float or None, optional (default=None)) – Maximum limit for the y-axis. If None, the limit is computed automatically based on the data.

  • **kwargs (dict) – Additional keyword arguments passed to the scanpy.pl.embedding function.

Return type:

None.

Examples

>>> import scanpy as sc
>>> import piaso
>>> adata = sc.datasets.pbmc3k_processed()  # Example dataset with 'louvain' labels and a UMAP embedding
>>> # Plot embeddings colored by a gene expression value and split by clusters
>>> piaso.pl.plot_embeddings_split(adata, color='CST3', splitby='louvain', col_size=6, row_size=6)
>>> # Save the figure to a file
>>> piaso.pl.plot_embeddings_split(adata, color='CST3', splitby='louvain', save='./CST3_embeddingsSplit.pdf')

piaso.plotting.plot_features_violin(adata, feature_list, groupby: str | None = None, use_raw: bool | None = None, layer: str | None = None, width_single: float = 14.0, height_single: float = 2.0, size: float = 0.1, show_grid: bool = True, show_figure: bool = True, save: str | None = None)

Plots a violin plot for each feature specified in feature_list using the Scanpy library (sc.pl.violin).

Parameters:
  • adata (anndata.AnnData) – The annotated data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes.

  • feature_list (List[str]) – A list of strings denoting the feature names (gene names and cell metrics, e.g., number of genes detected and doublet score) to be visualized.

  • groupby (str, optional) – A key in the observation DataFrame (adata.obs) used to group data points in the violin plot. Default is None, which means no grouping is applied.

  • use_raw (bool, optional) – A boolean indicating whether to use the raw attribute of adata. If True, uses raw data if available.

  • layer (str, optional) – A key from the layers of adata. If provided, the specified layer is used for visualization.

  • width_single (float, optional) – The width of each subplot. Default is 14.0.

  • height_single (float, optional) – The height of each subplot. Default is 2.0.

  • size (float, optional) – The size of the jitter points in the violin plot. Default is 0.1.

  • show_grid (bool, optional) – Whether to display grid lines in the plots. Default is True (grid lines shown).

  • show_figure (bool, optional) – Whether to show the figure (plt.show()). Default is True.

  • save (str, optional) – If provided, the path where the plot should be saved, e.g., ./violin_plot_by_piaso.pdf. If None, the plot is not saved to a file.

Returns:

This function does not return any value but visualizes the violin plots and optionally saves the figure.

Return type:

None

piaso.plotting.plotLigandReceptorInteraction(interactions_df: DataFrame, specificity_df: DataFrame, cell_type_pairs: list, top_n: int = 50, y_max: int = 10, cell_type_sep: str = '@', ligand_receptor_sep: str = '-->', heatmap_height_ratio: float = 1.5, heatmap_cmap: str = 'Purples', heatmap_cmap_ligand: str = None, heatmap_cmap_receptor: str = None, shared_legend: bool = False, heatmap_vmax: float = None, save_path: str = None, fig_width: int = 24, fig_height_per_pair: int = 9, col_interaction_score: str = 'interaction_score', col_ligand_receptor_pair: str = 'ligandXreceptor', col_cell_type_pair: str = 'CellTypeXCellType', col_annotation: str = 'annotation', col_ligand: str = 'ligand', col_receptor: str = 'receptor', vertical_layout: bool = False, color_labels_by_annotation: bool = False, barplot_palette: str = 'Paired', sort_by_category: bool = False, category_agg_method: str = 'sum', preserve_input_order: bool = False)

Generates plots with a bar plot of top interactions and a heatmap showing ligand and receptor specificity, with an option for vertical orientation.

Parameters:
  • interactions_df (pd.DataFrame) – DataFrame with interaction scores.

  • specificity_df (pd.DataFrame) – DataFrame with gene specificity scores.

  • cell_type_pairs (list) – A list of ‘CellTypeXCellType’ strings to plot.

  • top_n (int) – The number of top interactions to display.

  • y_max (int) – The maximum y-axis value for the bar plot.

  • cell_type_sep (str) – The separator for sender/receiver cell types.

  • ligand_receptor_sep (str) – The separator for ligand/receptor genes.

  • heatmap_height_ratio (float) – The height/width ratio of the heatmap relative to the bar plot.

  • heatmap_cmap (str) – The colormap for the specificity heatmap (used when ligand/receptor cmaps not specified).

  • heatmap_cmap_ligand (str) – The colormap for ligand specificity. If None, uses heatmap_cmap.

  • heatmap_cmap_receptor (str) – The colormap for receptor specificity. If None, uses heatmap_cmap.

  • shared_legend (bool) – If True, a single legend/colorbar is shown for all plots.

  • heatmap_vmax (float) – The maximum value for the heatmap color scale.

  • save_path (str, optional) – Path to save the figure (e.g., ‘plot.pdf’).

  • fig_width (int) – For horizontal layout, the total figure width. For vertical, this controls the total figure HEIGHT.

  • fig_height_per_pair (int) – For horizontal layout, height per subplot. For vertical, this controls the WIDTH of each subplot group.

  • col_interaction_score (str) – Column name for interaction scores.

  • col_ligand_receptor_pair (str) – Column name for ligand-receptor pair strings.

  • col_cell_type_pair (str) – Column name for cell type pair strings.

  • col_annotation (str) – Column name for pathway/annotation data.

  • col_ligand (str) – Column name for ligand after splitting.

  • col_receptor (str) – Column name for receptor after splitting.

  • vertical_layout (bool) – If True, each plot is rotated 90 degrees and the plots for the cell type pairs are arranged side by side.

  • color_labels_by_annotation (bool) – If True, color ligand-receptor labels by their annotation category.

  • barplot_palette (str or list) – Color palette for bar plots. Can be a seaborn palette name (e.g., 'Paired', 'Set1') or a list of hex colors (e.g., ['#F198CC', '#D6DAB9', '#BC938B']).

  • sort_by_category (bool) – If True, sort interactions by category first, then by interaction score within category.

  • category_agg_method (str) – Method to aggregate interaction scores by category (‘sum’ or ‘mean’) when sort_by_category=True.

  • preserve_input_order (bool) – If True, preserve the original order from interactions_df without any sorting.
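
The ordering described for sort_by_category and category_agg_method can be pictured with a small standalone sketch. It is not the library's implementation: it assumes categories are ranked in descending order of their aggregated score and that interactions are ranked in descending order of score within each category. The helper name sort_interactions_by_category and the example rows are hypothetical; the dictionary keys mirror the default column names.

```python
from collections import defaultdict
from statistics import mean


def sort_interactions_by_category(rows, agg_method="sum"):
    """Order rows by aggregated category score, then by score within category."""
    agg = {"sum": sum, "mean": mean}[agg_method]

    # Collect the interaction scores belonging to each annotation category.
    scores_by_category = defaultdict(list)
    for row in rows:
        scores_by_category[row["annotation"]].append(row["interaction_score"])
    category_score = {c: agg(s) for c, s in scores_by_category.items()}

    # Highest-scoring category first; ties broken by category name.
    return sorted(
        rows,
        key=lambda row: (
            -category_score[row["annotation"]],
            row["annotation"],
            -row["interaction_score"],
        ),
    )


# Hypothetical interactions, for illustration only.
rows = [
    {"ligandXreceptor": "A-->B", "annotation": "Secreted", "interaction_score": 3.0},
    {"ligandXreceptor": "C-->D", "annotation": "ECM", "interaction_score": 5.0},
    {"ligandXreceptor": "E-->F", "annotation": "Secreted", "interaction_score": 4.0},
]

for method in ("sum", "mean"):
    ordered = sort_interactions_by_category(rows, method)
    print(method, [row["ligandXreceptor"] for row in ordered])
```

The two aggregation methods can rank categories differently: 'sum' favors categories with many interactions, while 'mean' favors categories whose interactions are individually strong.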

Examples

>>> # Preserve original input order
>>> plotLigandReceptorInteraction(
...     interactions_df=specific_interactions,
...     specificity_df=cosg_scores,
...     cell_type_pairs=['L5 NP@SST-Chrna2'],
...     preserve_input_order=True,  # Use original DataFrame order
...     vertical_layout=False
... )

>>> # Horizontal layout with category sorting
>>> plotLigandReceptorInteraction(
...     interactions_df=specific_interactions,
...     specificity_df=cosg_scores,
...     cell_type_pairs=['L5 NP@SST-Chrna2', 'L5 PT@SST-Chrna2'],
...     ligand_receptor_sep='-->',
...     top_n=50,
...     y_max=10,
...     heatmap_cmap_ligand='Blues',
...     heatmap_cmap_receptor='Reds',
...     shared_legend=True,
...     vertical_layout=False,
...     sort_by_category=True,
...     category_agg_method='sum',
...     color_labels_by_annotation=True
... )

>>> # Vertical layout with custom hex colors
>>> plotLigandReceptorInteraction(
...     interactions_df=specific_interactions_cellchat,
...     specificity_df=cosg_scores,
...     cell_type_pairs=['L5 NP@SST-Chrna2'],
...     ligand_receptor_sep='-->',
...     top_n=50,
...     y_max=10,
...     heatmap_cmap_ligand='Purples',
...     heatmap_cmap_receptor='Reds',
...     shared_legend=True,
...     vertical_layout=True,
...     barplot_palette=['#F198CC', '#D6DAB9', '#BC938B', '#93DCFC', '#F4DBCD', '#bcf60c'],
...     sort_by_category=True,
...     category_agg_method='mean'
... )

Raises:
  • ValueError – If required columns are missing from DataFrames or if data is inconsistent.

  • KeyError – If specified cell type pairs are not found in the data.
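
The role of cell_type_sep and ligand_receptor_sep can be shown with plain string splitting. This is an illustrative sketch only: the helper split_pair is hypothetical, the gene names are made up, and the way the function itself parses these strings is not documented here.

```python
def split_pair(label: str, sep: str) -> tuple[str, str]:
    """Split 'left<sep>right' at the first occurrence of the separator."""
    left, found, right = label.partition(sep)
    if not found:
        raise ValueError(f"separator {sep!r} not found in {label!r}")
    return left, right


# Sender and receiver cell types, joined by the default cell_type_sep '@'.
sender, receiver = split_pair("L5 NP@SST-Chrna2", "@")
print(sender, "|", receiver)

# Ligand and receptor, joined by the default ligand_receptor_sep '-->'.
# 'GeneA' and 'GeneB' are placeholder names.
ligand, receptor = split_pair("GeneA-->GeneB", "-->")
print(ligand, "|", receptor)
```

Multi-character separators such as '-->' avoid ambiguity with names that already contain a hyphen, as in 'SST-Chrna2'.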

piaso.plotting.plotConfusionMatrix(data, groupby_query, groupby_reference, normalize='query', figsize=(11.5, 10), cmap='Purples', annot=False, fmt='.2f', title=None, save_path=None, dpi=300, return_objects=False, show_group_color_bars=False, **kwargs)

Plot a normalized and reordered confusion matrix from clustering results.

This function creates a confusion matrix heatmap with SVD-based reordering for better visualization of cluster relationships. The matrix can be normalized in different ways and customized extensively.

Parameters:
  • data (pandas.DataFrame or AnnData) – DataFrame or AnnData object containing the data. If AnnData, will use data.obs for the analysis.

  • groupby_query (str) – Column name for the query labels (typically predicted clusters).

  • groupby_reference (str) – Column name for the reference labels (typically true labels).

  • normalize (str) – How to normalize the confusion matrix. Options:
      • 'query': normalize by query (row-wise); this is the default
      • 'reference': normalize by reference (column-wise)
      • 'all': normalize by total count
      • None: no normalization

  • figsize (tuple) – Figure size for the plot. Default is (11.5, 10).

  • cmap (str) – Colormap for the heatmap. Default is ‘Purples’.

  • annot (bool) – Whether to show annotations in cells. Default is False.

  • fmt (str) – Format for annotations. Default is ‘.2f’.

  • title (str) – Custom title for the plot. If None, generates automatic title.

  • save_path (str) – Path to save the figure. If None, only displays.

  • dpi (int) – DPI for saved figure. Default is 300.

  • return_objects (bool) – If True, returns (confusion_matrix, fig, ax). Default is False.

  • show_group_color_bars (bool) – If True, shows colored bars next to ticks for categories that have colors defined in adata.uns (e.g., ‘CellTypes_colors’). Default is False.

  • **kwargs – Additional arguments passed to sns.heatmap()

Returns:

If return_objects=True, returns (reordered_confusion_matrix, fig, ax) for further customization

Return type:

None (default) or tuple
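
The four normalize options can be worked through by hand on a tiny example. The sketch below uses only the standard library and is not the function's implementation; it does not attempt the SVD-based reordering. The helper names confusion_counts and normalize_counts, and the labels, are hypothetical.

```python
from collections import Counter


def confusion_counts(query, reference):
    """Count how often each (query label, reference label) pair occurs."""
    return Counter(zip(query, reference))


def normalize_counts(counts, normalize="query"):
    """Normalize pair counts row-wise, column-wise, by the total, or not at all."""
    if normalize is None:
        return dict(counts)

    row_totals, col_totals = Counter(), Counter()
    for (q, r), c in counts.items():
        row_totals[q] += c   # total per query label (row)
        col_totals[r] += c   # total per reference label (column)
    total = sum(counts.values())

    normalized = {}
    for (q, r), c in counts.items():
        if normalize == "query":
            denom = row_totals[q]
        elif normalize == "reference":
            denom = col_totals[r]
        elif normalize == "all":
            denom = total
        else:
            raise ValueError(f"unknown normalize option: {normalize!r}")
        normalized[(q, r)] = c / denom
    return normalized


query = ["T_cell", "T_cell", "T_cell", "B_cell"]
reference = ["0", "0", "1", "1"]
counts = confusion_counts(query, reference)

for option in ("query", "reference", "all", None):
    print(option, normalize_counts(counts, option))
```

With 'query', each row sums to 1, so a cell reads as the fraction of a query group that falls into a given reference group; with 'reference', each column sums to 1 instead.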

Examples

Basic usage with an AnnData object:

>>> import scanpy as sc
>>> import pandas as pd
>>> from piaso.plotting import plotConfusionMatrix
>>> # Load your data
>>> adata = sc.read_h5ad('your_data.h5ad')
>>> # Plot confusion matrix between cell types and Leiden clusters
>>> plotConfusionMatrix(adata, groupby_query='CellTypes', groupby_reference='Leiden')

Using a pandas DataFrame:

>>> df = pd.DataFrame({
...     'CellTypes': ['T_cell', 'B_cell', 'Monocyte', 'T_cell', 'B_cell'],
...     'Leiden': ['0', '1', '2', '0', '1']
... })
>>> plotConfusionMatrix(df, groupby_query='CellTypes', groupby_reference='Leiden')

Different normalization methods:

>>> # Normalize by reference (column-wise)
>>> plotConfusionMatrix(adata, groupby_query='CellTypes', groupby_reference='Leiden',
...                     normalize='reference')
>>> # No normalization, show raw counts
>>> plotConfusionMatrix(adata, groupby_query='CellTypes', groupby_reference='Leiden',
...                     normalize=None)
>>> # Normalize by total count
>>> plotConfusionMatrix(adata, groupby_query='CellTypes', groupby_reference='Leiden',
...                     normalize='all')

Customization options:

>>> # Custom colors and show detailed values
>>> plotConfusionMatrix(adata, groupby_query='CellTypes', groupby_reference='Leiden',
...                     cmap='viridis', annot=True)
>>> # Custom figure size and save to file
>>> plotConfusionMatrix(adata, groupby_query='CellTypes', groupby_reference='Leiden',
...                     figsize=(15, 12),
...                     save_path='confusion_matrix.png',
...                     title='Cell Types vs Leiden Clusters')

Show detailed values in the plot:

>>> # Display percentage values in each cell
>>> plotConfusionMatrix(adata, groupby_query='CellTypes', groupby_reference='Leiden',
...                     annot=True, fmt='.1%')
>>> # Display raw counts (with no normalization)
>>> plotConfusionMatrix(adata, groupby_query='CellTypes', groupby_reference='Leiden',
...                     normalize=None, annot=True, fmt='d')

Show color bars for categories:

>>> # Display colored bars next to ticks (requires colors in adata.uns)
>>> plotConfusionMatrix(adata, groupby_query='CellTypes', groupby_reference='Leiden',
...                     show_group_color_bars=True)
>>> # This will look for 'CellTypes_colors' and 'Leiden_colors' in adata.uns

Advanced usage: getting results for further analysis:

>>> import matplotlib.pyplot as plt
>>> conf_matrix, fig, ax = plotConfusionMatrix(adata,
...                                            groupby_query='CellTypes',
...                                            groupby_reference='Leiden',
...                                            return_objects=True)
>>> # Access the reordered confusion matrix
>>> print(conf_matrix.head())
>>> # Further customize the plot
>>> ax.set_title('Custom Title', fontsize=16)
>>> plt.show()

Using with different data sources:

>>> # From Seurat object converted to pandas
>>> seurat_df = pd.read_csv('seurat_metadata.csv')
>>> plotConfusionMatrix(seurat_df, groupby_query='CellTypes', groupby_reference='Leiden')
>>> # From flow cytometry data
>>> flow_df = pd.read_csv('flow_cytometry_results.csv')
>>> plotConfusionMatrix(flow_df, groupby_query='CellTypes', groupby_reference='Leiden',
...                     normalize='reference', cmap='Reds', annot=True)