laris.tools package
LARIS Tools Module (laris.tl)
Core analytical tools for ligand-receptor interaction analysis in spatial transcriptomics data.
This module contains public functions for:
- Preparing ligand-receptor integration scores with spatial diffusion
- Running the LARIS algorithm to identify spatially-specific LR interactions
- Computing cell type-specific interaction scores
Main Functions:
- prepareLRInteraction: Calculate LR interaction scores using spatial neighborhoods
- runLARIS: Identify spatially-specific LR pairs and compute cell type interactions
- laris.tools.prepareLRInteraction(adata: AnnData, lr_df: DataFrame, number_nearest_neighbors: int = 10, use_rep_spatial: str = 'X_spatial') → AnnData
Calculate ligand-receptor integration scores using spatial neighborhood information.
This function computes diffused ligand-receptor interaction scores by considering the spatial context of each cell. It builds a spatial neighborhood graph from each cell's k nearest neighbors, diffuses ligand and receptor expression over that graph, and takes the element-wise product of the diffused ligand and receptor expression levels.
- Parameters:
adata (AnnData) – Annotated data matrix containing gene expression and spatial coordinates. Must have .obsm[use_rep_spatial] with spatial coordinates.
lr_df (pd.DataFrame) – DataFrame containing ligand-receptor pairs with columns ‘ligand’ and ‘receptor’.
number_nearest_neighbors (int, default=10) – Number of nearest neighbors to consider for spatial diffusion.
use_rep_spatial (str, default='X_spatial') – Key in adata.obsm containing spatial coordinates.
- Returns:
New AnnData object containing ligand-receptor interaction scores:
- .X: Sparse matrix of LR interaction scores (cells × LR pairs)
- .var_names: Ligand-receptor pair names in format “ligand::receptor”
- .obs: Cell metadata copied from input adata
- .obsm: Spatial and other representations copied from input adata
- Return type:
AnnData
Examples
>>> import laris as la
>>> import pandas as pd
>>>
>>> # Define ligand-receptor pairs
>>> lr_df = pd.DataFrame({
...     'ligand': ['Tgfb1', 'Vegfa'],
...     'receptor': ['Tgfbr1', 'Kdr']
... })
>>>
>>> # Calculate LR integration scores
>>> lr_adata = la.tl.prepareLRInteraction(adata, lr_df)
>>> print(lr_adata.shape)  # (n_cells, n_lr_pairs)
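The diffuse-then-multiply idea described for prepareLRInteraction can be illustrated with a small NumPy sketch. This is not the LARIS implementation: it assumes uniform averaging over each cell's k nearest spatial neighbors (including the cell itself), uses a dense distance matrix that only suits toy inputs, and the helper name `knn_diffuse_lr` is hypothetical.

```python
import numpy as np

def knn_diffuse_lr(coords, ligand, receptor, k=1):
    """Hypothetical helper: average ligand and receptor expression over each
    cell's k nearest spatial neighbors (plus the cell itself), then multiply."""
    coords = np.asarray(coords, dtype=float)
    # Dense pairwise Euclidean distances (only sensible for small inputs).
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Indices of the k + 1 closest cells; the closest one is the cell itself.
    nn_idx = np.argsort(dists, axis=1)[:, : k + 1]
    ligand_diffused = np.asarray(ligand, dtype=float)[nn_idx].mean(axis=1)
    receptor_diffused = np.asarray(receptor, dtype=float)[nn_idx].mean(axis=1)
    # The element-wise product gives one LR score per cell.
    return ligand_diffused * receptor_diffused

# Four cells on a line: cell 0 expresses only the ligand,
# its neighbor cell 1 expresses only the receptor.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.5, 0.0], [10.0, 0.0]])
ligand = np.array([3.0, 0.0, 0.0, 0.0])
receptor = np.array([0.0, 3.0, 0.0, 0.0])

raw_product = ligand * receptor                          # no cell co-expresses both
scores = knn_diffuse_lr(coords, ligand, receptor, k=1)   # neighbors now contribute
print(raw_product, scores)
```

Without diffusion the per-cell product is zero everywhere, because no single cell expresses both genes. After diffusion the two adjacent cells receive a non-zero score, which is the kind of neighborhood signal the function is described as capturing.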
- laris.tools.runLARIS(lr_adata: AnnData, adata: AnnData | None = None, use_rep: str = 'X_spatial', n_nearest_neighbors: int = 10, random_seed: int = 27, n_repeats: int = 3, mu: float = 1, sigma: float = 100, remove_lowly_expressed: bool = True, expressed_pct: float = 0.1, n_cells_expressed_threshold: int = 100, n_top_lr: int = 4000, by_celltype: bool = True, groupby: str = 'CellTypes', use_rep_spatial: str = 'X_spatial', number_nearest_neighbors: int = 10, mu_celltype: float = 100, expressed_pct_celltype: float = 0.1, remove_lowly_expressed_celltype: bool = True, mask_threshold: float = 1e-06, calculate_pvalues: bool = True, layer_celltype: str | None = None, n_neighbors_permutation: int = 30, n_permutations: int = 1000, chunk_size: int = 50000, prefilter_fdr: bool = True, prefilter_threshold: float = 0.0, score_threshold: float = 1e-06, spatial_weight: float = 1.0, use_conditional_pvalue: bool = False) → DataFrame | Tuple[DataFrame, DataFrame]
Identify spatially-specific ligand-receptor interactions using the LARIS algorithm.
LARIS (Ligand And Receptor Interaction in Spatial transcriptomics) identifies LR pairs that show spatial specificity by comparing observed spatial correlation patterns with randomized null distributions. When by_celltype=True, the function also computes cell type-specific interaction scores with optional statistical testing.
This is the main analytical function of the LARIS package, providing:
- Spatial Specificity Analysis: Identifies LR pairs that show non-random spatial co-localization patterns (higher scores = stronger spatial organization)
- Cell type-specific Scores: Integrates spatial specificity with cell type expression specificity and spatial co-localization to identify which sender-receiver cell type pairs are communicating via which LR pairs
- Statistical Testing: Optional permutation-based p-values with FDR correction to identify statistically significant interactions
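The statistical-testing idea in the list above can be sketched generically as follows. This is an illustration under stated assumptions, not the code runLARIS uses: it assumes an empirical p-value with the common "+1" correction and Benjamini-Hochberg adjustment (the text above only says "FDR correction"), and both helper names are hypothetical.

```python
import numpy as np

def empirical_p_values(observed, null_scores):
    """Share of null scores >= each observed score, with a +1 correction
    so that no p-value is exactly zero (assumed convention)."""
    observed = np.asarray(observed, dtype=float)
    null_scores = np.asarray(null_scores, dtype=float)  # (n_tests, n_permutations)
    exceed = (null_scores >= observed[:, None]).sum(axis=1)
    return (exceed + 1) / (null_scores.shape[1] + 1)

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.clip(adjusted, 0.0, 1.0)
    return out

rng = np.random.default_rng(27)
null_scores = rng.random((3, 999))        # toy null scores, uniform on [0, 1)
observed = np.array([2.0, 0.5, -1.0])     # above, inside, and below the null range
p_raw = empirical_p_values(observed, null_scores)
p_fdr = benjamini_hochberg(p_raw)
print(p_raw, p_fdr)
```

With 999 permutations the smallest attainable raw p-value is 1/1000, which is one reason larger n_permutations values give finer resolution.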
- Parameters:
lr_adata (AnnData) –
AnnData object containing LR interaction scores from prepareLRInteraction().
Required contents:
- .X: Diffused LR scores (cells × LR pairs)
- .var_names: LR pair names (“ligand::receptor”)
- .obsm[use_rep]: Spatial coordinates or other representation
adata (AnnData, optional) –
Original annotated data matrix with gene expression and spatial information. Required when by_celltype=True.
Required contents (when by_celltype=True):
- .obs[groupby]: Cell type annotations
- .obsm[use_rep_spatial]: Spatial coordinates
- .X or .layers[layer_celltype]: Gene expression
use_rep (str, default='X_spatial') – Key in lr_adata.obsm for coordinates to use in spatial specificity analysis. Typically spatial coordinates, but could be other representations.
n_nearest_neighbors (int, default=10) – Number of spatial neighbors for building the adjacency matrix in the spatial specificity analysis. Larger values capture broader spatial patterns.
random_seed (int, default=27) – Random seed for reproducibility of permutation tests.
n_repeats (int, default=3) – Number of random permutations to generate the null distribution for spatial specificity scoring. More repeats give more stable estimates.
mu (float, default=1) – Regularization parameter for spatial specificity. Higher values penalize interactions that look similar to random background more strongly.
sigma (float, default=100) – Bandwidth parameter for Gaussian distance kernel in adjacency matrix. Controls how quickly spatial weights decay with distance.
remove_lowly_expressed (bool, default=True) – Whether to filter out LR pairs with low expression before ranking.
expressed_pct (float, default=0.1) – Minimum fraction of cells expressing an LR pair (if remove_lowly_expressed=True).
n_cells_expressed_threshold (int, default=100) – Minimum number of cells expressing an LR pair for it to be ranked. Pairs below this threshold receive a penalty in ranking.
n_top_lr (int, default=4000) – Number of top-ranked spatially-specific LR pairs to return.
by_celltype (bool, default=True) – Whether to compute cell type-specific interaction scores. If False, only returns spatial specificity results (much faster). If True, adata must be provided.
Cell Type Analysis Parameters (only used if by_celltype=True):
groupby (str, default='CellTypes') – Column in adata.obs defining cell type groups.
use_rep_spatial (str, default='X_spatial') – Key in adata.obsm for spatial coordinates (for cell type analysis).
number_nearest_neighbors (int, default=10) – Number of spatial neighbors for cell type co-localization analysis.
mu_celltype (float, default=100) – Regularization parameter for COSG cell type specificity calculation. Higher values more strongly penalize broadly expressed genes.
expressed_pct_celltype (float, default=0.1) – Minimum expression fraction for cell type analysis.
remove_lowly_expressed_celltype (bool, default=True) – Whether to filter lowly expressed genes in cell type analysis.
mask_threshold (float, default=1e-6) – Numerical threshold for masking near-zero values.
calculate_pvalues (bool, default=True) – Whether to perform permutation testing for statistical significance. Set to False for faster exploratory analysis.
layer_celltype (str, optional) – Layer in adata.layers to use for expression data. If None, uses adata.X.
n_neighbors_permutation (int, default=30) – Number of similar interactions to use as background controls for permutation testing. These are selected based on similarity of diffused score profiles.
n_permutations (int, default=1000) – Number of permutations for statistical testing. Common values:
- 1000: Quick testing
- 5000: More precise p-values
- 10000: Publication-quality precision
chunk_size (int, default=50000) – Number of interactions to process simultaneously during permutation. Larger values are faster but use more memory.
prefilter_fdr (bool, default=True) – If True, only interactions with scores > prefilter_threshold are tested for significance; all others are assigned an FDR-corrected p-value of 1.0. This reduces the multiple-testing burden and focuses statistical power on high-scoring interactions.
prefilter_threshold (float, default=0.0) – Minimum interaction score for FDR testing (if prefilter_fdr=True).
score_threshold (float, default=1e-6) – Numerical precision threshold. Scores below this are set to exactly 0.0.
spatial_weight (float, default=1.0) – Exponent applied to spatial specificity scores. Controls influence on final interaction scores:
- 0: Ignore spatial specificity
- 1: Linear influence (default)
- >1: Stronger emphasis on spatial specificity
- <1: Weaker emphasis on spatial specificity
use_conditional_pvalue (bool, default=False) – Use conditional p-value calculation for zero-inflated data. Recommended for sparse datasets. When True:
- Interactions with score=0 get p-value=1.0
- Non-zero scores are compared only to non-zero background
- Prevents spurious significance from sparse null distributions
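The conditional rule described for use_conditional_pvalue can be illustrated with a toy comparison. The sketch below is an assumption-laden illustration rather than the library's code: both helper names are hypothetical, the "+1" correction is an assumed convention, and returning 1.0 when the background has no non-zero values is a conservative choice made here.

```python
import numpy as np

def standard_p_value(score, background):
    """Empirical p-value against the full background (with +1 correction)."""
    background = np.asarray(background, dtype=float)
    return float((np.sum(background >= score) + 1) / (background.size + 1))

def conditional_p_value(score, background):
    """Hypothetical helper: zero scores get p = 1.0, and non-zero scores
    are compared only against the non-zero part of the background."""
    if score == 0:
        return 1.0
    background = np.asarray(background, dtype=float)
    nonzero = background[background != 0]
    if nonzero.size == 0:
        return 1.0  # no informative background; conservative assumption
    return float((np.sum(nonzero >= score) + 1) / (nonzero.size + 1))

# Zero-inflated toy background: 95 zeros and 5 non-zero scores.
background = np.array([0.0] * 95 + [0.2, 0.3, 0.4, 0.5, 0.6])
score = 0.25

p_std = standard_p_value(score, background)      # zeros dilute the null
p_cond = conditional_p_value(score, background)  # compared to non-zero scores only
print(p_std, p_cond)
```

In this toy case the modest score looks significant against the zero-dominated background but not against the non-zero background, which is the spurious-significance pattern the option is described as preventing.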
- Returns:
- If by_celltype=False:
Single DataFrame with spatial specificity results:
Columns:
- ‘ligand’: Ligand gene name
- ‘receptor’: Receptor gene name
- ‘score’: LARIS spatial specificity score
- ‘Rank’: Rank (0 = highest scoring)
Index: LR pair names (“ligand::receptor”)
Sorted by score (descending)
- If by_celltype=True:
Tuple of (laris_lr, celltype_results) where:
laris_lr: DataFrame as described above
celltype_results: DataFrame with cell type-specific scores:
Columns:
- ‘sender’: Cell type sending the ligand
- ‘receiver’: Cell type receiving the signal
- ‘ligand’: Ligand gene name
- ‘receptor’: Receptor gene name
- ‘interaction_name’: “ligand::receptor”
- ‘interaction_score’: Integrated LARIS score
- ‘p_value’: Raw permutation p-value (if calculate_pvalues=True)
- ‘p_value_fdr’: FDR-corrected p-value (if calculate_pvalues=True)
- ‘nlog10_p_value_fdr’: -log10(FDR) for visualization
Sorted by interaction_score (descending)
- Return type:
pd.DataFrame or Tuple[pd.DataFrame, pd.DataFrame]
- Raises:
ValueError – If by_celltype=True but adata is not provided, or if required data is missing from adata or lr_adata.
ImportError – If required helper functions are not available.
Examples
Example 1: Quick spatial specificity analysis (no cell types)
>>> import laris as la
>>>
>>> # Prepare LR scores
>>> lr_adata = la.tl.prepareLRInteraction(adata, lr_df)
>>>
>>> # Identify spatially-specific LR pairs only
>>> laris_lr = la.tl.runLARIS(
...     lr_adata,
...     by_celltype=False,
...     n_top_lr=1000
... )
>>>
>>> print(laris_lr.head())
Example 2: Full analysis with cell type-specific scores
>>> # Full LARIS analysis with cell types
>>> laris_lr, celltype_results = la.tl.runLARIS(
...     lr_adata,
...     adata,
...     by_celltype=True,
...     groupby='cell_type',
...     calculate_pvalues=True,
...     n_permutations=5000
... )
>>>
>>> # Filter for significant interactions
>>> sig_results = celltype_results[
...     celltype_results['p_value_fdr'] < 0.05
... ]
>>>
>>> print(f"Found {len(sig_results)} significant interactions")
Example 3: Fast exploratory analysis (no p-values)
>>> laris_lr, celltype_results = la.tl.runLARIS(
...     lr_adata,
...     adata,
...     by_celltype=True,
...     calculate_pvalues=False  # Much faster!
... )
Example 4: Conservative testing for sparse data
>>> laris_lr, celltype_results = la.tl.runLARIS(
...     lr_adata,
...     adata,
...     by_celltype=True,
...     use_conditional_pvalue=True,  # Robust for sparse data
...     n_permutations=5000,
...     prefilter_fdr=True,
...     prefilter_threshold=0.01  # Only test score > 0.01
... )
Example 5: Emphasize spatial specificity
>>> laris_lr, celltype_results = la.tl.runLARIS(
...     lr_adata,
...     adata,
...     by_celltype=True,
...     spatial_weight=2.0  # Square the spatial scores
... )
See also
prepareLRInteraction: Prepare LR scores (prerequisite for this function)