laris.preprocessing package

LARIS Preprocessing Module (laris.pp)

Preprocessing utilities and helper functions for LARIS analysis.

This module previously contained:
  • Matrix similarity calculations
  • Data selection and ranking utilities
  • Matrix manipulation helpers

Most preprocessing functions have been moved to internal utilities in laris.tl._utils as they are primarily used internally by the tools module. Advanced users who need these functions can access them through laris.tl._utils.

For typical LARIS workflows, users should rely on the main functions in laris.tl:
  • laris.tl.prepareLRInteraction()
  • laris.tl.runLARIS()

Example usage:
>>> import laris as la
>>>
>>> # Standard workflow - no preprocessing needed
>>> lr_adata = la.tl.prepareLRInteraction(adata, lr_df)
>>> laris_results, celltype_results = la.tl.runLARIS(lr_adata, adata)

laris.preprocessing._pairwise_row_multiply(sparse_matrix: csr_matrix, cell_types: List[str], delimiter: str = '::') → Tuple[csr_matrix, ndarray]

Calculate element-wise multiplication between all pairs of rows in a sparse matrix.

This helper function computes pairwise products between all rows of a matrix, including self-interactions. This is useful for modeling sender-receiver cell type interactions in spatial transcriptomics.

Parameters:
  • sparse_matrix (scipy.sparse.csr_matrix) – Input sparse matrix of shape (N, M) where N is the number of cell types.

  • cell_types (List[str]) – List of cell type names, one per row of sparse_matrix (length N).

  • delimiter (str, default="::") – Delimiter to use when joining cell type pair names.

Returns:

  • result (scipy.sparse.csr_matrix) – Sparse matrix of shape (N*N, M) containing all pairwise multiplications.

  • row_names (np.ndarray) – Array of N*N strings naming the cell type pair for each row of result, joined with delimiter (“cell_type_i::cell_type_j” with the default delimiter).

Raises:
  • ValueError – If length of cell_types doesn’t match number of rows in sparse_matrix.

  • TypeError – If sparse_matrix is not a sparse matrix.

Examples

>>> import numpy as np
>>> from scipy.sparse import csr_matrix
>>> import laris as la
>>>
>>> # Create a simple sparse matrix (3 cell types × 5 cells)
>>> data = csr_matrix([[1, 0, 2, 0, 1],
...                    [0, 3, 0, 1, 0],
...                    [2, 0, 1, 0, 3]])
>>> cell_types = ['TypeA', 'TypeB', 'TypeC']
>>>
>>> # Compute pairwise products
>>> result, names = la.tl._utils._pairwise_row_multiply(data, cell_types)
>>> print(result.shape)  # (9, 5) for 3×3 pairs
>>> print(names[:3])     # ['TypeA::TypeA', 'TypeA::TypeB', 'TypeA::TypeC']

Notes

This function is used internally to model sender-receiver cell type interactions. The pairwise products capture the joint signal patterns between different cell types in spatial neighborhoods.

For N cell types, this generates N² combinations including self-interactions (TypeA::TypeA, TypeA::TypeB, etc.).
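
The pairwise-product scheme described above can be sketched in a few lines of SciPy. This is a minimal illustration, not the LARIS implementation: the function name pairwise_row_multiply_sketch is hypothetical, and the row ordering (first cell type varies slowest) is an assumption inferred from the example output above.

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse, vstack


def pairwise_row_multiply_sketch(sparse_matrix, cell_types, delimiter="::"):
    """Hypothetical stand-in that mirrors the documented behaviour."""
    if not issparse(sparse_matrix):
        raise TypeError("sparse_matrix must be a SciPy sparse matrix")
    n_rows = sparse_matrix.shape[0]
    if len(cell_types) != n_rows:
        raise ValueError("len(cell_types) must equal the number of rows")

    sparse_matrix = sparse_matrix.tocsr()
    blocks, names = [], []
    for i in range(n_rows):
        row_i = sparse_matrix[i]  # 1 x M sparse row
        for j in range(n_rows):
            # Element-wise product of row i and row j; stays sparse.
            blocks.append(row_i.multiply(sparse_matrix[j]))
            names.append(f"{cell_types[i]}{delimiter}{cell_types[j]}")

    # Stack the N*N product rows into one (N*N, M) CSR matrix.
    return vstack(blocks, format="csr"), np.array(names)


data = csr_matrix([[1, 0, 2, 0, 1],
                   [0, 3, 0, 1, 0],
                   [2, 0, 1, 0, 3]])
result, names = pairwise_row_multiply_sketch(data, ["TypeA", "TypeB", "TypeC"])
print(result.shape)  # (9, 5)
```

The double loop is written for readability; for many cell types a vectorized formulation would likely be preferable.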

laris.preprocessing._rowwise_cosine_similarity(A: csr_matrix, B: csr_matrix) → ndarray

Compute the cosine similarity between corresponding rows of matrices A and B.

This function calculates row-wise cosine similarity for both dense and sparse matrices without computing the full pairwise similarity matrix. Rows with zero norm in either input get a similarity of 0 instead of triggering a division by zero.

Parameters:
  • A (np.ndarray or scipy.sparse.csr_matrix) – A 2D array or CSR sparse matrix with shape (n, m).

  • B (np.ndarray or scipy.sparse.csr_matrix) – A 2D array or CSR sparse matrix with shape (n, m).

Returns:

A 1D array of shape (n,) containing the cosine similarity between corresponding rows of A and B.

Return type:

np.ndarray

Raises:

ValueError – If matrices A and B have different shapes.

Examples

>>> import numpy as np
>>> import laris as la
>>>
>>> # Dense matrices
>>> A = np.array([[1, 2, 3], [4, 5, 6]])
>>> B = np.array([[7, 8, 9], [1, 2, 3]])
>>> similarities = la.tl._utils._rowwise_cosine_similarity(A, B)
>>> print(similarities)
>>>
>>> # Sparse matrices
>>> from scipy.sparse import csr_matrix
>>> A_sparse = csr_matrix(A)
>>> B_sparse = csr_matrix(B)
>>> similarities_sparse = la.tl._utils._rowwise_cosine_similarity(A_sparse, B_sparse)
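
For readers who want to see the computation spelled out, the following is a minimal sketch of row-wise cosine similarity with the zero-norm handling described above. It is an illustration under assumptions, not the LARIS source: the name rowwise_cosine_similarity_sketch is hypothetical, and converting inputs to float is a choice made here for simplicity.

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse


def rowwise_cosine_similarity_sketch(A, B):
    """Hypothetical stand-in: cosine similarity of A[i] and B[i] for each i."""
    if A.shape != B.shape:
        raise ValueError("A and B must have the same shape")

    if issparse(A) or issparse(B):
        A = csr_matrix(A, dtype=float)
        B = csr_matrix(B, dtype=float)
        # Row-wise dot products and squared norms via element-wise products.
        dots = np.asarray(A.multiply(B).sum(axis=1)).ravel()
        norm_a = np.sqrt(np.asarray(A.multiply(A).sum(axis=1)).ravel())
        norm_b = np.sqrt(np.asarray(B.multiply(B).sum(axis=1)).ravel())
    else:
        A = np.asarray(A, dtype=float)
        B = np.asarray(B, dtype=float)
        dots = np.einsum("ij,ij->i", A, B)
        norm_a = np.linalg.norm(A, axis=1)
        norm_b = np.linalg.norm(B, axis=1)

    denom = norm_a * norm_b
    out = np.zeros_like(dots, dtype=float)
    # Divide only where the denominator is non-zero; other rows stay at 0.
    np.divide(dots, denom, out=out, where=denom > 0)
    return out
```

Only the n row-wise dot products are computed, so memory stays proportional to the inputs rather than to an n × n similarity matrix.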

laris.preprocessing._select_top_n(scores: ndarray, n_top: int) → ndarray

Select the indices of the n highest-scoring elements.

This is an internal helper function that efficiently selects the top n elements from an array using partial sorting (argpartition), which is faster than full sorting for large arrays.

Parameters:
  • scores (np.ndarray) – 1D array of scores.

  • n_top (int) – Number of top elements to select.

Returns:

Array of indices corresponding to the top n scores, sorted in descending order by score.

Return type:

np.ndarray

Examples

>>> import numpy as np
>>> import laris as la
>>>
>>> scores = np.array([0.1, 0.9, 0.3, 0.7, 0.5])
>>> top_indices = la.tl._utils._select_top_n(scores, 3)
>>> print(top_indices)  # [1, 3, 4] (indices of 0.9, 0.7, 0.5)
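
The partial-sorting approach mentioned above can be sketched with np.argpartition followed by a sort of only the selected entries. This is a minimal illustration, not the LARIS source: the name select_top_n_sketch is hypothetical, and the clamping of n_top and the empty-result case are assumptions added here.

```python
import numpy as np


def select_top_n_sketch(scores, n_top):
    """Hypothetical stand-in: indices of the n_top largest scores, descending."""
    scores = np.asarray(scores)
    n_top = min(n_top, scores.size)  # assumption: clamp to the array length
    if n_top <= 0:
        return np.array([], dtype=int)

    # argpartition moves the n_top largest values to the end, in no
    # particular order, without sorting the whole array.
    candidates = np.argpartition(scores, -n_top)[-n_top:]

    # Sort only the n_top candidates, highest score first.
    order = np.argsort(scores[candidates])[::-1]
    return candidates[order]


scores = np.array([0.1, 0.9, 0.3, 0.7, 0.5])
top_indices = select_top_n_sketch(scores, 3)
```

With tied scores, the relative order of the tied indices is not guaranteed by this sketch.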