laris.preprocessing package
LARIS Preprocessing Module (laris.pp)
Preprocessing utilities and helper functions for LARIS analysis.
This module previously contained:
- Matrix similarity calculations
- Data selection and ranking utilities
- Matrix manipulation helpers
Most preprocessing functions have been moved to internal utilities in laris.tl._utils as they are primarily used internally by the tools module. Advanced users who need these functions can access them through laris.tl._utils.
For typical LARIS workflows, users should rely on the main functions in laris.tl:
- laris.tl.prepareLRInteraction()
- laris.tl.runLARIS()
- Example usage:
>>> import laris as la
>>>
>>> # Standard workflow - no preprocessing needed
>>> lr_adata = la.tl.prepareLRInteraction(adata, lr_df)
>>> laris_results, celltype_results = la.tl.runLARIS(lr_adata, adata)
- laris.preprocessing._pairwise_row_multiply(sparse_matrix: csr_matrix, cell_types: List[str], delimiter: str = '::') -> Tuple[csr_matrix, ndarray]
Calculate element-wise multiplication between all pairs of rows in a sparse matrix.
This helper function computes pairwise products between all rows of a matrix, including self-interactions. This is useful for modeling sender-receiver cell type interactions in spatial transcriptomics.
- Parameters:
sparse_matrix (scipy.sparse.csr_matrix) – Input sparse matrix of shape (N, M) where N is the number of cell types.
cell_types (List[str]) – Array of cell type names, length N.
delimiter (str, default="::") – Delimiter to use when joining cell type pair names.
- Returns:
result (scipy.sparse.csr_matrix) – Sparse matrix of shape (N*N, M) containing all pairwise multiplications.
row_names (np.ndarray) – Array of strings naming the cell type pair for each row, joined by delimiter; with the default delimiter the format is “cell_type_i::cell_type_j”.
- Raises:
ValueError – If length of cell_types doesn’t match number of rows in sparse_matrix.
TypeError – If sparse_matrix is not a sparse matrix.
Examples
>>> import numpy as np
>>> from scipy.sparse import csr_matrix
>>> import laris as la
>>>
>>> # Create a simple sparse matrix (3 cell types × 5 cells)
>>> data = csr_matrix([[1, 0, 2, 0, 1],
...                    [0, 3, 0, 1, 0],
...                    [2, 0, 1, 0, 3]])
>>> cell_types = ['TypeA', 'TypeB', 'TypeC']
>>>
>>> # Compute pairwise products
>>> result, names = la.tl._utils._pairwise_row_multiply(data, cell_types)
>>> print(result.shape)  # (9, 5) for 3×3 pairs
>>> print(names[:3])  # ['TypeA::TypeA', 'TypeA::TypeB', 'TypeA::TypeC']
Notes
This function is used internally to model sender-receiver cell type interactions. The pairwise products capture the joint signal patterns between different cell types in spatial neighborhoods.
For N cell types, this generates N² combinations including self-interactions (TypeA::TypeA, TypeA::TypeB, etc.).
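The pairwise product described above can be sketched in a few lines. This is not the LARIS implementation: the function name pairwise_row_multiply_sketch is hypothetical, and the sketch assumes the row order shown in the example (first cell type varies slowest) and favors clarity over speed by looping over all N² pairs.

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse, vstack


def pairwise_row_multiply_sketch(sparse_matrix, cell_types, delimiter="::"):
    """Illustrative stand-in (hypothetical name), not the LARIS source."""
    if not issparse(sparse_matrix):
        raise TypeError("sparse_matrix must be a SciPy sparse matrix")
    sparse_matrix = csr_matrix(sparse_matrix)
    n_rows = sparse_matrix.shape[0]
    if len(cell_types) != n_rows:
        raise ValueError("cell_types must have one entry per matrix row")

    blocks, names = [], []
    for i in range(n_rows):
        row_i = sparse_matrix[i]  # (1, M) sparse row
        for j in range(n_rows):
            # Element-wise product of row i and row j, kept sparse
            blocks.append(row_i.multiply(sparse_matrix[j]))
            names.append(f"{cell_types[i]}{delimiter}{cell_types[j]}")

    # Stack the N*N single-row products into one (N*N, M) CSR matrix
    return vstack(blocks, format="csr"), np.array(names)


data = csr_matrix([[1, 0, 2, 0, 1],
                   [0, 3, 0, 1, 0],
                   [2, 0, 1, 0, 3]])
result, names = pairwise_row_multiply_sketch(data, ["TypeA", "TypeB", "TypeC"])
print(result.shape)  # (9, 5)
print(names[:3])
```

In this toy input TypeA and TypeB never have non-zero values in the same column, so the TypeA::TypeB row is all zeros, while TypeA::TypeC keeps the columns where both rows are non-zero.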
- laris.preprocessing._rowwise_cosine_similarity(A: csr_matrix, B: csr_matrix) -> ndarray
Compute the cosine similarity between corresponding rows of matrices A and B.
This function efficiently calculates row-wise cosine similarity for both dense and sparse matrices. It handles zero-norm rows gracefully by returning zero similarity for those cases.
- Parameters:
A (np.ndarray or scipy.sparse.csr_matrix) – A 2D array or CSR sparse matrix with shape (n, m).
B (np.ndarray or scipy.sparse.csr_matrix) – A 2D array or CSR sparse matrix with shape (n, m).
- Returns:
A 1D array of shape (n,) containing the cosine similarity between corresponding rows of A and B.
- Return type:
np.ndarray
- Raises:
ValueError – If matrices A and B have different shapes.
Examples
>>> import numpy as np
>>> import laris as la
>>>
>>> # Dense matrices
>>> A = np.array([[1, 2, 3], [4, 5, 6]])
>>> B = np.array([[7, 8, 9], [1, 2, 3]])
>>> similarities = la.tl._utils._rowwise_cosine_similarity(A, B)
>>> print(similarities)
>>>
>>> # Sparse matrices
>>> from scipy.sparse import csr_matrix
>>> A_sparse = csr_matrix(A)
>>> B_sparse = csr_matrix(B)
>>> similarities_sparse = la.tl._utils._rowwise_cosine_similarity(A_sparse, B_sparse)
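For readers who want to see what row-wise cosine similarity computes, the following is a minimal sketch under stated assumptions rather than the LARIS source. The name rowwise_cosine_similarity_sketch is hypothetical, and the zero-norm handling (similarity 0 whenever either row has zero norm) is one reading of the behavior described above.

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse


def rowwise_cosine_similarity_sketch(A, B):
    """Illustrative stand-in (hypothetical name), not the LARIS source."""
    if A.shape != B.shape:
        raise ValueError("A and B must have the same shape")

    if issparse(A) or issparse(B):
        A, B = csr_matrix(A), csr_matrix(B)
        # Row-wise dot products and squared norms without densifying
        dot = np.asarray(A.multiply(B).sum(axis=1)).ravel()
        norm_a = np.sqrt(np.asarray(A.multiply(A).sum(axis=1)).ravel())
        norm_b = np.sqrt(np.asarray(B.multiply(B).sum(axis=1)).ravel())
    else:
        A = np.asarray(A, dtype=float)
        B = np.asarray(B, dtype=float)
        dot = np.einsum("ij,ij->i", A, B)
        norm_a = np.linalg.norm(A, axis=1)
        norm_b = np.linalg.norm(B, axis=1)

    dot = dot.astype(float)
    denom = norm_a * norm_b
    # Assumption: rows with a zero norm get similarity 0 instead of NaN
    sim = np.zeros_like(dot)
    np.divide(dot, denom, out=sim, where=denom > 0)
    return sim


A = np.array([[1, 2, 3], [4, 5, 6]])
B = np.array([[7, 8, 9], [1, 2, 3]])
dense_sim = rowwise_cosine_similarity_sketch(A, B)
sparse_sim = rowwise_cosine_similarity_sketch(csr_matrix(A), csr_matrix(B))
print(dense_sim)
print(np.allclose(dense_sim, sparse_sim))
```

Each output entry is the dot product of the two corresponding rows divided by the product of their Euclidean norms, so the dense and sparse paths should agree up to floating-point error.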
- laris.preprocessing._select_top_n(scores: ndarray, n_top: int) -> ndarray
Select indices of top n highest-scoring elements.
This is an internal helper function that efficiently selects the top n elements from an array using partial sorting (argpartition), which is faster than full sorting for large arrays.
- Parameters:
scores (np.ndarray) – 1D array of scores.
n_top (int) – Number of top elements to select.
- Returns:
Array of indices corresponding to the top n scores, sorted in descending order by score.
- Return type:
np.ndarray
Examples
>>> import numpy as np
>>> import laris as la
>>>
>>> scores = np.array([0.1, 0.9, 0.3, 0.7, 0.5])
>>> top_indices = la.tl._utils._select_top_n(scores, 3)
>>> print(top_indices)  # [1, 3, 4] (indices of 0.9, 0.7, 0.5)
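The partial-sorting approach mentioned above can be sketched with np.argpartition followed by a sort of only the selected candidates. This is an illustration, not the LARIS source: the name select_top_n_sketch is hypothetical, and clamping n_top to the array length is an assumption added here so the sketch does not fail on short inputs.

```python
import numpy as np


def select_top_n_sketch(scores, n_top):
    """Illustrative stand-in (hypothetical name), not the LARIS source."""
    scores = np.asarray(scores)
    # Assumption: clamp n_top so requests larger than the array still work
    n_top = min(n_top, scores.size)
    if n_top <= 0:
        return np.array([], dtype=int)

    # argpartition moves the n_top largest scores into the last n_top
    # positions in linear time, but leaves them in arbitrary order
    candidates = np.argpartition(scores, -n_top)[-n_top:]

    # Fully sort only those n_top candidates, highest score first
    order = np.argsort(scores[candidates])[::-1]
    return candidates[order]


scores = np.array([0.1, 0.9, 0.3, 0.7, 0.5])
top_indices = select_top_n_sketch(scores, 3)
print(top_indices)  # [1 3 4]
```

Partitioning first means the sort cost depends on n_top rather than on the full array length, which is where the advantage over a full argsort comes from for large arrays.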