emergene.preprocessing package
- emergene.preprocessing.convertTopGeneDictToDF(data_dict, gene_list_as_string: bool = True)
Converts the dictionary of top genes and their scores reported by the EmerGene function into a wide-format DataFrame in which each condition has two columns: “{condition}_Gene” and “{condition}_EG_score”.
- Parameters:
data_dict (dict) – Dictionary whose keys are conditions. If gene_list_as_string=True, the values are comma-separated “gene:score” strings; if gene_list_as_string=False, the values are DataFrames with ‘Gene’ and ‘EG_score’ columns.
gene_list_as_string (bool, optional (default=True)) –
If True, assumes values in data_dict are strings formatted as “gene:score,gene2:score2,…”.
If False, assumes values in data_dict are DataFrames with ‘Gene’ and ‘EG_score’ columns.
- Returns:
A wide-format DataFrame where each condition has two columns: “{condition}_Gene” and “{condition}_EG_score”.
- Return type:
pd.DataFrame
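The transformation described above can be sketched in plain pandas. This is not emergene's implementation: `top_gene_dict_to_wide` is a hypothetical stand-in written only from the description here, the gene names and scores are made-up illustrative values, and padding shorter conditions with NaN is an assumption about how conditions with different numbers of genes would be handled.

```python
import pandas as pd


def top_gene_dict_to_wide(data_dict, gene_list_as_string=True):
    """Hypothetical sketch of the documented behaviour (not emergene's code)."""
    columns = {}
    for condition, value in data_dict.items():
        if gene_list_as_string:
            # Parse "gene:score,gene2:score2" into parallel lists.
            pairs = [item.rsplit(":", 1) for item in value.split(",") if item]
            genes = [gene.strip() for gene, _ in pairs]
            scores = [float(score) for _, score in pairs]
        else:
            # Values are DataFrames with 'Gene' and 'EG_score' columns.
            genes = value["Gene"].tolist()
            scores = value["EG_score"].tolist()
        columns[f"{condition}_Gene"] = pd.Series(genes, dtype="object")
        columns[f"{condition}_EG_score"] = pd.Series(scores, dtype="float64")
    # Assumption: Series of unequal length are aligned on the index,
    # so shorter conditions are padded with NaN.
    return pd.DataFrame(columns)


# Illustrative input only; these are not real EmerGene results.
example = {"T_cell": "CD3E:0.92,CD2:0.85", "B_cell": "MS4A1:0.97"}
wide = top_gene_dict_to_wide(example)
print(wide)
```

With the real package installed, the equivalent call would presumably be `emergene.preprocessing.convertTopGeneDictToDF(example)`, using the signature documented above.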
- emergene.preprocessing.infog(adata, copy: bool = False, layer='raw', n_top_genes: int = 1000, key_added: str = 'infog', random_state: int = 10, trim: bool = True, verbosity: int = 1)