emergene.preprocessing package

emergene.preprocessing.convertTopGeneDictToDF(data_dict, gene_list_as_string: bool = True)

Converts the dictionary of top genes and their scores reported by the EmerGene function into a wide-format DataFrame in which each condition has two columns: “{condition}_Gene” and “{condition}_EG_score”.

Parameters:
  • data_dict (dict) – Dictionary where keys are conditions.

    • If gene_list_as_string=True: values are “gene:score” formatted strings.

    • If gene_list_as_string=False: values are DataFrames with ‘Gene’ and ‘EG_score’ columns.

  • gene_list_as_string (bool, optional (default=True)) –

    • If True, assumes values in data_dict are strings formatted as “gene:score,gene2:score2,…”.

    • If False, assumes values in data_dict are DataFrames with ‘Gene’ and ‘EG_score’ columns.

Returns:

A wide-format DataFrame where each condition has two columns: “{condition}_Gene” and “{condition}_EG_score”.

Return type:

pd.DataFrame
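
The “gene:score,gene2:score2,…” input format and the resulting column layout can be illustrated with a minimal pure-Python sketch. This is not EmerGene's implementation: the helper name `parse_top_genes`, the example condition names and scores, and the choice to pad shorter columns with `None` are all assumptions made for illustration.

```python
def parse_top_genes(data_dict):
    """Hypothetical helper (not part of emergene): parse "gene:score" strings
    into wide-format columns named "{condition}_Gene" and "{condition}_EG_score"."""
    columns = {}
    for condition, gene_string in data_dict.items():
        genes, scores = [], []
        for item in gene_string.split(","):
            item = item.strip()
            if not item:
                continue  # skip empty fragments such as a trailing comma
            # Split on the last colon so gene names containing ":" survive.
            gene, score = item.rsplit(":", 1)
            genes.append(gene)
            scores.append(float(score))
        columns[f"{condition}_Gene"] = genes
        columns[f"{condition}_EG_score"] = scores

    # Assumption: conditions with fewer genes are padded with None so that
    # every column has the same number of rows.
    n_rows = max((len(values) for values in columns.values()), default=0)
    for values in columns.values():
        values.extend([None] * (n_rows - len(values)))
    return columns


# Made-up example input; the condition names and scores are illustrative only.
example = {"ctrl": "GeneA:0.91,GeneB:0.85", "treated": "GeneC:0.77"}
wide = parse_top_genes(example)
```

Because every column has the same length, the resulting dictionary could be passed to `pandas.DataFrame(wide)` to obtain a table with the two-columns-per-condition layout described above.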

emergene.preprocessing.infog(adata, copy: bool = False, layer='raw', n_top_genes: int = 1000, key_added: str = 'infog', random_state: int = 10, trim: bool = True, verbosity: int = 1)