tka package

Subpackages

Submodules

tka.utils module

class tka.utils.SpheringNormalizer(controls)

Bases: object

normalize(X)
sphering_transform(X, lambda_, rotate=True)
tka.utils.docs()
tka.utils.load_l1000_ordered_feature_columns(gene_id)
tka.utils.load_mobc_ordered_feature_columns()
tka.utils.prepare_df_for_mobc_predictions(df_dmso, df_real, identifier_col='SMILES', grouping_col='', normalize=True)
tka.utils.transform_l1000_ids(from_id, to_id, gene_ids, dataset_path='l1000_mapped.csv', ignore_missing=False) -> Dict

Transforms L1000 gene IDs from one format to another.

Args:

from_id (str): The source probe type ("affyID", "entrezID", "ensemblID").
to_id (str): The target probe type ("affyID", "entrezID", "ensemblID").
gene_ids (list): List of L1000 gene IDs to transform.
dataset_path (str): Path to the DataFrame containing L1000 gene IDs for each probe type.
ignore_missing (bool): If set to True, it will not raise an error on missing or invalid probe IDs.

Returns:

dict: Original and transformed L1000 gene IDs as keys and values respectively.

Raises:

ValueError: If either from_id or to_id is not one of the allowed values.
ValueError: If any of the gene IDs in the dataset is not within the scope of L1000.
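The contract described above (validate the two ID types, map each input ID, raise or skip on unknown IDs) can be sketched in plain Python. This is an illustrative stand-in, not the tka implementation: the function name transform_ids_sketch, the in-memory EXAMPLE_ROWS table, and the placeholder IDs are all hypothetical, and skipping unknown IDs when ignore_missing=True is an assumption about behavior the docstring does not spell out.

```python
# Illustrative stand-in, NOT the tka implementation: it only mirrors the
# documented contract of transform_l1000_ids on a tiny in-memory table.
ALLOWED_ID_TYPES = ("affyID", "entrezID", "ensemblID")

# Hypothetical placeholder rows; the real mapping is read from dataset_path.
EXAMPLE_ROWS = [
    {"affyID": "affy_A", "entrezID": "entrez_A", "ensemblID": "ensembl_A"},
    {"affyID": "affy_B", "entrezID": "entrez_B", "ensemblID": "ensembl_B"},
]


def transform_ids_sketch(from_id, to_id, gene_ids, rows=EXAMPLE_ROWS,
                         ignore_missing=False):
    """Map gene_ids from one ID type to another using an in-memory table."""
    # Reject ID types outside the documented set.
    for name in (from_id, to_id):
        if name not in ALLOWED_ID_TYPES:
            raise ValueError(f"{name!r} is not one of {ALLOWED_ID_TYPES}")

    # Build a source-ID -> target-ID lookup from the table rows.
    lookup = {row[from_id]: row[to_id] for row in rows}

    result = {}
    for gene_id in gene_ids:
        if gene_id in lookup:
            result[gene_id] = lookup[gene_id]
        elif not ignore_missing:
            raise ValueError(f"{gene_id!r} is not in the mapping table")
        # Assumption: with ignore_missing=True, unknown IDs are skipped.
    return result


mapping = transform_ids_sketch("affyID", "entrezID", ["affy_A", "affy_B"])
```

Here mapping has the original IDs as keys and the transformed IDs as values, matching the documented return shape.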

tka.utils.transform_moshkov_outputs(identifier_col_vals: List[str], output: List[List], use_full_assay_names: bool = False) -> DataFrame

Transform Moshkov outputs into a Pandas DataFrame.

Args:

identifier_col_vals (List[str]): List of SMILES strings (or any other identifiers) corresponding to the input data points.
output (List[List]): List of lists containing the output data (shape: X, 270).
use_full_assay_names (bool, optional): Whether to use full assay names from the CSV. Defaults to False.

Returns:

DataFrame: Pandas DataFrame with the SMILES strings (or other identifiers) as the first column, followed by the assay data columns.
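The layout described above (one identifier per row, followed by 270 assay values) can be sketched without pandas. This is a minimal sketch under stated assumptions, not the tka code: outputs_to_rows_sketch is a hypothetical name, the assay_0 ... assay_269 column names are placeholders for the real assay names, and the result is a list of dicts rather than a DataFrame.

```python
N_ASSAYS = 270  # documented width of each output row


def outputs_to_rows_sketch(identifier_col_vals, output, identifier_col="SMILES"):
    """Pair each identifier with its row of assay values."""
    if len(identifier_col_vals) != len(output):
        raise ValueError("need exactly one identifier per output row")

    # Placeholder column names; the real assay names come from the package.
    assay_cols = [f"assay_{i}" for i in range(N_ASSAYS)]

    rows = []
    for identifier, values in zip(identifier_col_vals, output):
        if len(values) != N_ASSAYS:
            raise ValueError(f"expected {N_ASSAYS} values, got {len(values)}")
        # Insert the identifier first so it becomes the first column.
        row = {identifier_col: identifier}
        row.update(zip(assay_cols, values))
        rows.append(row)
    return rows


# Two valid SMILES strings with dummy prediction values.
rows = outputs_to_rows_sketch(
    ["CCO", "c1ccccc1"],
    [[0.0] * N_ASSAYS, [1.0] * N_ASSAYS],
)
```

A list of dicts in this shape can be passed to pandas.DataFrame to obtain a table with the identifier column first.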

Module contents