
rank_response_ratio_summarize

Processes a DataFrame to compute and summarize rank response ratios.

This function runs four steps on the input DataFrame: it labels responsive genes, calculates the random expectation, bins genes by binding rank, and computes the rank response. It returns three DataFrames: the labeled and binned input with an added random column, the random expectation, and the rank response summary.

Parameters:

Name | Type | Description | Default
df | DataFrame | DataFrame to process. | required
effect_expression_thres | float | Threshold for effect expression. | 0
p_expression_thres | float | Threshold for expression p-value. | 0.05
normalize | bool | Whether to normalize the data. | False
bin_size | int | Size of each bin for binding rank. | 5
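
To make the roles of the threshold and bin_size parameters more concrete, here is a minimal sketch of how labeling and binning could work. Both helpers (label_responsive_sketch, bin_by_rank_sketch) and the output columns responsive and rank_bin are hypothetical stand-ins, not the library's implementation. The sketch assumes a gene is responsive when its absolute effect exceeds the effect threshold and its p-value is below the p-value threshold, and that a larger binding_signal means stronger binding. The normalize option is not modeled.

```python
import numpy as np
import pandas as pd


def label_responsive_sketch(df, effect_thres=0.0, p_thres=0.05):
    """Hypothetical stand-in: flag genes that pass both expression thresholds."""
    responsive = (df["effect_expression"].abs() > effect_thres) & (
        df["p_expression"] < p_thres
    )
    return df.assign(responsive=responsive)


def bin_by_rank_sketch(df, bin_size=5):
    """Hypothetical stand-in: rank by binding signal, then group ranks into bins."""
    # Assumption: the strongest binding signal gets rank 1; ties keep row order.
    rank = df["binding_signal"].rank(method="first", ascending=False)
    # Label each bin by its upper rank bound: ranks 1..5 -> 5, 6..10 -> 10, ...
    rank_bin = (np.ceil(rank / bin_size) * bin_size).astype(int)
    return df.assign(rank_bin=rank_bin)


genes = pd.DataFrame({
    "gene_id": ["gene1", "gene2", "gene3"],
    "effect_expression": [0.5, -0.7, 1.2],
    "p_expression": [0.04, 0.07, 0.01],
    "binding_signal": [10, 20, 30],
})

labeled = label_responsive_sketch(genes)
binned = bin_by_rank_sketch(labeled, bin_size=2)
print(binned[["gene_id", "responsive", "rank_bin"]])
```

With bin_size=2, the two strongest-bound genes fall into bin 2 and the remaining gene into bin 4. gene2 is not labeled responsive because its p-value of 0.07 is above the 0.05 threshold.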

Returns:

Type | Description
tuple (DataFrame, DataFrame, DataFrame) | A tuple containing three DataFrames:
1. The input DataFrame with additional processing,
2. A DataFrame of random expectations,
3. A DataFrame of rank response calculations.
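
The second and third return values can be pictured with a small sketch. It is an illustration under stated assumptions, not the library's code: the random expectation is taken to be the overall fraction of responsive genes, and the rank response is taken to be the cumulative fraction of responsive genes among the top-ranked genes up to each bin. The column names responsive, rank_bin, n_responsive and response_ratio are hypothetical. Only the random column is confirmed by the source.

```python
import pandas as pd

# Toy input that already carries the (hypothetical) columns produced by the
# labeling and binning steps.
binned = pd.DataFrame({
    "responsive": [True, True, False, True, False, False],
    "rank_bin": [2, 2, 4, 4, 6, 6],
})

# Random expectation: overall fraction of responsive genes, kept as a
# one-row DataFrame so that a single scalar can be extracted from it.
n_responsive_total = int(binned["responsive"].sum())
random_expectation = pd.DataFrame({
    "unresponsive": [len(binned) - n_responsive_total],
    "responsive": [n_responsive_total],
    "random": [n_responsive_total / len(binned)],
})

# Rank response: cumulative count of responsive genes up to each bin,
# divided by the number of genes ranked at or above that bin.
cumulative = binned.groupby("rank_bin")["responsive"].sum().cumsum()
rank_response = pd.DataFrame({
    "rank_bin": cumulative.index.to_numpy(),
    "n_responsive": cumulative.to_numpy(),
    "response_ratio": cumulative.to_numpy() / cumulative.index.to_numpy(),
})

print(random_expectation)
print(rank_response)
```

In this toy data the response ratio falls from 1.0 in the top bin to 0.5 in the last bin, which matches the random expectation because the last bin covers every gene.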

Example

>>> test_df = pd.DataFrame({'gene_id': ['gene1', 'gene2', 'gene3'],
...                         'effect_expression': [0.5, -0.7, 1.2],
...                         'p_expression': [0.04, 0.07, 0.01],
...                         'binding_signal': [10, 20, 30]})
>>> df, random_expectation_df, rank_response_df = \
...     rank_response_ratio_summarize(test_df)
>>> df.shape
(3, x)  # x depends on the processing steps
>>> random_expectation_df.shape
(y, z)  # y and z depend on the structure of random expectations
>>> rank_response_df.shape
(a, b)  # a and b depend on the structure of rank response calculations

Source code in callingcardstools/Analysis/yeast/rank_response/rank_response_ratio_summarize.py
def rank_response_ratio_summarize(
        df: pd.DataFrame,
        effect_expression_thres: float = 0,
        p_expression_thres: float = 0.05,
        normalize: bool = False,
        bin_size: int = 5) -> (pd.DataFrame, pd.DataFrame, pd.DataFrame):
    """
    Processes a DataFrame to compute and summarize rank response ratios.

    This function applies several processing steps on the input DataFrame,
    including labeling responsive genes, calculating random expectations,
    binning by binding rank, and computing rank responses. It returns three
    DataFrames containing various processed results.

    Args:
        df (pd.DataFrame): DataFrame to process.
        effect_expression_thres (float, optional): Threshold for effect
            expression. Defaults to 0.
        p_expression_thres (float, optional): Threshold for expression p-value.
            Defaults to 0.05.
        normalize (bool, optional): Whether to normalize the data. Defaults to
            False.
        bin_size (int, optional): Size of each bin for binding rank. Defaults
            to 5.

    Returns:
        tuple: A tuple containing three DataFrames:
               1. The input DataFrame with additional processing,
               2. A DataFrame of random expectations,
               3. A DataFrame of rank response calculations.

    Example:
        >>> test_df = pd.DataFrame({'gene_id': ['gene1', 'gene2', 'gene3'],
        ...                         'effect_expression': [0.5, -0.7, 1.2],
        ...                         'p_expression': [0.04, 0.07, 0.01],
        ...                         'binding_signal': [10, 20, 30]})
        >>> df, random_expectation_df, rank_response_df = \
        ...     rank_response_ratio_summarize(test_df)
        >>> df.shape
        (3, x)  # x depends on the processing steps
        >>> random_expectation_df.shape
        (y, z)  # y and z depend on the structure of random expectations
        >>> rank_response_df.shape
        (a, b)  # a and b depend on the structure of rank response calculations
    """
    df_expression_labeled = label_responsive_genes(
        df,
        effect_expression_thres,
        p_expression_thres, normalize)

    random_expectation_df = calculate_random_expectation(df_expression_labeled)

    df_expression_labeled_binding_ranked = \
        bin_by_binding_rank(df_expression_labeled, bin_size)

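    # random_expectation_df is expected to hold a single row; extract that
    # value explicitly, since float() on a Series is deprecated in pandas.
    df_expression_labeled_binding_ranked_with_random = \
        df_expression_labeled_binding_ranked\
        .assign(random=float(random_expectation_df['random'].iloc[0]))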

    rank_response_df = compute_rank_response(
        df_expression_labeled_binding_ranked_with_random)

    return (df_expression_labeled_binding_ranked_with_random,
            random_expectation_df,
            rank_response_df)
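
The function attaches the random expectation to every row by passing a single scalar to assign(), which only works when the random expectation DataFrame holds exactly one value. A minimal sketch of that pattern follows, using made-up data. The value 0.25 and the rank_bin column are placeholders, not output from the library. Recent pandas releases deprecate calling float() directly on a one-element Series, so the sketch extracts the element with .iloc[0] first.

```python
import pandas as pd

# Hypothetical one-row summary, standing in for random_expectation_df.
random_expectation_df = pd.DataFrame({"random": [0.25]})

# Hypothetical binned gene table, standing in for the labeled, ranked data.
binned = pd.DataFrame({
    "gene_id": ["gene1", "gene2", "gene3"],
    "rank_bin": [5, 5, 10],
})

# Pull out the single value explicitly before converting it to a float.
random_value = float(random_expectation_df["random"].iloc[0])

# A scalar passed to assign() is broadcast to every row; assign() returns
# a new DataFrame and leaves the original untouched.
with_random = binned.assign(random=random_value)
print(with_random)
```

Keeping the scalar in its own column lets a downstream step compare each bin's response ratio against the same baseline without a second lookup.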