The function is used by runGprimeAnalysis_local to calculate the G statistic. G is defined by the equation:
$$G = 2\sum_{i=1}^{4} n_i \ln\frac{obs(n_i)}{exp(n_i)}$$
Where for each SNP, \(n_i\) from i = 1 to 4 corresponds to the reference
and alternate allele depths for each bulk, as described in the following
table:
| Allele | High Bulk | Low Bulk |
|---|---|---|
| Reference | \(n_1\) | \(n_2\) |
| Alternate | \(n_3\) | \(n_4\) |
...and \(obs(n_i)\) are the observed allele depths as described in the data frame. Method 1 calculates the G statistic using expected values assuming read depth is equal for all alleles in both bulks:
$$exp(n_1) = \frac{(n_1 + n_2)(n_1 + n_3)}{n_1 + n_2 + n_3 + n_4}$$
$$exp(n_2) = \frac{(n_2 + n_1)(n_2 + n_4)}{n_1 + n_2 + n_3 + n_4}$$
etc.
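The calculation described above can be sketched in a few lines of R. This is an illustrative sketch only, not the package's own implementation: the name `g_statistic_sketch` is hypothetical, the argument names are borrowed from the Arguments list, and the expected values for \(n_3\) and \(n_4\) are assumed to follow the same pattern as the two shown (row total times column total, divided by the grand total). Terms with an observed depth of zero are treated as contributing zero, which is a convention assumed here rather than one stated in this documentation.

```r
# Hypothetical helper, for illustration only (not the package's function).
g_statistic_sketch <- function(LowRef, HighRef, LowAlt, HighAlt) {
  # Map the arguments onto the table:
  # n1 = high bulk reference, n2 = low bulk reference,
  # n3 = high bulk alternate, n4 = low bulk alternate
  n1 <- HighRef; n2 <- LowRef; n3 <- HighAlt; n4 <- LowAlt
  total <- n1 + n2 + n3 + n4

  # Expected depth = (row total * column total) / grand total
  e1 <- (n1 + n2) * (n1 + n3) / total
  e2 <- (n2 + n1) * (n2 + n4) / total
  e3 <- (n3 + n4) * (n3 + n1) / total  # assumed, by the same pattern
  e4 <- (n4 + n3) * (n4 + n2) / total  # assumed, by the same pattern

  obs <- cbind(n1, n2, n3, n4)
  expd <- cbind(e1, e2, e3, e4)

  # Assumed convention: a term with zero observed depth contributes 0
  terms <- ifelse(obs > 0, obs * log(obs / expd), 0)

  # One G value per SNP (one per row)
  2 * rowSums(terms)
}

# Two made-up SNPs: the first has equal depths everywhere,
# the second has allele depths skewed in opposite directions.
g <- g_statistic_sketch(
  LowRef  = c(10, 10),
  HighRef = c(10, 30),
  LowAlt  = c(10, 30),
  HighAlt = c(10, 10)
)
print(g)
```

When all four depths are equal, every observed value matches its expected value and G is 0. The more the allele depths differ between the bulks, the larger G becomes.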
Arguments
- LowRef
A vector of the reference allele depth in the low bulk
- HighRef
A vector of the reference allele depth in the high bulk
- LowAlt
A vector of the alternate allele depth in the low bulk
- HighAlt
A vector of the alternate allele depth in the high bulk
See also
The Statistics of Bulk Segregant Analysis Using Next Generation Sequencing
tricubeStat for G prime calculation