R/top_500_pca_for_sex_mislabels.R
top_500_pca_for_sex_mislabels.Rd
The variation in the top 500 most variable genes tends to be driven by the expression of genes related to biological sex. Because of this, the first two PCs can be used to infer the sex of the samples. In practice, the samples separate more clearly in the PCs than they do when looking directly at sex chromosome expression.
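As a rough illustration of the idea described above, the sketch below selects the 500 most variable genes from an expression matrix and computes the first two PCs with base R. It is not this function's implementation: the matrix `expr` is synthetic random data standing in for variance-stabilized expression values (genes in rows, samples in columns), and all object names are assumptions made for the example.

```r
# Synthetic stand-in for a variance-stabilized expression matrix
# (genes in rows, samples in columns). Values are random, for illustration only.
set.seed(1)
n_genes <- 2000
n_samples <- 12
expr <- matrix(
  rnorm(n_genes * n_samples),
  nrow = n_genes,
  dimnames = list(
    paste0("gene", seq_len(n_genes)),
    paste0("sample", seq_len(n_samples))
  )
)

# Rank genes by their variance across samples and keep the 500 most variable.
gene_var <- apply(expr, 1, var)
top_genes <- order(gene_var, decreasing = TRUE)[seq_len(min(500, n_genes))]

# prcomp() expects observations (samples) in rows, so transpose the subset.
pca <- prcomp(t(expr[top_genes, ]), center = TRUE, scale. = FALSE)

# Scores of each sample on the first two principal components.
pcs <- as.data.frame(pca$x[, 1:2])
head(pcs)
```

With real data, plotting `pcs$PC1` against `pcs$PC2` is where the two sex groups would be expected to form separate clusters.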
top_500_pca_for_sex_mislabels(
vst,
slope,
intercept = 0,
inferred_sex_direction = "up"
)
The object returned by vst(dds).
The slope of the line used to separate the sex groups. The groups
separate cleanly when plotted on PC1 and PC2, so a straight line
can be drawn between them. See also intercept.
The intercept of the line used to separate the sex groups.
See also slope.
One of c("up", "down"). The default, "up", is an arbitrary designation. Changing this value flips the inferred labels, which is needed because the direction (sign) of the PCs is arbitrary.
A list containing the data and the plot.
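The roles of slope, intercept and inferred_sex_direction can be sketched as follows. This is a minimal illustration under stated assumptions, not the function's actual code: `classify_by_line()` is a hypothetical helper, the PC scores are made-up values, and the labels "group_A" and "group_B" are placeholders, since which side of the line corresponds to which sex has to be checked against samples of known sex.

```r
# Made-up PC scores for six samples (illustrative values only).
pcs <- data.frame(
  PC1 = c(-4, -3, -2, 2, 3, 4),
  PC2 = c(5, 4, 6, -5, -4, -6),
  row.names = paste0("sample", 1:6)
)

# Hypothetical helper: label each sample by the side of the line
# PC2 = slope * PC1 + intercept on which it falls.
classify_by_line <- function(pcs, slope, intercept = 0,
                             inferred_sex_direction = c("up", "down")) {
  inferred_sex_direction <- match.arg(inferred_sex_direction)
  above <- pcs$PC2 > slope * pcs$PC1 + intercept
  # The sign of a PC is arbitrary, so "down" simply swaps the two labels.
  if (inferred_sex_direction == "down") {
    above <- !above
  }
  ifelse(above, "group_A", "group_B")
}

labels_up <- classify_by_line(pcs, slope = 1)
labels_down <- classify_by_line(pcs, slope = 1, inferred_sex_direction = "down")
```

Comparing the inferred labels with the recorded sex in the sample metadata then highlights samples that may be mislabeled.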