Basic correlated meta-analysis with corrmeta
Woo Jung
Introduction
Meta-analysis is a common tool for integrating findings across multiple OMIC scans, particularly when investigators have access only to summary results from each study. Traditional meta-analysis techniques often overlook hidden non-independence among study elements, such as overlapping or related subjects, which can bias the aggregated results. The corrmeta package implements correlated meta-analysis, which accounts for such dependencies between studies with potentially related subjects (Province 2005; Borecki and Province 2008; Province and Borecki 2013). This vignette covers basic usage of the corrmeta package.
Installation
Bioconductor installation (Recommended)
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("corrmeta")
Try this first before other installation methods.
Install from GitHub
BiocManager::install("wsjung/corrmeta")
Simple example
Preprocessing
Load data
This loads trt1, trt2, and trt3, which are short, simulated SNP-trait association datasets. Although the examples use SNP datasets, corrmeta works for any OMIC unit of inference that is common to all input datasets. corrmeta requires that the input be a single dataframe in which the OMIC units of inference are in the column markname and each scan has its own column.
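The required layout can be sketched in base R. The snippet below is only an illustration under stated assumptions: the three small data frames, their pval column, and the marker names m1 to m3 are made up and are not the package's bundled data, and the object names combined and scan_cols are hypothetical.

```r
# Hypothetical per-scan results: one row per marker, one p-value column each.
scan1 <- data.frame(markname = c("m1", "m2", "m3"), pval = c(0.10, 0.50, 0.90))
scan2 <- data.frame(markname = c("m1", "m2", "m3"), pval = c(0.20, 0.40, 0.80))
scan3 <- data.frame(markname = c("m2", "m3"),       pval = c(0.30, 0.70))

# Rename each p-value column after its scan so every scan gets its own column.
scans <- list(trt1 = scan1, trt2 = scan2, trt3 = scan3)
renamed <- lapply(names(scans), function(nm) {
  setNames(scans[[nm]], c("markname", nm))
})

# Full outer join on markname; a marker absent from a scan becomes NA there.
combined <- Reduce(
  function(x, y) merge(x, y, by = "markname", all = TRUE),
  renamed
)

# Names of the scan columns, in the order they should be analysed.
scan_cols <- names(scans)
combined
```

The outer join keeps markers that appear in only some scans, which is the situation covered in the missing-samples example of this vignette.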
Correlated meta-analysis
With preprocessing complete, we can now run the function tetracorr, which takes the input dataframe data and varlist, the list of scans given as column names in data. Briefly, tetracorr computes the z-scores of the input p-values using the complement probit transformation, then calculates the tetrachoric (two-category polychoric) correlations between scans.
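The complement probit transformation mentioned above maps a p-value p to the standard normal quantile of 1 - p. A minimal base R sketch of that step alone follows; the p-values are arbitrary illustrative numbers, and the snippet does not reproduce the correlation step of tetracorr.

```r
# Illustrative p-values (not from the example data).
p <- c(0.025, 0.5, 0.975)

# Complement probit: z = qnorm(1 - p), computed via the upper tail.
z <- qnorm(p, lower.tail = FALSE)

# Small p-values give positive z-scores, p = 0.5 gives 0,
# and large p-values give negative z-scores.
round(z, 6)
```

Using lower.tail = FALSE instead of 1 - p avoids loss of precision for very small p-values.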
tc <- tetracorr(snp_example, varlist)
tc
## $sigma
## # A tibble: 3 × 4
## row trt1 trt2 trt3
## <chr> <dbl> <dbl> <dbl>
## 1 trt1 1 0.215 -0.215
## 2 trt2 0.215 1 0.127
## 3 trt3 -0.215 0.127 1
##
## $sum_sigma
## [1] 3.253552
tetracorr returns an object with two elements. sigma is the table of tetrachoric correlation coefficients between each pair of input scans. sum_sigma is the sum of all entries of sigma, that is, of the pairwise tetrachoric correlation coefficients together with the unit diagonal.
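As a quick arithmetic check on the printed output, the sketch below rebuilds the correlation table by hand and sums its entries. The values are copied from the rounded $sigma printout, so the result agrees with the printed sum_sigma only up to rounding; the matrix is typed in manually and is not produced by the package here.

```r
# Correlations copied (rounded to three decimals) from the printed $sigma table.
scans <- c("trt1", "trt2", "trt3")
sigma <- matrix(
  c( 1.000, 0.215, -0.215,
     0.215, 1.000,  0.127,
    -0.215, 0.127,  1.000),
  nrow = 3, byrow = TRUE, dimnames = list(scans, scans)
)

# Sum over every entry of the matrix, diagonal included.
total <- sum(sigma)
total  # about 3.254, versus the printed sum_sigma of 3.253552
```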
Fisher’s method
The final correlated meta-analysis p-value can be computed using Fisher's method. fishp takes the input dataframe, the list of scans, and the outputs from tetracorr.
fishp(snp_example, varlist, tc$sigma, tc$sum_sigma)
## markname trt1 trt2 trt3 num_obs sum_sigma_var sum_chisq
## 1 c01b000015585s 0.35580 0.7356 0.69200 3 3.253552 3.417249
## 2 c01b000015644s 0.58850 0.4539 0.71640 3 3.253552 3.307147
## 3 c01b000015647s 0.18840 0.3029 0.21110 3 3.253552 8.837928
## 4 c01b000015717s 0.99820 0.2474 0.20290 3 3.253552 5.987185
## 5 c01b000015721s 0.74750 0.2206 0.19540 3 3.253552 6.870263
## 6 c01b000016805s 0.08051 0.1532 0.79100 3 3.253552 9.259684
## 7 c01b000016809s 0.07062 0.2896 0.85790 3 3.253552 8.085928
## 8 c01b000016856s 0.74300 0.5204 0.31930 3 3.253552 4.183682
## 9 c01b000016946s 0.77860 0.6758 0.80840 3 3.253552 1.709628
## 10 c01b000016963s 0.82460 0.7960 0.30990 3 3.253552 3.185037
## 11 c01b000016968s 0.13200 0.5866 0.25170 3 3.253552 7.875766
## 12 c01b000016977s 0.82080 0.7761 0.21520 3 3.253552 3.974274
## 13 c01b000016993s 0.18290 0.6209 0.06663 3 3.253552 9.768003
## 14 c01b000017041s 0.76820 0.8736 0.54980 3 3.253552 1.994077
## 15 c01b000017101s 0.24760 0.3189 0.10090 3 3.253552 9.664888
## 16 c01b000017147s 0.03534 0.9412 0.99310 3 3.253552 6.820527
## 17 c01b000017181s 0.84080 0.7264 0.76440 3 3.253552 1.523440
## 18 c01b000017375s 0.97000 0.2214 0.03283 3 3.253552 9.909312
## 19 c01b000017379s 0.56130 0.5311 0.05570 3 3.253552 8.196160
## sum_z pvalue meta_z meta_p meta_nlog10p
## 1 -0.7616582 0.7549448 -0.4222612 0.66358283 0.17810486
## 2 -0.6800542 0.7694257 -0.3770202 0.64692071 0.18914894
## 3 2.2024960 0.1829002 1.2210578 0.11103206 0.95455159
## 4 -1.3972360 0.4246272 -0.7746239 0.78071902 0.10750524
## 5 0.9616926 0.3330121 0.5331598 0.29696150 0.52729986
## 6 1.6145585 0.1594917 0.8951069 0.18536498 0.73197231
## 7 0.9548107 0.2318753 0.5293445 0.29828326 0.52537112
## 8 -0.2341224 0.6518348 -0.1297968 0.55163641 0.25834708
## 9 -2.0954750 0.9443755 -1.1617257 0.87732654 0.05683873
## 10 -1.2643232 0.7852901 -0.7009374 0.75832894 0.12014237
## 11 1.5673289 0.2473471 0.8689229 0.19244465 0.71569416
## 12 -0.8889986 0.6801580 -0.4928584 0.68894370 0.16181627
## 13 2.0978928 0.1347681 1.1630661 0.12240135 0.91221381
## 14 -2.0016627 0.9202425 -1.1097164 0.86643938 0.06226182
## 15 2.4292787 0.1394923 1.3467855 0.08902466 1.05048968
## 16 -2.2198268 0.3377643 -1.2306660 0.89077610 0.05023145
## 17 -2.3202403 0.9579223 -1.2863350 0.90083691 0.04535383
## 18 0.7274177 0.1285234 0.4032784 0.34337171 0.46423549
## 19 1.3596307 0.2240816 0.7537756 0.22549200 0.64686886
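The columns of the table above can be checked by hand for a single marker. The sketch below reproduces the first row (c01b000015585s) in base R. The formulas are reconstructed from the printed numbers rather than taken from the package source, so treat them as an assumption about what each column means, not as the implementation of fishp.

```r
# First row (c01b000015585s) of the fishp output above.
p         <- c(trt1 = 0.35580, trt2 = 0.7356, trt3 = 0.69200)
sum_sigma <- 3.253552  # printed sum_sigma_var for this row

# Fisher's statistic and its chi-square p-value on 2k degrees of freedom.
sum_chisq <- -2 * sum(log(p))
pvalue    <- pchisq(sum_chisq, df = 2 * length(p), lower.tail = FALSE)

# Sum of complement-probit z-scores, scaled by the square root of the
# summed correlations, then converted to an upper-tail p-value.
sum_z        <- sum(qnorm(p, lower.tail = FALSE))
meta_z       <- sum_z / sqrt(sum_sigma)
meta_p       <- pnorm(meta_z, lower.tail = FALSE)
meta_nlog10p <- -log10(meta_p)

# Compare with row 1 of the printed table.
round(c(sum_chisq = sum_chisq, sum_z = sum_z, pvalue = pvalue,
        meta_z = meta_z, meta_p = meta_p, meta_nlog10p = meta_nlog10p), 4)
```

If the scans were independent, sum_sigma would equal the number of scans (3); the larger value here slightly shrinks meta_z toward zero.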
Example with missing samples
This example shows how corrmeta handles samples that are missing from some of the scans. This is possible because of a basic property of the multivariate normal (MVN) distribution: the marginal distribution over any subset of its dimensions is also MVN (see Province and Borecki 2013 for details). The example datasets are the same as above, but with some samples removed.
Preprocessing
## markname trt1 trt2 trt3
## 1 c01b000015585s 0.35580 NA NA
## 2 c01b000015644s 0.58850 0.4539 NA
## 3 c01b000015647s 0.18840 0.3029 0.21110
## 4 c01b000015717s 0.99820 0.2474 0.20290
## 5 c01b000015721s 0.74750 0.2206 0.19540
## 6 c01b000016805s 0.08051 0.1532 0.79100
## 7 c01b000016809s 0.07062 0.2896 0.85790
## 8 c01b000016856s 0.74300 0.5204 0.31930
## 9 c01b000016946s 0.77860 0.6758 0.80840
## 10 c01b000016963s 0.82460 0.7960 0.30990
## 11 c01b000016968s 0.13200 0.5866 0.25170
## 12 c01b000016977s 0.82080 0.7761 0.21520
## 13 c01b000016993s 0.18290 0.6209 0.06663
## 14 c01b000017041s 0.76820 0.8736 0.54980
## 15 c01b000017101s 0.24760 0.3189 0.10090
## 16 c01b000017147s 0.03534 0.9412 0.99310
## 17 c01b000017181s 0.84080 0.7264 0.76440
## 18 c01b000017375s 0.97000 0.2214 0.03283
## 19 c01b000017379s 0.56130 0.5311 0.05570
We can see that trt2_missing is missing c01b000015585s and trt3_missing is missing both c01b000015585s and c01b000015644s.
Correlated meta-analysis
tc <- tetracorr(snp_example_missing, varlist)
tc
## $sigma
## # A tibble: 3 × 4
## row trt1 trt2 trt3
## <chr> <dbl> <dbl> <dbl>
## 1 trt1 1 0.319 -0.212
## 2 trt2 0.319 1 0.192
## 3 trt3 -0.212 0.192 1
##
## $sum_sigma
## [1] 3.597483
Fisher’s method
fishp(snp_example_missing, varlist, tc$sigma, tc$sum_sigma)
## markname trt1 trt2 trt3 num_obs sum_sigma_var sum_chisq
## 1 c01b000015585s 0.35580 NA NA 1 1.000000 2.066773
## 2 c01b000015644s 0.58850 0.4539 NA 2 2.637578 2.640113
## 3 c01b000015647s 0.18840 0.3029 0.21110 3 3.597483 8.837928
## 4 c01b000015717s 0.99820 0.2474 0.20290 3 3.597483 5.987185
## 5 c01b000015721s 0.74750 0.2206 0.19540 3 3.597483 6.870263
## 6 c01b000016805s 0.08051 0.1532 0.79100 3 3.597483 9.259684
## 7 c01b000016809s 0.07062 0.2896 0.85790 3 3.597483 8.085928
## 8 c01b000016856s 0.74300 0.5204 0.31930 3 3.597483 4.183682
## 9 c01b000016946s 0.77860 0.6758 0.80840 3 3.597483 1.709628
## 10 c01b000016963s 0.82460 0.7960 0.30990 3 3.597483 3.185037
## 11 c01b000016968s 0.13200 0.5866 0.25170 3 3.597483 7.875766
## 12 c01b000016977s 0.82080 0.7761 0.21520 3 3.597483 3.974274
## 13 c01b000016993s 0.18290 0.6209 0.06663 3 3.597483 9.768003
## 14 c01b000017041s 0.76820 0.8736 0.54980 3 3.597483 1.994077
## 15 c01b000017101s 0.24760 0.3189 0.10090 3 3.597483 9.664888
## 16 c01b000017147s 0.03534 0.9412 0.99310 3 3.597483 6.820527
## 17 c01b000017181s 0.84080 0.7264 0.76440 3 3.597483 1.523440
## 18 c01b000017375s 0.97000 0.2214 0.03283 3 3.597483 9.909312
## 19 c01b000017379s 0.56130 0.5311 0.05570 3 3.597483 8.196160
## sum_z pvalue meta_z meta_p meta_nlog10p
## 1 0.3697081 0.9134561 0.36970809 0.3558000 0.44879406
## 2 -0.1078742 0.8524690 -0.06642244 0.5264792 0.27861874
## 3 2.2024960 0.1829002 1.16122324 0.1227756 0.91088807
## 4 -1.3972360 0.4246272 -0.73666555 0.7693371 0.11388331
## 5 0.9616926 0.3330121 0.50703373 0.3060656 0.51418552
## 6 1.6145585 0.1594917 0.85124462 0.1973167 0.70483607
## 7 0.9548107 0.2318753 0.50340539 0.3073396 0.51238142
## 8 -0.2341224 0.6518348 -0.12343648 0.5491193 0.26033332
## 9 -2.0954750 0.9443755 -1.10479851 0.8653765 0.06279488
## 10 -1.2643232 0.7852901 -0.66658985 0.7474829 0.12639873
## 11 1.5673289 0.2473471 0.82634374 0.2043046 0.68972193
## 12 -0.8889986 0.6801580 -0.46870728 0.6803606 0.16726087
## 13 2.0978928 0.1347681 1.10607323 0.1343474 0.87177070
## 14 -2.0016627 0.9202425 -1.05533782 0.8543646 0.06835677
## 15 2.4292787 0.1394923 1.28079001 0.1001337 0.99941966
## 16 -2.2198268 0.3377643 -1.17036060 0.8790721 0.05597552
## 17 -2.3202403 0.9579223 -1.22330167 0.8893921 0.05090673
## 18 0.7274177 0.1285234 0.38351686 0.3506683 0.45510351
## 19 1.3596307 0.2240816 0.71683887 0.2367368 0.62573429
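The sum_sigma_var column above changes with num_obs, which is consistent with summing only the part of sigma that belongs to the scans observed for each marker. The sketch below illustrates that reading in base R. The helper sub_sigma_sum is hypothetical, the correlations are copied from the rounded $sigma printout, and the logic is inferred from the printed output rather than from the package source.

```r
# Correlations copied (rounded) from the printed $sigma of the missing-data run.
scans <- c("trt1", "trt2", "trt3")
sigma <- matrix(
  c( 1.000, 0.319, -0.212,
     0.319, 1.000,  0.192,
    -0.212, 0.192,  1.000),
  nrow = 3, byrow = TRUE, dimnames = list(scans, scans)
)

# Hypothetical helper: sum the sub-matrix of sigma restricted to the scans
# that have a non-missing p-value for one marker.
sub_sigma_sum <- function(p_row, sigma) {
  observed <- names(p_row)[!is.na(p_row)]
  sum(sigma[observed, observed, drop = FALSE])
}

# First three markers of the example with missing samples.
row1 <- c(trt1 = 0.35580, trt2 = NA,     trt3 = NA)
row2 <- c(trt1 = 0.58850, trt2 = 0.4539, trt3 = NA)
row3 <- c(trt1 = 0.18840, trt2 = 0.3029, trt3 = 0.21110)

sums <- sapply(list(row1, row2, row3), sub_sigma_sum, sigma = sigma)
sums  # compare with sum_sigma_var: 1.000000, 2.637578, 3.597483
```

Because the sub-matrix of an MVN correlation matrix is itself a valid correlation matrix for the observed scans, no imputation of the missing p-values is needed in this sketch.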
Session info
## R version 4.3.2 (2023-10-31)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.3 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
##
## locale:
## [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
## [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
## [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
## [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
##
## time zone: UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] corrmeta_0.99.0 dplyr_1.1.4 magrittr_2.0.3 BiocStyle_2.30.0
##
## loaded via a namespace (and not attached):
## [1] jsonlite_1.8.8 compiler_4.3.2 BiocManager_1.30.22
## [4] tidyselect_1.2.0 stringr_1.5.1 parallel_4.3.2
## [7] tidyr_1.3.1 jquerylib_0.1.4 systemfonts_1.0.5
## [10] textshaping_0.3.7 yaml_2.3.8 fastmap_1.1.1
## [13] R6_2.5.1 generics_0.1.3 knitr_1.45
## [16] admisc_0.34 tibble_3.2.1 bookdown_0.37
## [19] desc_1.4.3 bslib_0.6.1 pillar_1.9.0
## [22] rlang_1.1.3 utf8_1.2.4 cachem_1.0.8
## [25] stringi_1.8.3 xfun_0.42 fs_1.6.3
## [28] sass_0.4.8 memoise_2.0.1 cli_3.6.2
## [31] withr_3.0.0 pkgdown_2.0.7 digest_0.6.34
## [34] mvtnorm_1.2-4 lifecycle_1.0.4 vctrs_0.6.5
## [37] evaluate_0.23 glue_1.7.0 ragg_1.2.7
## [40] fansi_1.0.6 polycor_0.8-1 rmarkdown_2.25
## [43] purrr_1.0.2 tools_4.3.2 pkgconfig_2.0.3
## [46] htmltools_0.5.7