Supplementary Material for: Meta-Analysis for Penalized Regression Methods with Multi-Cohort Genome-Wide Association Studies

Objective: Penalized regression has been successfully applied in genome-wide association studies. While meta-analysis is often conducted to increase power and protect patients' confidentiality, methods for meta-analyzing the results of penalized regression in a multi-cohort setting are still under development. Methods: We propose a data-splitting method to obtain valid p values (or, equivalently, coefficient estimates and standard errors) for meta-analysis across multiple cohorts. We examine two ways of splitting data in a multi-cohort setting and propose three methods for conducting meta-analysis based on p values. We compare the three meta-analysis methods with mega-analysis, which consists of pooling individual-level data. We also apply the proposed meta-analysis approaches to the Framingham Heart Study data, dividing the original dataset into four parts to create a multi-cohort scenario. Results: The simulations suggest that splitting cohorts performs better than splitting data within each cohort. The real data application also shows that this approach provides results similar to those of the mega-analysis. Conclusion: After comparing the three proposed meta-analysis methods, we recommend splitting cohorts rather than splitting data within each cohort to obtain valid p values for meta-analysis of penalized regression results in a multi-cohort setting.
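
To illustrate what combining per-cohort results can look like, the following minimal sketch applies two standard combination rules to a single variant: Fisher's method for p values, and fixed-effect inverse-variance weighting for coefficient estimates and standard errors. These are generic textbook techniques chosen for illustration and are not necessarily any of the three methods proposed here. The function names are hypothetical, and the sketch assumes that the per-cohort p values are independent and already valid (for example, computed on data that was not used for variable selection).

```python
import math
from statistics import NormalDist


def fisher_combine(p_values):
    """Combine independent p values (each in (0, 1]) with Fisher's method."""
    k = len(p_values)
    # Under the null, -2 * sum(log p) is chi-square with 2k degrees of freedom.
    half_stat = -sum(math.log(p) for p in p_values)  # statistic divided by 2
    # Closed-form chi-square survival function for even degrees of freedom:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def inverse_variance_combine(betas, ses):
    """Fixed-effect inverse-variance meta-analysis of estimates and standard errors."""
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    z = beta / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))  # two-sided normal p value
    return beta, se, p


# Hypothetical per-cohort results for one variant (illustrative numbers only).
combined_p = fisher_combine([0.04, 0.20, 0.01, 0.15])
beta, se, p = inverse_variance_combine([0.12, 0.08, 0.15, 0.10],
                                       [0.05, 0.06, 0.05, 0.07])
```

Both functions need only summary statistics from each cohort, which is what allows a meta-analysis to proceed without sharing individual-level data.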