A copy number variant discovery pipeline for integrated genome-exome sequencing
National Institute of Mental Health
Description
Copy number variants (CNVs) are deletions and duplications of genomic segments spanning more than 50 base pairs. They represent one of the most penetrant sources of pathogenic variants in neuropsychiatric disorders and affect many other human phenotypes as well. However, the relative impact of CNVs at the resolution of individual genes, exons, or functional categories, and especially across diverse global populations, has never been systematically assessed in neuropsychiatric disorders at scale. This omission can be attributed to technical barriers in CNV discovery as well as the lack of large-scale, diverse neuropsychiatric cohorts. Traditional cytogenetic methods for CNV detection, such as chromosomal microarrays (CMA), are relatively low-resolution and have largely precluded gene- and exon-resolution analyses. Recent advances in sequencing, with whole exome sequencing (ES) and whole genome sequencing (GS), have dramatically improved resolution, including the discovery of exon and sub-exon level CNVs. However, neither GS nor ES is perfect. While GS can interrogate the whole spectrum of CNVs across frequency and size, it is expensive. ES, on the other hand, though affordable, can only query the coding portion of the genome for rare CNVs. Promisingly, the blended genome exome (BGE) sequencing approach has recently undergone heavy development and rapid adoption in a number of large-scale, diverse neuropsychiatric sequencing efforts, including the Populations Underrepresented in Mental Illness Association Studies (PUMAS) project and the NeuroDev and Akili studies. BGE combines high-coverage ES (~30x) with a low-coverage GS backbone (~2-3x), at a cost comparable to traditional exome sequencing. With this blend, BGE has succeeded in marrying the affordability of ES with the full range of variant detection of GS when used to detect single nucleotide variants (SNVs) across the entire genome.
Leveraging our expertise in computational methods development for CNV detection and association across GS and ES, we believe that, in addition to SNVs, BGE is the perfect platform to capture the full range of CNVs across the genome at a significantly improved resolution compared to CMA and ES, a significantly lower reference bias compared to CMA, and a dramatically lower cost compared to GS. To achieve this, we will extend our GATK-gCNV pipeline for rare CNV detection, in conjunction with our ancestry-aware structural variant (SV) imputation pipeline, for use with BGE data. Preliminary results have already shown great promise. We will apply this pipeline to the more than 110,000 available BGE samples across PUMAS, NeuroDev, and Akili to generate a large-scale, diverse CNV callset. These variants will be made publicly available and can immediately be leveraged to significantly advance our understanding of the genetic architecture of neuropsychiatric conditions, especially in the context of diverse genetic ancestry groups.
Project Number: 1R21MH138855-01 | Fiscal Year: 2025 | NIH Institute/Center: National Institute of Mental Health (NIMH) | Principal Investigator: Harrison Brand | Institution: MASSACHUSETTS GENERAL HOSPITAL, BOSTON, MA | Award Amount: $206,250 | Activity Code: R21 | Study Section: Genetics of Health and Disease Study Section [GHD]
View on NIH RePORTER: https://reporter.nih.gov/project-details/11037424