Today, we are pleased to announce the incorporation of variant co-occurrence (inferred phasing) information in the gnomAD v2 browser. Phase refers to the genetic relationship between a pair of variants; that is, whether the variants are on the same copy of the gene (cis) or on different copies of the gene (trans). We are releasing inferred phasing data for all pairs of variants within a gene where both variants have a global allele frequency in gnomAD exomes <5% and are either coding, flanking intronic (from position -1 to -3 in acceptor sites, and +1 to +8 in donor sites) or in the 5’/3’ UTRs. This encompasses 20,921,100 pairs of variants across 19,685 genes. We envision that this data will be of tremendous help to the medical genetics community in identifying and interpreting co-occurring variants in the context of recessive conditions.
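To make the inclusion criteria above concrete, here is a minimal Python sketch of how eligible variant pairs could be enumerated. It is an illustration only, not the gnomAD pipeline: the `Variant` record, its field names, and the region labels are hypothetical, while the 5% frequency cutoff and the splice-site offsets are taken from the description above.

```python
from dataclasses import dataclass
from itertools import combinations

MAX_AF = 0.05  # global gnomAD exome allele frequency cutoff stated in the text


@dataclass(frozen=True)
class Variant:
    # Hypothetical record; these field names are not the gnomAD schema.
    variant_id: str
    gene: str
    exome_af: float         # global allele frequency in gnomAD exomes
    region: str             # "coding", "utr", "acceptor_intron", "donor_intron", or anything else
    intron_offset: int = 0  # signed position relative to the splice site (intronic variants only)


def is_eligible(v: Variant) -> bool:
    """Return True if a variant meets the frequency and location criteria."""
    if v.exome_af >= MAX_AF:
        return False
    if v.region in ("coding", "utr"):
        return True
    if v.region == "acceptor_intron":
        return -3 <= v.intron_offset <= -1  # positions -1 to -3
    if v.region == "donor_intron":
        return 1 <= v.intron_offset <= 8    # positions +1 to +8
    return False


def eligible_pairs(variants):
    """List every unordered pair of eligible variants that share a gene."""
    by_gene = {}
    for v in variants:
        if is_eligible(v):
            by_gene.setdefault(v.gene, []).append(v)
    pairs = []
    for gene_variants in by_gene.values():
        pairs.extend(combinations(gene_variants, 2))
    return pairs
```

Pairs are formed only within a gene, so a gene with n eligible variants contributes n(n-1)/2 pairs. The sketch says nothing about how phase itself is inferred; it covers only which pairs are in scope.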
Posts by Laurent Francioli
News
The news page highlights new features, versions, or other major announcements. See our changelog for all changes to gnomAD, including minor ones.
gnomAD v3.0
Originally published on the MacArthur Lab blog.
We are thrilled to announce the release of gnomAD v3, a catalog containing 602M SNVs and 105M indels based on the whole-genome sequencing of 71,702 samples mapped to the GRCh38 build of the human reference genome. By increasing the number of whole genomes almost 5-fold from gnomAD v2.1, this release represents a massive leap in analysis power for anyone interested in non-coding regions of the genome or in coding regions poorly captured by exome sequencing.
In addition, gnomAD v3 adds new diversity – for instance, by almost doubling the number of African American samples we had in gnomAD v2 (exomes and genomes combined), and also including our first set of allele frequencies for the Amish population.
gnomAD v2.1
Originally published on the MacArthur Lab blog.
We are delighted to announce the release of gnomAD v2.1! This new release of gnomAD is based on the same underlying callset as gnomAD v2.0.2, but has the following improvements and new features:
- An awesome new browser
- Per-gene loss-of-function constraint
- Improved sample and variant filtering processes
- Allele frequencies in sub-continental populations in Europe and East Asia
- Allele frequencies computed for the following subsets of the data:
  - Controls-only (no cases from common disease case/control studies)
  - Samples not assessed for a neurological phenotype
  - Samples that were not part of a cancer cohort
  - Samples that are not part of the Trans-Omics for Precision Medicine (TOPMed)-BRAVO dataset
- New annotations for each variant
  - Filtering allele frequency using Poisson 95% and 99% CI, per population
  - Age histogram of heterozygous and homozygous carriers
gnomAD v2.1 comprises a total of 16 million SNVs and 1.2 million indels from 125,748 exomes, and 229 million SNVs and 33 million indels from 15,708 genomes. In addition to the 7 populations already present in gnomAD v2.0.2, this release now breaks down the non-Finnish European and East Asian populations further into sub-populations. The population breakdown is detailed below.
The genome Aggregation Database (gnomAD)
Originally published on the MacArthur Lab blog.
Today, we are pleased to announce the formal release of the genome Aggregation Database (gnomAD). This release comprises two callsets: exome sequence data from 123,136 individuals and whole-genome sequence data from 15,496 individuals. Importantly, in addition to an increased number of individuals from each of the populations in ExAC, we now provide allele frequencies across over 5,000 Ashkenazi Jewish (ASJ) individuals.