gnomAD v3.0

Originally published on the MacArthur Lab blog.

We are thrilled to announce the release of gnomAD v3, a catalog containing 602M SNVs and 105M indels based on the whole-genome sequencing of 71,702 samples mapped to the GRCh38 build of the human reference genome. By increasing the number of whole genomes almost 5-fold from gnomAD v2.1, this release represents a massive leap in analysis power for anyone interested in non-coding regions of the genome or in coding regions poorly captured by exome sequencing.

In addition, gnomAD v3 adds new diversity – for instance, by almost doubling the number of African-American samples we had in gnomAD v2 (exomes and genomes combined), and also including our first set of allele frequencies for the Amish population.

gnomAD v2.1

We are delighted to announce the release of gnomAD v2.1! This new release of gnomAD is based on the same underlying callset as gnomAD v2.0.2, but has the following improvements and new features:

  • An awesome new browser
  • Per-gene loss-of-function constraint
  • Improved sample and variant filtering processes
  • Allele frequencies in sub-continental populations in Europe and East Asia
  • Allele frequencies computed for the following subsets of the data:
    • Controls-only (no cases from common disease case/control studies)
    • Samples not assessed for a neurological phenotype
    • Samples that were not part of a cancer cohort
    • Samples that are not part of the Trans-Omics for Precision Medicine (TOPMed)-BRAVO dataset
  • New annotations for each variant
    • Filtering allele frequency using Poisson 95% and 99% CI, per population
    • Age histogram of heterozygous and homozygous carriers
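
To make the "filtering allele frequency" annotation more concrete, here is a minimal sketch of one way such a value can be computed. It assumes the filtering allele frequency is the highest population allele frequency for which the observed allele count (AC) still exceeds the upper bound of a one-sided Poisson confidence interval, given the allele number (AN). The function names `poisson_cdf` and `filtering_af` are hypothetical and chosen for illustration only. This is not the gnomAD production code.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summed in log space for stability."""
    if k < 0:
        return 0.0
    if lam <= 0:
        return 1.0
    log_lam = math.log(lam)
    total = 0.0
    for i in range(k + 1):
        # log of the Poisson pmf: -lam + i*log(lam) - log(i!)
        total += math.exp(-lam + i * log_lam - math.lgamma(i + 1))
    return min(total, 1.0)


def filtering_af(ac, an, ci=0.95):
    """Hypothetical helper (an assumption, not the gnomAD implementation).

    Finds the largest expected count lam with P(X <= ac - 1 | lam) >= ci,
    i.e. the largest frequency at which an observation of `ac` alleles
    would still be surprisingly high, and returns lam / an.
    """
    if ac <= 0 or an <= 0:
        return 0.0
    lo, hi = 0.0, float(ac)  # the CDF at lam = ac is well below ci
    for _ in range(60):      # bisection; the CDF decreases as lam grows
        mid = (lo + hi) / 2.0
        if poisson_cdf(ac - 1, mid) >= ci:
            lo = mid
        else:
            hi = mid
    return lo / an


if __name__ == "__main__":
    # Illustrative numbers only: 10 alleles observed among 1,000 chromosomes.
    print(filtering_af(10, 1000, ci=0.95))
    print(filtering_af(10, 1000, ci=0.99))
```

Under these assumptions, the 99% value is always at most the 95% value, and both sit below the raw allele frequency AC/AN, which is what makes the annotation a conservative threshold for filtering.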

gnomAD v2.1 comprises a total of 16 million SNVs and 1.2 million indels from 125,748 exomes, and 229 million SNVs and 33 million indels from 15,708 genomes. In addition to the 7 populations already present in gnomAD v2.0.2, this release now breaks down the non-Finnish European and East Asian populations further into sub-populations. The population breakdown is detailed below.

The genome Aggregation Database (gnomAD)

Today, we are pleased to announce the formal release of the Genome Aggregation Database (gnomAD). This release comprises two callsets: exome sequence data from 123,136 individuals and whole-genome sequence data from 15,496 individuals. Importantly, in addition to an increased number of individuals in each of the populations in ExAC, we now provide allele frequencies across over 5,000 Ashkenazi Jewish (ASJ) individuals.