Posts by Grace Tiao


The news page highlights new features, versions, or other major announcements. See our changelog for all changes to gnomAD, including minor ones.

Local Ancestry Inference for Latino/Admixed American Samples in gnomAD

We are happy to share, for the first time, local ancestry-informed frequency data for 14,804,207 bi-allelic SNPs within the Latino/Admixed American sample of gnomAD v3.1 (n=7,612). This initial release of gnomAD local ancestry inference (LAI) data contains estimated alternate allele counts, allele numbers, and allele frequencies partitioned by continental ancestries for variants in the Latino/Admixed American population. The samples classified as Latino/Admixed American by gnomAD’s global ancestry inference have highly heterogeneous, admixed ancestry, and local ancestry inference resolves these individuals’ admixed ancestry into Amerindigenous, African, and European haplotypes.
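As a rough illustration of how ancestry-partitioned frequencies relate to the counts, the sketch below computes an allele frequency as alternate allele count divided by allele number for each continental ancestry. The record layout, the key names, and all of the numbers are hypothetical; they are not taken from the gnomAD release files.

```python
from typing import Dict, Optional, Tuple

# Hypothetical per-variant counts keyed by local ancestry:
# (alternate allele count, allele number). Values are made up for illustration.
lai_counts: Dict[str, Tuple[int, int]] = {
    "amerindigenous": (12, 6000),
    "african": (3, 1200),
    "european": (40, 8000),
}


def allele_frequency(ac: int, an: int) -> Optional[float]:
    """Return AC / AN, or None when no haplotypes were called (AN == 0)."""
    return ac / an if an > 0 else None


# One frequency per ancestry, each using only the haplotypes assigned to it
lai_af = {anc: allele_frequency(ac, an) for anc, (ac, an) in lai_counts.items()}

# Pooling the counts across ancestries gives the overall frequency in the sample
total_ac = sum(ac for ac, _ in lai_counts.values())
total_an = sum(an for _, an in lai_counts.values())
overall_af = allele_frequency(total_ac, total_an)
```

In this toy record the pooled frequency hides a more than twofold difference between the European and Amerindigenous haplotypes, which is the kind of signal that partitioning by local ancestry is meant to expose.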

gnomAD v3.1 New Content, Methods, Annotations, and Data Availability

We’re proud to announce the gnomAD v3.1 release of 759,302,267 short nuclear variants (644,267,978 passing variant quality filters) observed in 76,156 genome samples.

In this release, we have included more than 3,000 new samples specifically chosen to increase the ancestral diversity of the resource. As a result, this is the first release for which we have a designated population label for samples of Middle Eastern ancestry, and we are thrilled to be able to include these in the following population breakdown for the v3.1 release:

Population   Description                        Genomes
afr          African/African American            20,744
ami          Amish                                  456
amr          Latino/Admixed American              7,647
asj          Ashkenazi Jewish                     1,736
eas          East Asian                           2,604
fin          Finnish                              5,316
nfe          Non-Finnish European                34,029
mid          Middle Eastern                         158
sas          South Asian                          2,419
oth          Other (population not assigned)      1,047
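As a quick sanity check on the breakdown, the following sketch sums the per-population counts and prints each population's share of the release. The counts are copied from the table above; the variable names are our own.

```python
# Genome counts per population label, copied from the v3.1 table above
genomes = {
    "afr": 20744, "ami": 456, "amr": 7647, "asj": 1736, "eas": 2604,
    "fin": 5316, "nfe": 34029, "mid": 158, "sas": 2419, "oth": 1047,
}

# The counts add up to the 76,156 genomes quoted for the release
total = sum(genomes.values())

# Fraction of the release contributed by each population, largest first
shares = {pop: n / total for pop, n in genomes.items()}
for pop, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{pop}\t{genomes[pop]:>6,}\t{share:.1%}")
```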

Open access to gnomAD data on multiple cloud providers

We’re very pleased to announce that gnomAD data is now available as a free public dataset on Amazon Web Services, Microsoft Azure, and Google Cloud. Researchers may download and read gnomAD data for free in all regions from all three cloud providers.

From our beginnings as a project, we have been committed to making gnomAD data as free and accessible to the world as possible. Working in partnership with Amazon, Microsoft, and Google’s public data hosting programs, we have expanded the number of cloud platforms on which gnomAD data is fully free to access. Researchers will no longer need to maintain personal copies of gnomAD data on these cloud platforms, eliminating long-term storage costs as well as transfer fees associated with copying gnomAD data into private cloud storage.

gnomAD v2.1

Originally published on the MacArthur Lab blog.

We are delighted to announce the release of gnomAD v2.1! This new release of gnomAD is based on the same underlying callset as gnomAD v2.0.2, but has the following improvements and new features:

  • An awesome new browser
  • Per-gene loss-of-function constraint
  • Improved sample and variant filtering processes
  • Allele frequencies in sub-continental populations in Europe and East Asia
  • Allele frequencies computed for the following subsets of the data:
    • Controls-only (no cases from common disease case/control studies)
    • Samples not assessed for a neurological phenotype
    • Samples that were not part of a cancer cohort
    • Samples that are not part of the Trans-Omics for Precision Medicine (TOPMed)-BRAVO dataset
  • New annotations for each variant
    • Filtering allele frequency using Poisson 95% and 99% CI, per population
    • Age histogram of heterozygous and homozygous carriers
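To make the filtering allele frequency annotation more concrete, here is a minimal sketch of one common formulation: given an observed allele count and allele number, find the largest true allele frequency at which seeing that many alleles or more would still be improbable under a Poisson model. This is an illustration under that assumption, not necessarily the exact computation used for the release, and the function names are our own.

```python
import math


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam), summed term by term.

    Adequate for small counts; exp(-lam) underflows for very large lam.
    """
    if k < 0:
        return 0.0
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total


def filtering_af(ac: int, an: int, ci: float = 0.95) -> float:
    """Largest AF at which observing `ac` or more alleles has probability <= 1 - ci.

    Solves P(X <= ac - 1 | lam) = ci for lam by bisection, then returns lam / an.
    """
    if ac <= 0 or an <= 0:
        return 0.0
    lo, hi = 0.0, float(ac)  # the CDF at ac - 1 is 1 at lam = 0 and below ci at lam = ac
    for _ in range(200):
        mid = (lo + hi) / 2
        if poisson_cdf(ac - 1, mid) > ci:
            lo = mid  # CDF still above ci: the solution lies at a larger lam
        else:
            hi = mid
    return lo / an


faf95 = filtering_af(10, 10_000, ci=0.95)
faf99 = filtering_af(10, 10_000, ci=0.99)
```

Under this formulation the 99% value is always the more conservative (smaller) of the two, and both sit below the naive estimate AC / AN.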

gnomAD v2.1 comprises a total of 16 million SNVs and 1.2 million indels from 125,748 exomes, and 229 million SNVs and 33 million indels from 15,708 genomes. In addition to the 7 populations already present in gnomAD v2.0.2, this release now breaks down the non-Finnish European and East Asian populations further into sub-populations. The population breakdown is detailed below.