News

The news page highlights new features, versions, or other major announcements. See our changelog for all changes to gnomAD, including minor ones.

gnomAD 2024 user survey results

September 08, 2025 in Announcements

Katherine Chao, Samantha Baxter, Joshua Nadeau

Thank you to everyone who completed the 2024 gnomAD user survey. Your feedback is invaluable in helping us improve gnomAD for all users. In…

Mitochondrial Genome Constraint Metrics in gnomAD

February 26, 2025 in Announcements / Release

Nicole Lake

We’re excited to announce the addition of mitochondrial genome (mtDNA) constraint metrics to gnomAD, developed using data from over 56,000 individuals in gnomAD v4.1^1,2. These new metrics are designed to identify regions of the mtDNA under strong selective pressure that are thus most likely to harbor functionally important and disease-associated variants. The gnomAD browser now features mitochondrial gene constraint and regional constraint metrics.

gnomAD toolbox

January 27, 2025 in Announcements

Qin He, Julia Goodrich, Katherine Chao

We have released a new utility, the gnomAD toolbox, to enable easier analysis of gnomAD data. Community feedback has highlighted challenges…

Local Ancestry Inference for African/African American Samples in gnomAD

October 11, 2024 in Announcements / Release

Pragati Kore*, Michael Wilson*, Katherine Chao, Elizabeth Atkinson

Announcement

We have now released an extension of local ancestry-informed frequency work from the inferred Admixed American genetic ancestry group to the inferred African/African American genetic ancestry group (n=20,805) within gnomAD v4.0. This implementation of local ancestry inference (LAI) resolves the admixed ancestries of African/African American samples into their respective African and European haplotypes, leading to better-informed functional and/or clinical classifications.

GeniE, the Genetic Prevalence Estimator

June 04, 2024 in Announcements / Release

Samantha Baxter, Riley Grant, Josephine Lee, Nick Watts, Chan Zuckerberg Initiative Rare as One Network, Anne O'Donnell-Luria, Heidi Rehm

Overview

Today we announce the release of a new tool, the Genetic Prevalence Estimator (GeniE, https://genie.broadinstitute.org), which uses gnomAD allele frequencies to estimate the genetic prevalence of autosomal recessive diseases. This tool was developed in partnership with the Chan Zuckerberg Initiative Rare as One Network. By removing the need for computational expertise, GeniE makes estimating the genetic prevalence of rare recessive disease more accessible to the entire genomics community.
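As a rough illustration of how allele frequencies can be turned into a genetic prevalence estimate for an autosomal recessive disease, the sketch below applies the textbook Hardy-Weinberg calculation. It is a simplified assumption for illustration, not a description of GeniE's actual implementation; the function name and the example frequencies are hypothetical.

```python
def genetic_prevalence(allele_frequencies):
    """Estimate carrier frequency and genetic prevalence for a recessive disease.

    Assumes Hardy-Weinberg equilibrium and that the listed variants are the
    only causal alleles (both are simplifying assumptions).
    """
    # Cumulative frequency q of all disease-causing alleles in the gene.
    q = sum(allele_frequencies)
    if not 0.0 <= q <= 1.0:
        raise ValueError("cumulative allele frequency must lie in [0, 1]")
    carrier_frequency = 2.0 * q * (1.0 - q)  # heterozygous carriers: 2pq
    prevalence = q * q                       # affected individuals: q^2
    return carrier_frequency, prevalence


# Hypothetical allele frequencies for two pathogenic variants in one gene.
example_afs = [0.001, 0.002]
carrier, prevalence = genetic_prevalence(example_afs)
print(f"carrier frequency: about 1 in {1 / carrier:.0f}")
print(f"genetic prevalence: about 1 in {1 / prevalence:.0f}")
```

In practice the result depends heavily on which variants are included, so curating the variant list matters more than the arithmetic.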

gnomAD v4.1

April 19, 2024 in Announcements / Release

Katherine Chao, Michael Wilson, Julia Goodrich, gnomAD Production Team

We have released gnomAD v4.1, an update to our latest major release. This update fixes the allele number issue in gnomAD v4.0 previously…

gnomAD v4.0 Gene Constraint

March 08, 2024 in Announcements

Katherine Chao, Kristen Laricchia, Julia Goodrich, Konrad Karczewski, Kaitlin Samocha

We updated our gene constraint metrics following the release of gnomAD v4.0. gnomAD v4.0 expanded the scale of our constraint calculations…

gnomAD v4.0

November 01, 2023 in Announcements / Release

Katherine Chao, gnomAD Production Team

Today, we are delighted to announce the release of gnomAD v4, which includes data from 807,162 total individuals. This release is nearly 5x…

Genetic Ancestry

November 01, 2023 in Announcements

Katherine Chao, gnomAD Production Team

A critical component to the medical and functional interpretation of genetic variants involves the accurate estimation of their frequency. A…

Structural Variants in gnomAD v4

November 01, 2023 in Announcements / Releases

Xuefang Zhao, Ryan Collins, Mark Walker, Phil Darnowsky, Katherine Chao, Harrison Brand, Michael Talkowski, gnomAD SV team

Today, we are thrilled to announce the release of genome-wide structural variants (SVs) for 63,046 unrelated samples with genome sequencing…

Rare coding CNVs from exome sequenced individuals in gnomAD v4

November 01, 2023 in Announcements / Releases

Jack Fu, Cal Liao, Ryan Collins, Lily Wang, Daniel Ben-Isvy, Harrison Brand, Michael Talkowski, Elissa Alarmani, gnomAD SV team

As part of gnomAD v4, we are excited to include our first gnomAD release of rare (<1% overall site frequency) autosomal coding copy number variants (CNVs) from exome sequencing (ES) in 464,297 individuals. These data are available to explore in the user-friendly gnomAD browser (https://gnomad.broadinstitute.org/), while the complete annotated rare CNV callset can be downloaded directly from the downloads page.

Advancing the AI/ML-Readiness of gnomAD Data with GA4GH Genomic Knowledge Standards

November 01, 2023 in Announcements

Alex Wagner, Wesley Goar, Kyle Ferriter, Daniel Marten, Kristen Laricchia, Katherine Chao, Larry Babb

Overview

We have expanded our representation of gnomAD v4 data to include data specifications from the Genomic Knowledge Standards Work…

Variant Co-occurrence Counts by Gene in gnomAD

March 14, 2023 in Announcements / Releases

Sarah Stenton, Phil Darnowsky, Kaitlin Samocha, Anne O’Donnell-Luria

Today we are pleased to announce the incorporation of cumulative counts of gnomAD individuals carrying pairs of rare co-occurring variants within genes in the gnomAD v2 browser, across various allele frequencies and functional consequences. These counts can be used to evaluate how frequently rare variant co-occurrence is observed in a large reference population. We envision that this data will aid the medical genetics community in interpreting the clinical significance of rare co-occurring variants found in patients, in the context of autosomal recessive disease. This feature builds off of our variant co-occurrence (inferred phasing) work (see “Variant Co-Occurrence (Phasing) Information in gnomAD”).

gnomAD Selected as Global Core Biodata Resource

December 15, 2022 in Announcements

Samantha Baxter

We are excited to announce that gnomAD has been included in the inaugural list of Global Core Biodata Resources (GCBR). GCBRs are selected…

The Addition of a Genomic Constraint Metric to gnomAD

October 24, 2022 in Announcements

Siwei Chen, Konrad Karczewski, Riley Grant

Overview

A genomic constraint metric is now available on the gnomAD browser. We quantify the depletion of variation (constraint) at a 1kb…

The Addition of Short Tandem Repeat Calls to gnomAD (v3.1.3)

January 21, 2022 in Announcements

Ben Weisburd, Grace VanNoy, Nick Watts

Overview

We ran ExpansionHunter [Dolzhenko 2019] on 18,511 whole genome samples from gnomAD v3.1 to generate calls for 60 disease…

Local Ancestry Inference for Latino/Admixed American Samples in gnomAD

December 01, 2021

Michael Wilson, Grace Tiao, Elizabeth Atkinson

We are happy to share, for the first time, local ancestry-informed frequency data for 14,804,207 bi-allelic SNPs within the Latino/Admixed American sample of gnomAD v3.1 (n=7,612). This initial release of gnomAD local ancestry inference (LAI) data contains estimated alternate allele counts, allele numbers, and allele frequencies partitioned by continental ancestries for variants in the Latino/Admixed American population. The samples classified as Latino/Admixed American by gnomAD’s global ancestry inference have highly heterogeneous, admixed ancestry,¹ and local ancestry inference resolves these individuals’ admixed ancestry into Amerindigenous, African, and European haplotypes.

gnomAD v3.1.2 minor release

October 22, 2021

Julia Goodrich, Katherine Chao, Mary Yohannes, Zan Koenig, Alicia Martin, Grace Tiao

In this minor release of gnomAD v3.1, we include the following improvements to the previous release: a fix to the homozygous alternate allele depletion adjustment that was made for v3.1 and several updates to the gnomAD v3.1 Human Genome Diversity Project (HGDP) and 1000 Genomes (1KG) subset release.

Using the gnomAD genetic ancestry principal components analysis loadings and random forest classifier on your dataset

October 15, 2021

Julia Goodrich, gnomAD Production Team

By popular request, we are now releasing the genetic ancestry principal components analysis (PCA) variant loadings and accompanying random forest (RF) model used for genetic ancestry group inference in gnomAD v2 and v3. This post discusses how those files were generated and how they can be used on another dataset. However, the use of these resources will not be appropriate for all datasets, and therefore we are including a discussion of the caveats associated with using these loadings and the RF model.

Variant Co-Occurrence (Phasing) Information in gnomAD

August 12, 2021 in Announcements / Releases

Michael Guo, Laurent Francioli, Nick Watts, Julia Goodrich

Today, we are pleased to announce the incorporation of variant co-occurrence (inferred phasing) information in the gnomAD v2 browser. Phase refers to the genetic relationship between a pair of variants; that is, whether the variants are on the same copy of the gene (cis) or on different copies of the gene (trans). We are releasing inferred phasing data for all pairs of variants within a gene where both variants have a global allele frequency in gnomAD exomes <5% and are either coding, flanking intronic (from position -1 to -3 in acceptor sites, and +1 to +8 in donor sites) or in the 5’/3’ UTRs. This encompasses 20,921,100 pairs of variants across 19,685 genes. We envision that this data will be of tremendous help to the medical genetics community in identifying and interpreting co-occurring variants in the context of recessive conditions.

gnomAD v3.1 Mitochondrial DNA Variants Manuscript

July 26, 2021 in Announcements

Sarah E. Calvo

We are happy to announce a manuscript that describes our pipeline to call mitochondrial DNA variants in gnomAD v3, released in November 202…

gnomAD v3.1 Mitochondrial DNA Variants

November 17, 2020 in Announcements / Releases

Kristen Laricchia, Sarah E. Calvo

Overview

Mitochondrial DNA (mtDNA) variants for gnomAD are now available for the first time! We have called mtDNA variants for 56,434 whole genome samples in the v3.1 release. This initial release includes population frequencies for 10,850 unique mtDNA variants defined at more than half of all mtDNA bases. The vast majority of variant calls (98%) are homoplasmic or near homoplasmic, whereas 2% are heteroplasmic. Variation in mitochondrial genomes contributes to many human diseases and has had unique value in the study of human evolutionary genetics. We hope that the addition of mtDNA to gnomAD will enable researchers to better understand the role of mtDNA variation in both health and disease states.

Previous gnomAD callsets have not included mtDNA variants because their properties do not fit the assumptions that we use with our nuclear variant calling pipeline. These properties include:

gnomAD v3.1 New Content, Methods, Annotations, and Data Availability

October 29, 2020 in Announcements / Releases

Grace Tiao, Julia Goodrich

We’re proud to announce the gnomAD v3.1 release of 759,302,267 short nuclear variants (644,267,978 passing variant quality filters) observed in 76,156 genome samples.

In this release, we have included more than 3,000 new samples specifically chosen to increase the ancestral diversity of the resource. As a result, this is the first release for which we have a designated population label for samples of Middle Eastern ancestry, and we are thrilled to be able to include these in the following population breakdown for the v3.1 release:

Population	Description	Genomes
afr	African/African American	20,744
ami	Amish	456
amr	Latino/Admixed American	7,647
asj	Ashkenazi Jewish	1,736
eas	East Asian	2,604
fin	Finnish	5,316
nfe	Non-Finnish European	34,029
mid	Middle Eastern	158
sas	South Asian	2,419
oth	Other (population not assigned)	1,047
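The per-population counts in the table above can be checked against the stated total of 76,156 genomes and converted to proportions. The short sketch below does this; the variable names are illustrative only.

```python
# Genome counts per population label, copied from the v3.1 table above.
genomes = {
    "afr": 20744,
    "ami": 456,
    "amr": 7647,
    "asj": 1736,
    "eas": 2604,
    "fin": 5316,
    "nfe": 34029,
    "mid": 158,
    "sas": 2419,
    "oth": 1047,
}

total = sum(genomes.values())
# Fraction of all v3.1 genomes contributed by each population label.
share = {pop: count / total for pop, count in genomes.items()}

for pop, count in sorted(genomes.items(), key=lambda item: -item[1]):
    print(f"{pop}\t{count}\t{share[pop]:.1%}")
print(f"total\t{total}")
```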

gnomAD v3.1

October 29, 2020 in Announcements / Releases

gnomAD Production Team

Today, the gnomAD Production Team is proud to announce the release of gnomAD v3.1, an update to our previous genome release. The v3.1 data set adds 4,454 genomes, bringing the total to 76,156 whole genomes mapped to the GRCh38 reference sequence. (Our most recent exome release is available in gnomAD v2.1.)

Despite the minor numbering of this release, we bring you an update filled with firsts.

For the first time, we:

Provide individual genotypes in addition to variant calls for a subset of gnomAD. This highly diverse subset includes new data from >60 distinct populations from Africa, Europe, the Middle East, South and Central Asia, East Asia, Oceania, and the Americas
Provide and display data from samples of Middle Eastern ancestry
Display read data visualizations for non-coding variants—an effort that required the generation of visualizations for over 2.5 billion genotypes observed in this release
Display manual curations for predicted loss-of-function variants on the gnomAD browser
Generate the dataset by incrementally adding new samples onto an already-existing callset, eliminating the time and cost typically required to re-call existing samples
Make all gnomAD data—for this release as well as previous releases—freely available for download or export on three cloud providers: Amazon Web Services, Microsoft Azure, and Google Cloud

And we’re currently polishing up the final touches on our first-ever mitochondrial variant release on v3.1, which will be coming very soon.

Loss-of-Function Curations in gnomAD

October 29, 2020 in Announcements / Releases

Moriel Singer-Berk, Anne O'Donnell-Luria

Today we are pleased to announce the incorporation of manual loss-of-function (LoF) curations into the gnomAD v2.1.1 browser. As of this release, we have curated all homozygous pLoFs and a small set of recessive genes (e.g., GAA, GLA, IDUA, SMPD1, GBA, FIG4, MCOLN1, AP4B1, AP4M1, AP4S1, and AP4E1). These curations were performed for multiple projects including the recently published work, Karczewski et al. 2020 Nature, as well as other gene-specific projects. We are so excited to start sharing this data with you that we are including it in the gnomAD v3.1 release announcement, although for now these curations are a gnomAD v2.1.1 feature. More datasets will be added to the browser as they are completed.

Open access to gnomAD data on multiple cloud providers

October 29, 2020 in Announcements

Grace Tiao

We’re very pleased to announce that gnomAD data is now available as a free public dataset on Amazon Web Services, Microsoft Azure, and Google Cloud. Researchers may download and read gnomAD data for free in all regions from all three cloud providers.

From our beginnings as a project, we have been committed to making gnomAD data as free and accessible to the world as possible. Working in partnership with Amazon, Microsoft, and Google’s public data hosting programs, we have expanded the number of cloud platforms on which gnomAD data is fully free to access. Researchers will no longer need to maintain personal copies of gnomAD data on these cloud platforms, eliminating long-term storage costs as well as transfer fees associated with copying gnomAD data into private cloud storage.

Requester-Pays Notice to Users

July 09, 2020 in Announcements

gnomAD Production Team

Last month the gnomAD project was billed thousands of dollars in cloud egress charges—above and beyond our normal expected costs—for users who were accessing Hail-formatted public gnomAD data. The vast majority of this excess cost was due to users spinning up machines in international regions and reading data from our US-region storage bucket.

As a result, we have decided to move gnomAD Hail tables and matrix tables to a requester-pays bucket, while keeping the VCFs and smaller public files free to download as usual. We decided to do this for the following reasons:

From our beginnings as a project, we have been committed to making gnomAD data as free and accessible to the world as humanly possible. We pay for each VCF download of our data, and we have resisted proposals to add gating mechanisms (such as click-through agreements) to our data. We want to reaffirm our commitment to our users by continuing to make VCFs free to download to our growing user base.
However, to maintain gnomAD, we must keep costs as low as possible and fund aspects of gnomAD that benefit the widest user base. Providing free access to the Hail-formatted versions of the data is very costly and benefits only a small proportion of our user base—those running cloud pipelines on the data. Therefore, we have decided to require users to supply Google Cloud billing information when they access Hail versions of gnomAD.
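For readers unfamiliar with requester-pays buckets, the sketch below shows one way to assemble a `gsutil` command that bills a copy to the user's own Google Cloud project through the top-level `-u` option. The project ID and bucket path are placeholders rather than real gnomAD locations, and the helper function is hypothetical.

```python
import shlex


def requester_pays_copy_command(billing_project, source_uri, destination):
    """Build a gsutil command that copies from a requester-pays bucket.

    `-u` names the project to bill for the request; `-r` copies recursively,
    since Hail tables are stored as directories.
    """
    if not source_uri.startswith("gs://"):
        raise ValueError("source_uri must be a gs:// URI")
    return ["gsutil", "-u", billing_project, "cp", "-r", source_uri, destination]


# Placeholder values: substitute your own project ID and the real table path.
command = requester_pays_copy_command(
    "my-billing-project",
    "gs://example-requester-pays-bucket/path/to/table.ht",
    "./table.ht",
)
print(shlex.join(command))
```

Running the workload in the same region as the bucket avoids the cross-region egress charges described above.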

The gnomAD Papers

May 27, 2020 in Announcements

Daniel MacArthur

Originally published on the MacArthur Lab blog.

It is an absolute pleasure to announce the official release of the gnomAD manuscript package. In a set of seven papers, published in Nature, Nature Medicine, and Nature Communications, we describe a wide variety of different approaches to exploring and understanding the patterns of genetic variation revealed by exome and genome sequences from 141,456 humans.

Publication announcements always feel a little strange in this new era of open science. In our case, the underlying gnomAD data set has been fully publicly available for browsing and downloading since October 2016, and we’ve had the preprints available online since early 2019. However, it’s undeniable that there is something deeply gratifying about seeing these pieces of science revealed in their final, concrete form.

For me this package has a particular significance – it represents the culmination of seven and a half years of work with a phenomenal team at the Broad Institute, and marks my transition to a new role in Australia, and the handover of the gnomAD project to new leadership. So I wanted to spend some time in this post reflecting on the history of the project that became gnomAD, the people who’ve made it possible, and where things will go from here.

gnomAD v3.0

October 16, 2019 in Announcements / Releases

Laurent Francioli, Daniel MacArthur

Originally published on the MacArthur Lab blog.

We are thrilled to announce the release of gnomAD v3, a catalog containing 602M SNVs and 105M indels based on the whole-genome sequencing of 71,702 samples mapped to the GRCh38 build of the human reference genome. By increasing the number of whole genomes almost 5-fold from gnomAD v2.1, this release represents a massive leap in analysis power for anyone interested in non-coding regions of the genome or in coding regions poorly captured by exome sequencing.

In addition, gnomAD v3 adds new diversity – for instance, by almost doubling the number of African American samples we had in gnomAD v2 (exomes and genomes combined), and also including our first set of allele frequencies for the Amish population.

Structural variants in gnomAD

March 20, 2019 in Announcements / Releases

Ryan Collins, Harrison Brand, Daniel MacArthur, Mike Talkowski

Originally published on the MacArthur Lab blog.

The first gnomAD structural variant (SV) callset is now available via the gnomAD website and integrated directly into the gnomAD Browser.

This initial gnomAD SV callset includes nearly a half-million distinct SVs across seven SV mutational classes and 13 subclasses of complex SVs detected in 14,891 genomes spanning four major global populations. In the publicly released callset and gnomAD browser, you can find site, frequency, and annotation data for ~445k SVs from 10,738 unrelated genomes with appropriate consent to allow the release of this information.

In this post we summarize how we created this new callset, and some important practical considerations when using it. You can get more details, including callset generation and analyses, in the full gnomAD-SV preprint available on bioRxiv.

gnomAD v2.1

October 17, 2018 in Announcements / Releases

Laurent Francioli, Grace Tiao, Konrad Karczewski, Matthew Solomonson, Nick Watts

Originally published on the MacArthur Lab blog.

We are delighted to announce the release of gnomAD v2.1! This new release of gnomAD is based on the same underlying callset as gnomAD v2.0.2, but has the following improvements and new features:

An awesome new browser
Per-gene loss-of-function constraint
Improved sample and variant filtering processes
Allele frequencies in sub-continental populations in Europe and East Asia
Allele frequencies computed for the following subsets of the data:
- Controls-only (no cases from common disease case/control studies)
- Samples not assessed for a neurological phenotype
- Samples that were not part of a cancer cohort
- Samples that are not part of the Trans-Omics for Precision Medicine (TOPMed)-BRAVO dataset
New annotations for each variant
- Filtering allele frequency using Poisson 95% and 99% CI, per population
- Age histogram of heterozygous and homozygous carriers
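To give a sense of what a Poisson-based filtering allele frequency looks like, the sketch below finds the largest allele frequency at which the observed allele count would still exceed the Poisson upper bound at the chosen confidence level. This is one common formulation, written here as an assumption for illustration; gnomAD's production implementation may differ in details, and the function names are hypothetical.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summed in log space for stability."""
    if lam == 0.0:
        return 1.0
    return sum(
        math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
        for i in range(k + 1)
    )


def filtering_af(ac, an, ci=0.95):
    """Largest AF for which P(X <= ac - 1 | lam = an * AF) is still >= ci.

    ac: observed allele count; an: allele number (chromosomes genotyped).
    """
    if ac <= 0 or an <= 0:
        return 0.0
    low, high = 0.0, float(ac)  # the solution always lies below lam = ac
    for _ in range(100):        # bisection on the expected count lam
        mid = (low + high) / 2.0
        if poisson_cdf(ac - 1, mid) >= ci:
            low = mid
        else:
            high = mid
    return low / an


# Hypothetical variant: 10 alternate alleles observed among 100,000 chromosomes.
for level in (0.95, 0.99):
    print(level, filtering_af(10, 100_000, ci=level))
```

The stricter 99% level yields a lower filtering allele frequency than the 95% level, and both are below the raw frequency AC/AN.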

gnomAD v2.1 comprises a total of 16 million SNVs and 1.2 million indels from 125,748 exomes, and 229 million SNVs and 33 million indels from 15,708 genomes. In addition to the 7 populations already present in gnomAD v2.0.2, this release now breaks down the non-Finnish European and East Asian populations further into sub-populations. The population breakdown is detailed below.

The genome Aggregation Database (gnomAD)

February 27, 2017 in Announcements / Releases

Konrad Karczewski, Laurent Francioli

Originally published on the MacArthur Lab blog.

Today, we are pleased to announce the formal release of the genome aggregation database (gnomAD). This release comprises two callsets: exome sequence data from 123,136 individuals and whole genome sequencing from 15,496 individuals. Importantly, in addition to an increased number of individuals from each of the populations in ExAC, we now provide allele frequencies across over 5,000 Ashkenazi Jewish (ASJ) individuals.