We have released gnomAD v4.1, an update to our latest major release. This update fixes the allele number issue in gnomAD v4.0 previously…
gnomAD v4.1
April 19, 2024 in Announcements / Release
The news page highlights new features, versions, or other major announcements. See our changelog for all changes to gnomAD, including minor ones.
Today, we are delighted to announce the release of gnomAD v4, which includes data from 807,162 total individuals. This release is nearly 5x…
A critical component of the medical and functional interpretation of genetic variants is the accurate estimation of their frequency. A…
By popular request, we are now releasing the genetic ancestry principal components analysis (PCA) variant loadings and accompanying random forest (RF) model used for genetic ancestry group inference in gnomAD v2 and v3. This post discusses how those files were generated and how they can be used on another dataset. However, the use of these resources will not be appropriate for all datasets, and therefore we are including a discussion of the caveats associated with using these loadings and the RF model.
Today, the gnomAD Production Team is proud to announce the release of gnomAD v3.1, an update to our previous genome release. The v3.1 data set adds 4,454 genomes, bringing the total to 76,156 whole genomes mapped to the GRCh38 reference sequence. (Our most recent exome release is available in gnomAD v2.1.)
Despite the minor numbering of this release, we bring you an update filled with firsts.
For the first time, we:
We are also putting the final touches on our first-ever mitochondrial variant release on v3.1, which will be coming very soon.
Last month the gnomAD project was billed thousands of dollars in cloud egress charges—above and beyond our normal expected costs—for users who were accessing Hail-formatted public gnomAD data. The vast majority of this excess cost was due to users spinning up machines in international regions and reading data from our US-region storage bucket.
As a result, we have decided to move gnomAD Hail tables and matrix tables to a requester-pays bucket, while keeping the VCFs and smaller public files free to download as usual. We decided to do this for the following reasons:
From our beginnings as a project, we have been committed to making gnomAD data as free and accessible to the world as humanly possible. We pay for each VCF download of our data, and we have resisted proposals to add gating mechanisms (such as click-through agreements) to our data. We want to reaffirm our commitment to our users by continuing to make VCFs free to download to our growing user base.
However, to maintain gnomAD, we must keep costs as low as possible and fund aspects of gnomAD that benefit the widest user base. Providing free access to the Hail-formatted versions of the data is very costly and benefits only a small proportion of our user base—those running cloud pipelines on the data. Therefore, we have decided to require users to supply Google Cloud billing information when they access Hail versions of gnomAD.
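For users who run cloud pipelines, the practical effect is that each request to the requester-pays bucket must name a Google Cloud project to bill. The snippet below is a minimal sketch of one way to do that, assuming `gsutil` is installed and authenticated; it builds a copy command using `gsutil`'s top-level `-u` option, which specifies the project to be billed. The helper name, the project ID, and both bucket paths are hypothetical placeholders, not actual gnomAD locations.

```python
import shlex


def requester_pays_copy_command(billing_project, source, destination):
    """Build a gsutil command that copies from a requester-pays bucket.

    All arguments are illustrative: substitute your own billing project
    and the real source and destination paths.
    """
    if not billing_project:
        raise ValueError("a billing project is required for requester-pays buckets")
    # -u names the project billed for the request; -m runs the copy in parallel;
    # cp -r copies recursively (Hail tables are stored as directories).
    return ["gsutil", "-u", billing_project, "-m", "cp", "-r", source, destination]


# Hypothetical values, shown only to illustrate the shape of the command.
cmd = requester_pays_copy_command(
    "my-billing-project",
    "gs://example-requester-pays-bucket/path/to/table.ht",
    "gs://my-own-bucket/table.ht",
)
print(shlex.join(cmd))
```

Copying a table once into a bucket in the same region as your compute, rather than reading it repeatedly across regions, is one way to keep the egress charges described above small.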