To expand the number of publicly available control de novo variants, we have released 1,953 coding de novo variants (DNVs) called from 1,517 trios in the gnomAD v4.1 exomes.

We generated these calls by adapting Hail’s hl.de_novo method and Kaitlin Samocha’s de novo caller. We retained only variants that were outside low-confidence regions, did not have a * alt allele, passed variant QC, had coding consequences, and passed gnomAD v4.1 exomes allele frequency and callset allele count filters. We then kept only variants with high or medium confidence of being true de novos, which resulted in a high-quality set of 1,953 coding DNVs. The observed number of coding DNVs per proband (1,953 / 1,517 ≈ 1.29 per exome) aligns with expected rates (Kaplanis & Samocha et al., Nature 2020).
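The filtering steps and the per-proband rate described above can be illustrated with a minimal plain-Python sketch. This is not the gnomAD pipeline code (which runs on Hail; see the repositories mentioned below). The record type, its field names, and the "HIGH"/"MEDIUM"/"LOW" confidence labels are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateDNV:
    """Hypothetical candidate record; field names are illustrative, not the gnomAD schema."""
    alt: str
    in_low_confidence_region: bool
    passed_variant_qc: bool
    is_coding: bool
    passed_frequency_filters: bool  # stands in for the AF and callset AC filters
    confidence: str  # assumed labels: "HIGH", "MEDIUM", "LOW"


def keep(v: CandidateDNV) -> bool:
    """Return True if the candidate passes every filter listed in the text."""
    return (
        not v.in_low_confidence_region
        and v.alt != "*"
        and v.passed_variant_qc
        and v.is_coding
        and v.passed_frequency_filters
        and v.confidence in {"HIGH", "MEDIUM"}
    )


def dnvs_per_proband(n_dnvs: int, n_trios: int) -> float:
    """Average number of DNVs per proband (one proband per trio)."""
    return n_dnvs / n_trios


# Toy candidates (made up): only the first passes every filter.
candidates = [
    CandidateDNV("T", False, True, True, True, "HIGH"),
    CandidateDNV("*", False, True, True, True, "HIGH"),   # star alt allele
    CandidateDNV("G", False, True, True, True, "LOW"),    # low confidence
    CandidateDNV("A", True, True, True, True, "MEDIUM"),  # low-confidence region
]
kept = [v for v in candidates if keep(v)]
print(len(kept))

# Rate reported in the text: 1,953 coding DNVs across 1,517 trios.
print(round(dnvs_per_proband(1953, 1517), 2))  # 1.29
```

Expressing the filters as a single predicate keeps each criterion visible and easy to toggle when comparing against a differently filtered callset.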

We are continuing to refine our approach by integrating parent-specific de novo priors and improving our adjustments for false homozygous reference genotypes in the proband. We hope to release an updated dataset soon.

For more details on our de novo detection methods, visit our gnomad_qc GitHub repository and gnomad_methods GitHub repository.

To download the DNV dataset, visit our downloads page.