This track shows approximately 4.5 million single nucleotide variants (SNVs) and 0.6 million short insertions/deletions (indels) from 7 different parent/child trios, as produced by the International Genome Sample Resource (IGSR) from sequence data generated by the 1000 Genomes Project in its Phase 3 sequencing of 2,504 genomes from 26 populations worldwide.
Variants were called on the autosomes (chromosomes 1 through 22) and on the pseudoautosomal regions (PARs) of chromosome X. Therefore, this track has no annotations on alternate haplotype sequences, fix patches, chromosome Y, or the non-PAR portion (the majority) of chromosome X.
The variant genotypes have been phased (i.e., the two alleles of each diploid genotype have been assigned to two haplotypes, one inherited from each parent). This information allows us to illustrate which haplotypes in the child have been inherited from which parent.
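To make the idea of phasing concrete, here is a minimal Python sketch of how phased VCF genotypes can be split into two haplotypes. In VCF, a phased genotype uses the `|` separator (for example `0|1`), and alleles on the same side of the separator at consecutive sites belong to the same haplotype. The function name and the sample genotypes below are hypothetical, and the sketch assumes diploid calls with no missing alleles.

```python
from typing import List, Tuple


def split_phased_genotype(gt: str) -> Tuple[int, int]:
    """Split a phased VCF GT string such as '0|1' into its two alleles.

    Assumes a diploid, fully called genotype; unphased genotypes ('0/1')
    are rejected because their allele order carries no haplotype meaning.
    """
    if "|" not in gt:
        raise ValueError(f"genotype {gt!r} is not phased")
    first, second = gt.split("|")
    return int(first), int(second)


# Hypothetical genotypes for one sample at three consecutive sites.
site_genotypes: List[str] = ["0|1", "1|1", "1|0"]

# Alleles before the '|' form one haplotype, alleles after it form the other.
hap_a = [split_phased_genotype(g)[0] for g in site_genotypes]
hap_b = [split_phased_genotype(g)[1] for g in site_genotypes]

print(hap_a)  # [0, 1, 1]
print(hap_b)  # [1, 1, 0]
```

Which side of the separator corresponds to which parent depends on how the phasing was produced, so that mapping should not be assumed from the VCF alone.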
Trios from six different populations are available, including:
This track illustrates the vcfPhasedTrio track type, where two lines, one for each haplotype of the diploid genome, are drawn per sample in the underlying VCF. Variants in the window are then drawn on the line for the haplotype they belong to, such that variants on the same line were likely inherited together. The sorting routine is the same one used to draw the haplotype-sorted display in the non-trio 1000 Genomes track, and is described here.
The child haplotypes are drawn in the center of each group, flanked above and below by parent haplotypes, and variants are sorted to show the transmitted alleles:
parent 1 untransmitted haplotype
parent 1 transmitted haplotype
child haplotype inherited from parent 1
child haplotype inherited from parent 2
parent 2 transmitted haplotype
parent 2 untransmitted haplotype
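The six-line layout above can be sketched in Python as follows. This is an illustration only and is not the browser's own sorting routine: it assumes the child's two haplotypes are already assigned to parent 1 and parent 2, and it guesses each parent's transmitted haplotype as the one with fewer allele mismatches against the corresponding child haplotype. All function names and allele vectors are hypothetical.

```python
from typing import List, Sequence, Tuple

Haplotype = List[int]


def mismatches(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the sites at which two haplotypes carry different alleles."""
    return sum(x != y for x, y in zip(a, b))


def split_transmitted(
    parent: Tuple[Haplotype, Haplotype], child_hap: Haplotype
) -> Tuple[Haplotype, Haplotype]:
    """Return (transmitted, untransmitted) for one parent.

    Heuristic assumption: the transmitted haplotype is the one that
    differs from the child's haplotype at the fewest sites.
    """
    h0, h1 = parent
    if mismatches(h0, child_hap) <= mismatches(h1, child_hap):
        return h0, h1
    return h1, h0


def display_order(
    parent1: Tuple[Haplotype, Haplotype],
    parent2: Tuple[Haplotype, Haplotype],
    child: Tuple[Haplotype, Haplotype],
) -> List[Haplotype]:
    """Arrange the six haplotypes top to bottom with the child in the center.

    child[0] is assumed to come from parent 1 and child[1] from parent 2.
    """
    p1_t, p1_u = split_transmitted(parent1, child[0])
    p2_t, p2_u = split_transmitted(parent2, child[1])
    return [p1_u, p1_t, child[0], child[1], p2_t, p2_u]


# Hypothetical allele vectors over four sites (0 = REF, 1 = ALT).
parent1 = ([0, 1, 0, 1], [1, 1, 0, 0])
parent2 = ([0, 0, 1, 1], [1, 0, 0, 0])
child = ([1, 1, 0, 0], [0, 0, 1, 1])

rows = display_order(parent1, parent2, child)
for row in rows:
    print(row)
```

In this toy example each child haplotype matches one parental haplotype exactly, so the transmitted lines sit directly next to the child lines they were copied from.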
Track configuration options include:
Allele coloring options include:
From the subtrack configure menu, the family order for each trio can be rearranged manually by dragging haplotypes.
Clicking on a variant opens a details page with the standard VCF details, including INFO column annotations, the REF and ALT alleles, and the genotypes from all three samples.
The genomes of 2,504 individuals were sequenced using both whole-genome sequencing (mean depth = 7.4x) and targeted exome sequencing (mean depth = 65.7x). Sequence reads were aligned to the reference genome using alt-aware BWA-MEM (Zheng-Bradley et al.). Variant discovery and quality control were performed as described in Lowy-Gallego et al.
See also:
Trio samples were extracted from both the main 1000 Genomes set and the related samples, using the pedigree information from 1000 Genomes. Variants that were homozygous reference in all three samples of a trio were removed.
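The homozygous-reference filter described above can be sketched in Python as follows. This is not the pipeline that was actually used to build the track; it is a minimal illustration that assumes a tab-separated VCF data line restricted to the three trio samples, with a GT key in the FORMAT column. The function names and the example records are hypothetical.

```python
def is_hom_ref(gt: str) -> bool:
    """True if every allele in the genotype is the reference allele ('0')."""
    alleles = gt.replace("|", "/").split("/")
    return all(allele == "0" for allele in alleles)


def keep_variant(vcf_line: str) -> bool:
    """Keep a VCF data line unless all samples are homozygous reference.

    Columns 1-8 are the fixed VCF fields, column 9 is FORMAT, and the
    remaining columns are per-sample values. Missing genotypes ('.|.')
    are not counted as homozygous reference in this sketch.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    gt_index = fields[8].split(":").index("GT")
    genotypes = [sample.split(":")[gt_index] for sample in fields[9:]]
    return not all(is_hom_ref(gt) for gt in genotypes)


# Hypothetical records: fixed fields, FORMAT, then child, father, mother.
variant_in_trio = "\t".join(
    ["chr1", "1000", ".", "A", "G", ".", "PASS", ".", "GT", "0|1", "0|0", "1|0"]
)
absent_in_trio = "\t".join(
    ["chr1", "2000", ".", "C", "T", ".", "PASS", ".", "GT", "0|0", "0|0", "0|0"]
)

print(keep_variant(variant_in_trio))  # True
print(keep_variant(absent_in_trio))   # False
```

A site can be variable in the full 1000 Genomes call set yet carry only reference alleles in a given trio, which is why this filter reduces each trio's VCF to the sites that are informative for that family.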
Trio VCFs are available for download from our download server.
Thanks to the International Genome Sample Resource (IGSR) for making these variant calls freely available.
Zheng-Bradley X, Streeter I, Fairley S, Richardson D, Clarke L, Flicek P, 1000 Genomes Project Consortium. Alignment of 1000 Genomes Project reads to reference assembly GRCh38. Gigascience. 2017 Jul 1;6(7):1-8. PMID: 28531267; PMC: PMC5522380
Fairley S, Lowy-Gallego E, Perry E, Flicek P. The International Genome Sample Resource (IGSR) collection of open human genomic variation resources. Nucleic Acids Res. 2019 Oct 4. PMID: 31584097
Lowy-Gallego E, Fairley S, Zheng-Bradley X, Ruffier M, Clarke L, Flicek P, 1000 Genomes Project Consortium. Variant calling on the GRCh38 assembly with the data from phase three of the 1000 Genomes Project [version 1; peer review: 2 not approved]. Wellcome Open Research. 2019 Mar 11.
1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, Korbel JO, Marchini JL, McCarthy S, McVean GA et al. A global reference for human genetic variation. Nature. 2015 Oct 1;526(7571):68-74. PMID: 26432245