This track shows multiple alignments of 90 human genomes generated by the Minigraph-Cactus pangenome pipeline, which creates pangenomes directly from whole-genome alignments. This method builds graphs containing all forms of genetic variation while allowing use of current mapping and genotyping tools.
In full and pack display modes, conservation scores are displayed as a wiggle track (histogram) in which the height reflects the size of the score. The conservation wiggles can be configured in a variety of ways to highlight different aspects of the displayed information. Click the Graph configuration help link for an explanation of the configuration options.
Pairwise alignments of each species to the human genome are displayed below the conservation histogram as a grayscale density plot (in pack mode) or as a wiggle (in full mode) that indicates alignment quality. In dense display mode, conservation is shown in grayscale using darker values to indicate higher levels of overall conservation as scored by phastCons.
Checkboxes on the track configuration page allow selection of the species to include in the pairwise display. Note that excluding species from the pairwise display does not alter the conservation score display.
To view detailed information about the alignments at a specific position, zoom the display in to 30,000 or fewer bases, then click on the alignment.
The Display chains between alignments configuration option enables display of gaps between alignment blocks in the pairwise alignments in a manner similar to the Chain track display. The following conventions are used:
Discontinuities in the genomic context (chromosome, scaffold or region) of the aligned DNA in the aligning species are shown as follows:
When zoomed in to the base-level display, the track shows the base composition of each alignment. The numbers and symbols on the Gaps line indicate the lengths of gaps in the human sequence at those alignment positions relative to the longest non-human sequence. If there is sufficient space in the display, the size of the gap is shown. If the space is insufficient and the gap size is a multiple of 3, a "*" is displayed; other gap sizes are indicated by "+".
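The Gaps line rule described above can be sketched in a few lines of Python. This is an illustration only, not the browser's drawing code: the function name gap_label and the columns_available parameter are hypothetical, and "sufficient space" is assumed here to mean that the decimal gap size fits in the given number of character columns.

```python
def gap_label(gap_size: int, columns_available: int) -> str:
    """Return the text shown on the Gaps line for a single gap.

    Assumption: the display has room for the gap size when its decimal
    representation fits in `columns_available` character columns.
    """
    text = str(gap_size)
    if len(text) <= columns_available:
        # Enough room: show the gap size itself.
        return text
    # Not enough room: "*" marks a multiple of 3, "+" marks anything else.
    return "*" if gap_size % 3 == 0 else "+"


if __name__ == "__main__":
    for size, cols in [(12, 2), (12, 1), (10, 1), (3, 1)]:
        print(size, cols, gap_label(size, cols))
```

The multiple-of-3 marker distinguishes gaps whose length is a whole number of codons from gaps that would shift the reading frame.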
The MAF was obtained from the HPRC v1.0 Minigraph-Cactus HAL file (renamed to replace all "." characters in sample names with "#" using halRenameGenomes) using cactus v2.6.4 as follows:
cactus-hal2maf ./js ./hprc-v1.0-mc-grch38.hal hprc-v1.0-mc-grch38.maf.gz --noAncestors --refGenome GRCh38 --filterGapCausingDupes --chunkSize 100000 --batchCores 96 --batchCount 10 --noAncestors --batchParallelTaf 32 --batchSystem slurm --logFile hprc-v1.0-mc-grch38.maf.gz.log
zcat hprc-v1.0-mc-grch38.maf.gz | mafDuplicateFilter -m - -k | bgzip > hprc-v1.0-mc-grch38-single-copy.maf.gz
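One way to spot-check that a filtered MAF is single-copy is to count the rows per genome in each alignment block. The Python sketch below does only that; it does not reproduce how mafDuplicateFilter chooses which row to keep. It assumes that each "s" line names its source as genome.sequence, so that the text before the first "." identifies the genome. The helper names and the sample and contig names in the example are hypothetical.

```python
from collections import Counter


def genome_of(src: str) -> str:
    """Return the genome part of a MAF src field (assumes 'genome.sequence')."""
    return src.split(".", 1)[0]


def duplicated_genomes(maf_lines):
    """Return [(block_index, {genome: row_count})] for blocks in which
    some genome contributes more than one 's' row."""
    blocks = []
    counts = None
    for raw in maf_lines:
        fields = raw.split()
        if not fields:
            continue  # blank lines only separate blocks
        if fields[0] == "a":
            # An 'a' line opens a new alignment block.
            if counts is not None:
                blocks.append(counts)
            counts = Counter()
        elif fields[0] == "s" and counts is not None:
            counts[genome_of(fields[1])] += 1
    if counts is not None:
        blocks.append(counts)
    report = []
    for index, block in enumerate(blocks):
        dupes = {genome: n for genome, n in block.items() if n > 1}
        if dupes:
            report.append((index, dupes))
    return report


# Hypothetical two-block MAF; the first block has two rows from one genome.
EXAMPLE = """##maf version=1
a
s GRCh38.chr1 100 5 + 248956422 ACGTA
s HG00001#1.contigA 40 5 + 5000 ACGTA
s HG00001#1.contigB 90 5 + 7000 ACGTT

a
s GRCh38.chr1 105 4 + 248956422 GGCC
s HG00001#1.contigA 45 4 + 5000 GGCC
"""

if __name__ == "__main__":
    print(duplicated_genomes(EXAMPLE.splitlines()))
```

For a compressed file, the same function could be given the lines of a handle opened with gzip.open(path, "rt"). An empty result would indicate that no block contains more than one row for any genome.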
Thank you to Glenn Hickey for providing the HAL file from the HPRC project.
Liao WW, Asri M, Ebler J, Doerr D, Haukness M, Hickey G, Lu S, Lucas JK, Monlong J, Abel HJ et al. A draft human pangenome reference. Nature. 2023 May;617(7960):312-324. DOI: 10.1038/s41586-023-05896-x; PMID: 37165242; PMC: PMC10172123
Hickey G, Monlong J, Ebler J, Novak AM, Eizenga JM, Gao Y, Human Pangenome Reference Consortium, Marschall T, Li H, Paten B. Pangenome graph construction from genome alignments with Minigraph-Cactus. Nat Biotechnol. 2023 May 10. DOI: 10.1038/s41587-023-01793-w; PMID: 37165083; PMC: PMC10638906
Armstrong J, Hickey G, Diekhans M, Fiddes IT, Novak AM, Deran A, Fang Q, Xie D, Feng S, Stiller J et al. Progressive Cactus is a multiple-genome aligner for the thousand-genome era. Nature. 2020 Nov;587(7833):246-251. DOI: 10.1038/s41586-020-2871-y; PMID: 33177663; PMC: PMC7673649
Paten B, Earl D, Nguyen N, Diekhans M, Zerbino D, Haussler D. Cactus: Algorithms for genome multiple sequence alignment. Genome Res. 2011 Sep;21(9):1512-28. DOI: 10.1101/gr.123356.111; PMID: 21665927; PMC: PMC3166836