Ensembl gene annotations of the HPRC assemblies, version: 2022_08
on the 11 Jun 2021 Homo sapiens/GCA_018852595.1_HG02145.alt.pat.f1_v2 genome assembly.
Gene count: 227,332; Bases covered: 1,668,997,763 (145,137,615 bases in exons only)
Ensembl annotation of the human assemblies has been produced via a new
mapping pipeline:
A subset of the GENCODE 38 genes and transcripts was annotated on each
of the haploid assemblies. The subset excludes readthrough genes and genes
on patches or haplotypes. For each gene, anchor sequences built from the
surrounding region were used to locate the most likely corresponding
region(s) in the target genome. A pairwise alignment of the reference and
target regions was then carried out and used to map the exon coordinates and
other features of the gene. In addition to the primary mapping, potential
recent duplications and collapsed paralogues were identified by aligning
canonical transcripts across the entire genome and searching for new
mappings that did not overlap existing annotations. For more details on the
annotation process, please refer to the preprint
"A Draft Human Pangenome Reference"
(see "Methods" section: "Ensembl Mapping Pipeline for Assembly Annotation").
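The step in which a pairwise alignment is used to map exon coordinates can be illustrated with a minimal sketch. This is not the Ensembl pipeline code: the function names, the gapped-string alignment representation, and the 0-based half-open coordinates are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple


def build_ref_to_target_map(ref_aln: str, tgt_aln: str,
                            ref_start: int = 0,
                            tgt_start: int = 0) -> Dict[int, int]:
    """Map each aligned reference position to its target position.

    ref_aln and tgt_aln are the two rows of a pairwise alignment,
    with '-' marking gaps (hypothetical input format).
    """
    if len(ref_aln) != len(tgt_aln):
        raise ValueError("alignment rows must have equal length")
    mapping: Dict[int, int] = {}
    r, t = ref_start, tgt_start
    for a, b in zip(ref_aln, tgt_aln):
        if a != "-" and b != "-":
            mapping[r] = t          # column aligns a ref base to a target base
        if a != "-":
            r += 1                  # consume one reference base
        if b != "-":
            t += 1                  # consume one target base
    return mapping


def map_exon(mapping: Dict[int, int], start: int,
             end: int) -> Optional[Tuple[int, int]]:
    """Project a half-open reference exon [start, end) onto the target.

    Returns None when no base of the exon is aligned.
    """
    hits = [mapping[p] for p in range(start, end) if p in mapping]
    if not hits:
        return None
    return min(hits), max(hits) + 1


if __name__ == "__main__":
    m = build_ref_to_target_map("ACGT-ACGT", "AC-TTACGT",
                                ref_start=100, tgt_start=200)
    print(map_exon(m, 104, 108))
```

A real pipeline would also have to handle strand, partially mapped exons, and coverage or identity thresholds, none of which this sketch attempts.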
Ensembl Human Pangenome Reference Consortium: https://projects.ensembl.org/hprc/
The bigGenePred file in this assembly hub can be obtained from: https://hgdownload.soe.ucsc.edu/hubs/GCA/018/852/595/GCA_018852595.1/bbi/GCA_018852595.1_HG02145.alt.pat.f1_v2.ebiGene.bb
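For scripted retrieval, the following sketch rebuilds the download URL above from the assembly accession and fetches the file. The directory layout (accession digits split into groups of three) is inferred only from the single URL shown here and is an assumption, as are the helper names `hub_bigbed_url` and `download`.

```python
import shutil
import urllib.request

HUB_ROOT = "https://hgdownload.soe.ucsc.edu/hubs"


def hub_bigbed_url(accession: str, assembly_name: str,
                   track: str = "ebiGene") -> str:
    """Build the bigBed URL for an assembly hub track.

    The path pattern is an assumption inferred from the one URL
    given in this document.
    """
    prefix, rest = accession.split("_", 1)      # "GCA", "018852595.1"
    digits = rest.split(".", 1)[0]              # "018852595"
    triplets = [digits[i:i + 3] for i in range(0, len(digits), 3)]
    filename = f"{accession}_{assembly_name}.{track}.bb"
    return "/".join([HUB_ROOT, prefix, *triplets, accession, "bbi", filename])


def download(url: str, dest: str) -> None:
    """Stream the remote file to a local path."""
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)


if __name__ == "__main__":
    url = hub_bigbed_url("GCA_018852595.1", "HG02145.alt.pat.f1_v2")
    download(url, "ebiGene.bb")
```

The downloaded bigBed file can then be converted to text with the UCSC `bigBedToBed` utility if a plain bigGenePred table is needed.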
A Draft Human Pangenome Reference
Wen-Wei Liao, Mobin Asri, Jana Ebler, Daniel Doerr, Marina Haukness,
Glenn Hickey, Shuangjia Lu, Julian K. Lucas, Jean Monlong, Haley J. Abel,
Silvia Buonaiuto, Xian H. Chang, Haoyu Cheng, Justin Chu, Vincenza Colonna,
Jordan M. Eizenga, Xiaowen Feng, Christian Fischer, Robert S. Fulton,
Shilpa Garg, Cristian Groza, Andrea Guarracino, William T Harvey,
Simon Heumos, Kerstin Howe, Miten Jain, Tsung-Yu Lu, Charles Markello,
Fergal J. Martin, Matthew W. Mitchell, Katherine M. Munson,
Moses Njagi Mwaniki, Adam M. Novak, Hugh E. Olsen, Trevor Pesout,
David Porubsky, Pjotr Prins, Jonas A. Sibbesen, Chad Tomlinson,
Flavia Villani, Mitchell R. Vollger, Human Pangenome Reference Consortium,
Guillaume Bourque, Mark JP Chaisson, Paul Flicek, Adam M. Phillippy,
Justin M. Zook, Evan E. Eichler, David Haussler, Erich D. Jarvis,
Karen H. Miga, Ting Wang, Erik Garrison, Tobias Marschall, Ira Hall,
Heng Li, Benedict Paten
bioRxiv: 2022.07.09.499321;
doi: https://doi.org/10.1101/2022.07.09.499321