InSiGHT specs 2.0.0

NOTE:
Before using these data, verify that the CSpec version numbers here match the latest version on the ClinGen CSpec registry.

These data are for research purposes only. While the ClinGen data are open to the public, users seeking information about a personal medical or genetic condition are urged to consult with a qualified physician for diagnosis and for answers to personal medical questions.

This track hub displays variant interpretation data from the International Society for Gastrointestinal Hereditary Tumours (InSiGHT) Variant Curation Expert Panel (VCEP) for Lynch syndrome genes: MLH1, MSH2, MSH6, and PMS2.

The hub contains eight complementary tracks that assist with variant classification according to the InSiGHT VCEP specifications:

Description

Clinical Domains Track

This track displays clinically relevant functional domains for the four Lynch syndrome mismatch repair (MMR) genes as defined by the InSiGHT VCEP. While PM1 is not applicable to these genes, these domains are used in the PVS1 decision tree to determine whether a null variant affects a critical functional region.

Domains include:

HCI Priors Track

This track displays HCI (Huntsman Cancer Institute) prior probability predictions for missense variants. These computational predictions, based on MAPP and PolyPhen-2 scores, are used to apply the PP3 (computational evidence supports pathogenicity) or BP4 (computational evidence supports benign) criteria.

Allele Frequencies Track

This track displays ACMG/AMP allele frequency-based evidence codes derived from gnomAD v4.1 exome data. The frequency thresholds are gene-specific as defined by the InSiGHT VCEP specifications.

PVS1 Regions Track

This track displays the PVS1 decision tree regions for each gene, which determine the appropriate strength of the PVS1 criterion for null variants (nonsense, frameshift, canonical splice site, initiation codon, single/multi-exon deletions). The regions are based on nonsense-mediated decay (NMD) predictions and critical functional regions.
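The NMD component of these regions rests on the 50-55 nucleotide rule cited in Methods: a premature termination codon (PTC) is predicted to trigger NMD when it lies more than roughly 50-55 nucleotides upstream of the last exon-exon junction. The following is a minimal sketch of that generic rule only, not the InSiGHT gene-specific codon boundaries. The function name, its parameters, and the assumption that the last exon-exon junction falls inside the coding sequence are all illustrative.

```python
from typing import Sequence


def predicts_nmd(cds_exon_lengths: Sequence[int], ptc_codon: int,
                 min_distance: int = 50) -> bool:
    """Apply the generic 50-55 nt rule to a premature stop codon.

    cds_exon_lengths: coding length of each exon, in transcript order
                      (hypothetical input; assumes the last exon-exon
                      junction lies within the coding sequence).
    ptc_codon:        1-based codon number of the premature stop codon.
    min_distance:     cutoff in nucleotides (50 by default; 55 is the
                      stricter end of the range).
    """
    cds_length = sum(cds_exon_lengths)
    ptc_end = 3 * ptc_codon  # CDS position of the stop codon's third base
    if ptc_codon < 1 or ptc_end > cds_length:
        raise ValueError("ptc_codon lies outside the coding sequence")
    if len(cds_exon_lengths) < 2:
        return False  # no exon-exon junction, so the rule cannot apply
    # CDS position of the last base before the final exon-exon junction
    last_junction = sum(cds_exon_lengths[:-1])
    return last_junction - ptc_end > min_distance
```

With three coding exons of 100, 200, and 150 nt, the last junction follows CDS position 300, so a stop at codon 50 (ending at position 150) is predicted to undergo NMD, while a stop at codon 90 (ending at position 270) is not.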

InSiGHT Curated Variants Track

This track displays variants in Lynch syndrome MMR genes that have been classified by the InSiGHT Variant Curation Expert Panel (VCEP) and submitted to ClinVar. All variants in this track have been "reviewed by expert panel" and represent the official InSiGHT VCEP classifications for Lynch syndrome genes.

Important note: This track specifically contains InSiGHT VCEP expert panel submissions to ClinVar, which represent the highest level of variant curation. These are distinct from other ClinVar submissions and reflect the consensus classification of the InSiGHT expert panel following ACMG/AMP guidelines with gene-specific modifications.

Note on gene symbols: A small number of variants (primarily large deletions and UTR variants) may display a neighboring gene symbol (e.g., FBXO11, AIMP2) instead of the expected MMR gene name. This occurs because ClinVar assigns gene symbols based on the genomic location of the variant, which may extend beyond the MMR gene boundaries. These variants are legitimate InSiGHT classifications for the MMR genes and the variant names correctly reference the MMR gene transcripts.

Functional Assays Track

This track displays functional assay evidence from four published studies on Lynch syndrome mismatch repair (MMR) genes. Functional assay results are used to apply the PS3 (functional studies supportive of a damaging effect) or BS3 (functional studies show no damaging effect) criteria at various evidence strengths. The track combines data from:

Each variant is classified according to ACMG PS3/BS3 criteria using established thresholds. For the CIMRA assays (Drost 2018, Drost 2020, Rath 2022), the Odds for Pathogenicity (OddsPath) score is classified using Tavtigian thresholds. For the deep mutational scanning data (Jia 2021), the loss-of-function (LOF) score is used. Variants that fall between pathogenic and benign thresholds are classified as Indeterminate.

PMS2 Pseudogene Caution Track

PMS2 has a closely related pseudogene, PMS2CL, located ~0.7 Mb centromeric on chromosome 7. PMS2CL shares ~98-99% sequence identity with PMS2 exons 9 and 11-15, which can lead to misalignment of short-read sequencing data and produce pseudogene-derived false-positive (or false-negative) PMS2 variant calls. This track highlights PMS2 exons with PMS2CL homology and the PMS2CL pseudogene region itself, alerting users that variants in these regions may benefit from additional orthogonal validation.

PMS2CL Paralog Variants Track

This track displays the curated PMS2 ClinVar variants from the "InSiGHT Curated Vars" track projected onto the equivalent positions on the PMS2CL pseudogene. It is intended to help analysts who detect a variant call on PMS2CL (from short-read NGS) recognize that it may actually correspond to a known classified PMS2 variant. Each item shows the PMS2CL n. notation, the equivalent PMS2 c. notation and classification, and the PMS2 genomic region for cross-reference.

Display Conventions and Configuration

Clinical Domains Track

Color    Meaning
Fuchsia  Clinically relevant functional domain

Mouseover displays: Domain name, Gene symbol, Transcript (NM accession), Amino acid location

HCI Priors Track

Color         ACMG Code       Meaning
Purple        PP3_moderate    HCI prior probability >0.81 (supports pathogenicity, moderate strength)
Light Purple  PP3_supporting  HCI prior probability 0.68-0.81 (supports pathogenicity, supporting strength)
Light Teal    BP4_supporting  HCI prior probability <0.11 (supports benign, supporting strength)

Mouseover displays: HGVSc notation, HGVSp (protein change), ACMG code, MAPP/PP2 Prior probability value, Classification rule
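The thresholds in the table above can be sketched as a small lookup function. This is an illustration, not the hub's build code: the function name is hypothetical, and treating both ends of the 0.68-0.81 range as inclusive is an assumption, since the table does not say how the boundary values are handled.

```python
from typing import Optional


def classify_hci_prior(prior: float) -> Optional[str]:
    """Return the ACMG code for an HCI prior probability, or None."""
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must be a probability between 0 and 1")
    if prior > 0.81:
        return "PP3_moderate"
    if prior >= 0.68:  # assumption: 0.68 and 0.81 are both inclusive
        return "PP3_supporting"
    if prior < 0.11:
        return "BP4_supporting"
    return None  # 0.11 <= prior < 0.68: the table lists no code
```

Priors between 0.11 and 0.68 return None because the table assigns no PP3 or BP4 code in that range.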

Allele Frequencies Track

Color      ACMG Code       Meaning
Purple     PM2_supporting  Absent or extremely rare (<1 in 50,000) in gnomAD v4.1
Teal       BS1             Allele frequency too high for disorder (gene-specific thresholds; see Methods)
Dark Teal  BA1             Stand-alone benign (gene-specific thresholds; see Methods)

Mouseover displays: HGVSc notation, ACMG code, Classification rule
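As a rough sketch of how these codes could be assigned from a grpmax allele frequency: "1 in 50,000" corresponds to a frequency of 0.00002. The BS1 and BA1 cutoffs are gene-specific and are not reproduced here, so the function below takes them as caller-supplied parameters. The function name, the parameter names, and the choice to treat the BS1/BA1 cutoffs as inclusive are assumptions made for illustration.

```python
from typing import Optional

PM2_MAX_AF = 1 / 50_000  # "<1 in 50,000" expressed as a frequency (0.00002)


def classify_allele_frequency(grpmax_af: Optional[float],
                              bs1_threshold: float,
                              ba1_threshold: float) -> Optional[str]:
    """Assign a frequency-based code; thresholds come from the caller.

    grpmax_af is None when the variant is absent from gnomAD.
    bs1_threshold / ba1_threshold are the gene-specific cutoffs, which
    must be taken from the InSiGHT VCEP specifications (not hard-coded here).
    """
    if not PM2_MAX_AF <= bs1_threshold <= ba1_threshold:
        raise ValueError("expected PM2 cutoff <= BS1 cutoff <= BA1 cutoff")
    if grpmax_af is None:
        return "PM2_supporting"  # absent from gnomAD
    if grpmax_af >= ba1_threshold:  # assumption: cutoff is inclusive
        return "BA1"
    if grpmax_af >= bs1_threshold:  # assumption: cutoff is inclusive
        return "BS1"
    if grpmax_af < PM2_MAX_AF:
        return "PM2_supporting"
    return None  # between the PM2 and BS1 cutoffs: no code applies
```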

PVS1 Regions Track

Color     Region Type  ACMG Code      Meaning
Dark Red  NMD          PVS1           Nonsense-mediated decay region – null variants eligible for full-strength PVS1
Dark Red  CritRegion   PVS1           Critical functional region – null variants eligible for full-strength PVS1
Orange    FuncUnknown  PVS1_Moderate  Function unknown region – null variants eligible for PVS1 at moderate strength
Gray      PVS1_n.a.    PVS1_n.a.      PVS1 not applicable – region beyond functional relevance

Mouseover displays: Region name, Gene symbol, Codon position rule, ACMG code

InSiGHT Curated Variants Track

Color       Classification          Meaning
Red         Pathogenic              Variant is pathogenic for Lynch syndrome
Pink        Likely pathogenic       Variant is likely pathogenic for Lynch syndrome
Dark Blue   Uncertain significance  Variant of uncertain significance (VUS)
Lime Green  Likely benign           Variant is likely benign
Green       Benign                  Variant is benign

Mouseover displays: Variant (HGVS notation), ClinVar ID with link, Classification, Date evaluated

Functional Assays Track

Color       Classification  Meaning
Dark Red    PS3_Strong      Strong evidence of pathogenicity (OddsPath >18.7 or LOF ≥0.4)
Red         PS3_Moderate    Moderate evidence of pathogenicity (OddsPath >4.3 and ≤18.7)
Pink        PS3_Supporting  Supporting evidence of pathogenicity (OddsPath >2.08 and ≤4.3)
Gray        Indeterminate   Score falls between pathogenic and benign thresholds (OddsPath >0.48 and ≤2.08, or LOF ≥0 and <0.4)
Lime Green  BS3_Supporting  Supporting evidence of benign effect (OddsPath >0.05 and ≤0.48)
Green       BS3_Strong      Strong evidence of benign effect (OddsPath ≤0.05 or LOF <0)

Mouseover displays: HGVSc/HGVSp notation, Protein change, Classification, ClinVar ID with link (if available), OddsPath or LOF score, Paper reference with PubMed link
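The cutoffs in the table above translate directly into two small classifiers, one for CIMRA OddsPath values and one for the deep mutational scanning LOF score. The sketch below only restates the table; the function names are illustrative and are not taken from the hub's build scripts.

```python
def classify_oddspath(odds_path: float) -> str:
    """Classify a CIMRA OddsPath value using the cutoffs in the table."""
    if odds_path < 0:
        raise ValueError("OddsPath cannot be negative")
    if odds_path > 18.7:
        return "PS3_Strong"
    if odds_path > 4.3:
        return "PS3_Moderate"
    if odds_path > 2.08:
        return "PS3_Supporting"
    if odds_path > 0.48:
        return "Indeterminate"
    if odds_path > 0.05:
        return "BS3_Supporting"
    return "BS3_Strong"


def classify_lof(lof_score: float) -> str:
    """Classify a Jia 2021 LOF score using the cutoffs in the table."""
    if lof_score >= 0.4:
        return "PS3_Strong"
    if lof_score >= 0:
        return "Indeterminate"
    return "BS3_Strong"
```

Checking the upper bound first means each later branch only needs its lower bound, which keeps the boundary handling (for example, exactly 18.7 falls in PS3_Moderate) aligned with the table.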

PMS2 Pseudogene Caution Track

Color  Caution Level  Meaning
Red    High           PMS2 exon with ≥99% homology to PMS2CL (exons 11-15)
Amber  Moderate       PMS2 exon with ~98% homology to PMS2CL (exon 9)
Green  Safe           PMS2 exon with no PMS2CL homology (exons 1-8, 10)
Gray   Pseudogene     PMS2CL pseudogene region

Mouseover displays: Region name, Homology to PMS2CL, Caution level, and recommendation note
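For analysts who want to apply the same caution levels in their own pipelines, the exon assignments in the table can be expressed as a simple lookup. This is a sketch with a hypothetical function name; it covers only the PMS2 exon rows, not the separate PMS2CL pseudogene region.

```python
def pms2_caution_level(exon: int) -> str:
    """Return the caution level for a PMS2 exon number (1-15)."""
    if not 1 <= exon <= 15:
        raise ValueError("PMS2 exon number must be between 1 and 15")
    if 11 <= exon <= 15:
        return "High"      # >=99% homology to PMS2CL
    if exon == 9:
        return "Moderate"  # ~98% homology to PMS2CL
    return "Safe"          # exons 1-8 and 10: no PMS2CL homology
```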

PMS2CL Paralog Variants Track

Items are colored by the classification of the equivalent PMS2 ClinVar variant, using the same color scheme as the InSiGHT Curated Variants track (red Pathogenic, pink Likely pathogenic, dark blue VUS, lime green Likely benign, green Benign).

Mouseover displays: PMS2CL n. notation, equivalent PMS2 c. notation, PMS2 classification, date evaluated, and the PMS2 genomic region (suitable for copying and pasting into the browser).

Methods

Clinical Domains

Protein domain boundaries were obtained from the InSiGHT VCEP specifications and mapped to genomic coordinates using the canonical transcripts:

HCI Priors

HCI prior probabilities were obtained from the LOVD database and mapped to genomic coordinates. Classification thresholds follow the InSiGHT VCEP specifications:

Allele Frequencies

Allele frequency data were obtained from gnomAD v4.1 exomes. The maximum population group allele frequency (grpmax AF) was used for classification according to InSiGHT VCEP thresholds:

PVS1 Regions

PVS1 decision tree regions were defined based on the InSiGHT VCEP specifications, incorporating nonsense-mediated decay (NMD) predictions based on the 50-55 nucleotide rule, critical functional regions in the 3' portion of genes, and gene-specific codon boundaries:

InSiGHT Curated Variants

Variant data were obtained from ClinVar via the NCBI E-utilities API. Only variants submitted by the International Society for Gastrointestinal Hereditary Tumours (InSiGHT) with "reviewed by expert panel" status were included. These variants represent the official InSiGHT VCEP classifications for Lynch syndrome genes.

Data summary:

Classification breakdown:

Coordinate mapping:

Unmapped variants:

Approximately 150 variants (9%) could not be mapped to hg38 coordinates, and 159 variants (9%) could not be mapped to hg19 coordinates. These are primarily large structural variants (deletions, duplications, complex rearrangements) where ClinVar does not provide precise genomic coordinates.

Data updates:

This track is automatically updated every Tuesday from ClinVar to capture new InSiGHT VCEP submissions. Updates are validated against a 10% item count tolerance before going live. See the makedoc for details on the update pipeline.
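One way to read the item count check is as a relative comparison between the new build and the version currently live. The sketch below assumes the tolerance is symmetric and measured against the previous item count; neither detail is stated here, and the function name is hypothetical. The makedoc remains the authoritative description of the pipeline.

```python
def within_tolerance(previous_count: int, new_count: int,
                     tolerance: float = 0.10) -> bool:
    """Return True if new_count is within +/- tolerance of previous_count.

    Assumption: the check is symmetric and relative to the previous count.
    """
    if previous_count <= 0:
        raise ValueError("previous_count must be positive")
    return abs(new_count - previous_count) <= tolerance * previous_count
```

Under these assumptions, a track that previously held 1,000 items would accept an update with 900 to 1,100 items and hold back anything outside that range for review.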

Functional Assays

Functional assay data were extracted from the supplementary materials of four published studies. The data were mapped to genomic coordinates using the following canonical transcripts:

Variants with cDNA-level annotations (HGVSc) were mapped using the CDS position to determine single-nucleotide genomic coordinates. Variants with only protein-level annotations (HGVSp) were mapped to the 3-nucleotide codon span corresponding to the amino acid position. The name field uses the paper's original transcript version for traceability (e.g., NM_000249.3 for Drost 2018, NM_000249.4 for Rath 2022).
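The two mapping modes described above can be illustrated with a short sketch. It assumes the coding portion of each exon is available as 1-based inclusive genomic intervals in ascending order; the names `cds_to_genomic` and `codon_to_genomic` and the input layout are hypothetical, not the hub's build code. A codon that straddles an exon junction yields non-contiguous genomic positions, which is why the second function returns individual positions rather than a single interval.

```python
from typing import List, Sequence, Tuple

# 1-based inclusive genomic (start, end) of the coding part of one exon
Exon = Tuple[int, int]


def cds_to_genomic(cds_pos: int, cds_exons: Sequence[Exon], strand: str) -> int:
    """Map a 1-based CDS position (HGVS c.N) to a genomic coordinate.

    cds_exons must be sorted by ascending genomic position.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if cds_pos < 1:
        raise ValueError("cds_pos must be >= 1")
    # Walk exons in transcript order: reversed for minus-strand genes
    ordered = list(cds_exons) if strand == "+" else list(reversed(cds_exons))
    remaining = cds_pos
    for start, end in ordered:
        length = end - start + 1
        if remaining <= length:
            if strand == "+":
                return start + remaining - 1
            return end - remaining + 1
        remaining -= length
    raise ValueError("cds_pos lies beyond the end of the coding sequence")


def codon_to_genomic(aa_pos: int, cds_exons: Sequence[Exon],
                     strand: str) -> List[int]:
    """Map an amino acid position to the genomic positions of its codon."""
    first = 3 * aa_pos - 2  # CDS position of the codon's first base
    return sorted(cds_to_genomic(p, cds_exons, strand)
                  for p in range(first, first + 3))
```

For a plus-strand toy transcript with coding exons at 100-109 and 200-209, c.11 maps to position 200, and codon 4 (c.10 to c.12) maps to positions 109, 200, and 201, spanning the junction.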

Classification thresholds:

For CIMRA assays (Drost 2018, Drost 2020, Rath 2022), the Odds for Pathogenicity (OddsPath) is classified using Tavtigian thresholds:

For the deep mutational scanning data (Jia 2021), the loss-of-function (LOF) score is classified as:

Data sources:

Data summary:

Classification breakdown:

See the makedoc for full build steps and the build scripts used to generate all tracks in this hub.

PMS2 Pseudogene Caution

PMS2CL is the dominant pseudogene of PMS2, located at chr7:6,735,304-6,751,601 (hg38) / chr7:6,774,935-6,791,232 (hg19). It arose from an inverted duplication of the 3' half of PMS2 and shares high sequence identity with PMS2 exons 9 and 11-15. Active gene conversion between the two paralogs can produce hybrid alleles, and short reads from these exons can be mismapped between PMS2 and PMS2CL. PMS2CL lacks PMS2 exon 10, which is therefore commonly used as a gene-specific anchor for long-range PCR-based confirmation methods.

Per-exon homology classifications in this track are drawn from the published literature on PMS2/PMS2CL paralogy (see References: Clendenning 2006, Hayward 2007, van der Klift 2010, Vaughn 2011). Exon coordinates are queried from NCBI RefSeq transcript NM_000535.7. The track also includes one entry for the PMS2CL pseudogene region itself.

PMS2CL Paralog Variants

Curated PMS2 variants from the InSiGHT Curated Vars track were projected onto PMS2CL coordinates using the cDNA alignment between NM_000535.7 and NR_002217.1. The two transcripts are essentially gap-free in the homologous region with a near-constant offset (PMS2 c. = PMS2CL n. + 1060, with slight adjustments for exon 9 and one 1-bp indel in PMS2 exon 11). Projected items are placed at the genomic location of the equivalent PMS2CL nucleotides. Intronic variants and variants spanning exon boundaries with intronic positions are skipped, as are variants in PMS2 exons 1-8, 10, or beyond c.2798 (which have no PMS2CL counterpart). The build is part of the weekly ClinVar update.
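The constant-offset part of this projection can be sketched in a few lines. The example below is deliberately simplified and is not the build script: it applies only the stated offset of 1060 and the c.2798 upper bound, and it ignores the exon 9 adjustment, the 1-bp indel in exon 11, and the exclusion of exons 1-8 and 10, all of which require exon-level coordinates not given here. The function and constant names are hypothetical.

```python
from typing import Optional

PMS2_TO_PMS2CL_OFFSET = 1060  # from the text: PMS2 c. = PMS2CL n. + 1060
LAST_PROJECTED_C_POS = 2798   # positions beyond c.2798 have no counterpart


def project_to_pms2cl(c_pos: int) -> Optional[int]:
    """Project a PMS2 c. position to an approximate PMS2CL n. position.

    Simplification: constant offset only. Exon-specific adjustments and
    the exclusion of PMS2 exons 1-8 and 10 are NOT modeled.
    """
    if c_pos > LAST_PROJECTED_C_POS:
        return None
    n_pos = c_pos - PMS2_TO_PMS2CL_OFFSET
    return n_pos if n_pos >= 1 else None
```

Under this simplification, PMS2 c.2000 corresponds to PMS2CL n.940, and any position past c.2798 returns None.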

Data Access

The data underlying these tracks can be explored interactively using the UCSC Table Browser or downloaded from the track hub directory.

For the most current InSiGHT VCEP specifications, please visit the ClinGen CSpec Registry.

Credits

This track hub was created by the UCSC Genome Browser team in collaboration with the InSiGHT Variant Curation Expert Panel.

For questions about the data, please contact the InSiGHT VCEP coordinators.

References

Drost M, Tiersma Y, Thompson BA, Frederiksen JH, Keijzers G, Glubb D, Kathe S, Osinga J, Westers H, Pappas L et al. A functional assay-based procedure to classify mismatch repair gene variants in Lynch syndrome. Genet Med. 2019 Jul;21(7):1486-1496. PMID: 30504929; PMC: PMC7901556

Drost M, Tiersma Y, Glubb D, Kathe S, van Hees S, Calléja F, Zonneveld JBM, Boucher KM, Ramlal RPE, Thompson BA et al. Two integrated and highly predictive functional analysis-based procedures for the classification of MSH6 variants in Lynch syndrome. Genet Med. 2020 May;22(5):847-856. PMID: 31965077; PMC: PMC7200593

Jia X, Burugula BB, Chen V, Lemons RM, Jayakody S, Maksutova M, Kitzman JO. Massively parallel functional testing of MSH2 missense variants conferring Lynch syndrome risk. Am J Hum Genet. 2021 Jan 7;108(1):163-175. PMID: 33357406; PMC: PMC7820803

Rath A, Radecki AA, Rahman K, Gilmore RB, Hudson JR, Cenci M, Tavtigian SV, Grady JP, Heinen CD. A calibrated cell-based functional assay to aid classification of MLH1 DNA mismatch repair gene variants. Hum Mutat. 2022 Dec;43(12):2295-2307. PMID: 36054288; PMC: PMC9772141

Clendenning M, Hampel H, LaJeunesse J, Lindblom A, Lockman J, Nilbert M, Senter L, Sotamaa K, de la Chapelle A. Long-range PCR facilitates the identification of PMS2-specific mutations. Hum Mutat. 2006 May;27(5):490-5. PMID: 16619239

Hayward BE, De Vos M, Talseth-Palmer BA, Meldrum CJ, Watson JE, Phillips SM, Bowman R, Scott RJ, Bonthron DT. Extensive gene conversion at the PMS2 DNA mismatch repair locus. Hum Mutat. 2007 Apr;28(4):424-30. PMID: 17094075

van der Klift HM, Tops CM, Bik EC, Boogaard MW, Borgstein AM, Hansson KB, Ausems MG, Gomez Garcia E, Hes FJ, Hofstra RM et al. Quantification of sequence exchange events between PMS2 and PMS2CL provides a basis for improved mutation scanning of Lynch syndrome patients. Hum Mutat. 2010 May;31(5):578-87. PMID: 20186688

Vaughn CP, Robles J, Swensen JJ, Miller CE, Lyon E, Mao R, Bayrak-Toydemir P, Samowitz WS. Avoidance of pseudogene interference in the detection of 3' deletions in PMS2. Hum Mutat. 2011 Sep;32(9):1063-71. PMID: 21618646

Additional resources: