The Consortium of Long Read Sequencing Database (CoLoRSdb) catalogs both small genetic variants (single nucleotide polymorphisms and short insertions and deletions) and structural variants (insertions, deletions, and inversions) discovered from long-read whole genome sequencing. Compared to short-read data, long reads improve variant detection sensitivity and breakpoint resolution, especially in repetitive regions and complex loci.
For more information, see the CoLoRSdb website: https://colorsdb.org/
The CoLoRSdb track can be explored interactively using the REST API, the Table Browser, or the Data Integrator. The VCF and bigBed files are available from our downloads directory. The original VCF files are part of the v1.2.0 dataset, and documentation is available on Zenodo: https://zenodo.org/records/14814308
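As a rough illustration of REST API access, the sketch below builds a region query against the UCSC Genome Browser endpoint /getData/track. The assembly name (hg38) and the track name (colorsDbExampleTrack) are assumptions made for illustration only; this page does not state the actual track identifiers, so look them up before running a real query. Coordinates follow the API convention of a 0-based start and an exclusive end.

```python
import json
from urllib.request import urlopen

UCSC_API = "https://api.genome.ucsc.edu"


def build_track_url(genome, track, chrom, start, end):
    """Build a /getData/track URL for one region (0-based start, exclusive end)."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    params = [
        ("genome", genome),
        ("track", track),
        ("chrom", chrom),
        ("start", start),
        ("end", end),
    ]
    # The UCSC REST API documentation separates parameters with semicolons.
    query = ";".join(f"{key}={value}" for key, value in params)
    return f"{UCSC_API}/getData/track?{query}"


def fetch_track(url, timeout=30):
    """Download the URL and decode the JSON response (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # "hg38" and "colorsDbExampleTrack" are hypothetical placeholders,
    # not identifiers confirmed by this page.
    url = build_track_url("hg38", "colorsDbExampleTrack", "chr1", 1_000_000, 1_010_000)
    print(url)
    # data = fetch_track(url)  # uncomment once a real track name is substituted
```

The network call is left commented out so the script runs offline. For bulk analysis, the VCF and bigBed files in the downloads directory are likely a better fit than repeated API queries.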
Thanks to Mike Schatz, Evan Eichler, and all CoLoRSdb investigators for generating and making the data publicly available.
Poplin R, Chang PC, Alexander D, Schwartz S, Colthurst T, Ku A, Newburger D, Dijamco J, Nguyen N, Afshar PT et al. A universal SNP and small-indel variant caller using deep neural networks. Nat Biotechnol. 2018 Nov;36(10):983-987. DOI: 10.1038/nbt.4235; PMID: 30247488
Yun T, Li H, Chang PC, Lin MF, Carroll A, McLean CY. Accurate, scalable cohort variant calls using DeepVariant and GLnexus. Bioinformatics. 2021 Apr 5;36(24):5582-5589. DOI: 10.1093/bioinformatics/btaa1081; PMID: 33399819; PMC: PMC8023681
Kirsche M, Prabhu G, Sherman R, Ni B, Battle A, Aganezov S, Schatz MC. Jasmine and Iris: population-scale structural variant comparison and analysis. Nat Methods. 2023 Mar;20(3):408-417. DOI: 10.1038/s41592-022-01753-3; PMID: 36658279; PMC: PMC10006329
Eisfeldt J, Ameur A, Lenner F, Ten Berk de Boer E, Ek M, Wincent J, Vaz R, Ottosson J, Jonson T, Ivarsson S et al. A national long-read sequencing study on chromosomal rearrangements uncovers hidden complexities. Genome Res. 2024 Nov 20;34(11):1774-1784. DOI: 10.1101/gr.279510.124; PMID: 39472022; PMC: PMC11610602
Lake JA, Consortium of Long Read Sequencing (CoLoRS). Consortium of Long Read Sequencing Database (CoLoRSdb) (v1.2.0) [Data set]. Zenodo. 2025 Feb 5. DOI: 10.5281/zenodo.14814308