cytoBand Chromosome Band bed 4 + Chromosome Bands Localized by FISH Mapping Clones 0 1 0 0 0 127 127 127 0 0 0
\ The chromosome band track represents the approximate \ location of bands seen on Giemsa-stained chromosomes.\ Chromosomes are displayed in the browser with the short arm first. \ Cytologically identified bands on the chromosome are numbered outward \ from the centromere on the short (p) and long (q) arms. At low resolution, \ bands are classified using the nomenclature \ [chromosome][arm][band], where band is a \ single digit. Examples of bands on chromosome 3 include 3p2, 3p1, cen, 3q1, \ and 3q2. At a finer resolution, some of the bands are subdivided into \ sub-bands, adding a second digit to the band number, e.g. 3p26. This \ resolution produces about 500 bands. A final subdivision into a \ total of 862 sub-bands is made by adding a period and another digit to the \ band, resulting in 3p26.3, 3p26.2, etc.
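The band naming scheme described above can be illustrated with a small parser. This is only a sketch: the regular expression and the `parse_band` helper are assumptions made for illustration and are not part of the browser; the pattern follows the description above (one band digit, an optional sub-band digit, and an optional period plus one more digit).

```python
import re

# Illustrative pattern for names such as "3p2", "3p26" and "3p26.3".
# The pattern is an assumption based on the text, not browser code.
BAND_RE = re.compile(
    r"^(?P<chrom>[0-9]{1,2}|X|Y)"   # chromosome
    r"(?P<arm>[pq])"                # short (p) or long (q) arm
    r"(?P<band>\d)"                 # low-resolution band digit
    r"(?P<sub>\d)?"                 # optional sub-band digit
    r"(?:\.(?P<subsub>\d))?$"       # optional period and final digit
)


def parse_band(name):
    """Split a band name into its components (hypothetical helper)."""
    m = BAND_RE.match(name)
    if m is None:
        raise ValueError(f"not a recognised band name: {name!r}")
    return {
        "chrom": m.group("chrom"),
        "arm": m.group("arm"),
        "band": m.group("band"),
        "sub_band": m.group("sub"),
        "sub_sub_band": m.group("subsub"),
    }


if __name__ == "__main__":
    for example in ("3p2", "3p26", "3p26.3"):
        print(example, parse_band(example))
```

Names such as "cen" do not follow the [chromosome][arm][band] pattern, so this sketch rejects them with a `ValueError`.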
\ \\ A full description of the method by which the chromosome band locations are \ estimated can be found in Furey, T.S., and Haussler, D.\ Integration of the cytogenetic map with the draft human genome \ sequence, Hum. Mol. Gen., 12(9), 1037-1044 (2003).
\\ Barbara Trask, Vivian Cheung, Norma Nowak and others in the BAC Resource\ Consortium used fluorescent in-situ hybridization (FISH) to determine a \ cytogenetic location for large genomic clones on the chromosomes.\ The results from these experiments are the primary source of information used\ in estimating the chromosome band locations.\ For more information about the BAC Resource Consortium, see \ Integration of cytogenetic landmarks into the draft sequence of\ the human genome, Nature, 409, 953-958 (2001) and the \ accompanying web site,\ Human BAC Resource.
\\ BAC clone placements in the human sequence are determined at UCSC using a \ combination of full BAC clone sequence, BAC end sequence, and STS marker \ information.
\ \\ We would like to thank all the labs that have contributed to this resource:\
\ This track shows locations of Sequence Tagged Site (STS) markers\ along the draft assembly. These markers have been mapped using \ either genetic mapping (Genethon, Marshfield, and deCODE maps),\ radiation hybridization mapping (Stanford, Whitehead RH, and GeneMap99 maps) \ or YAC mapping (the Whitehead YAC map) techniques.
\\ Genetic map markers are shown in blue; radiation hybrid map markers are shown \ in black. When a marker maps to multiple positions in the genome, it is \ displayed in a lighter color.
\ \\ This track has a filter that can be used to change the color or \ include/exclude the display of a dataset from an individual lab. This is \ helpful when many items are shown in the track display, especially when only \ some are relevant to the current task. The filter is located at the top of \ the track description page, which is accessed via the small button to the \ left of the track's graphical display or through the link on the track's \ control menu. To use the filter:\
\ When you have finished configuring the filter, click the Submit \ button.
\ \\ Many thanks to the researchers who worked on these\ maps, and to Greg Schuler, Arek Kasprzyk, Wonhee Jang,\ Terry Furey and Sanja Rogic for helping\ process the data. Additional data on the individual maps can be\ found at the following links:\
\ This track shows the location of fluorescent in situ hybridization \ (FISH)-mapped clones along the draft assembly sequence. The locations of \ these clones were contributed as a part of the BAC Consortium paper \ Cheung, V.G. et al. (2001) in the References section below.
\\ More information about the BAC clones, including how they may be obtained, \ can be found at the \ Human BAC Resource and the \ Clone Registry web sites hosted by \ NCBI.\ To view Clone Registry information for a clone, click on the clone name at \ the top of the details page for that item.
\ \\ This track has a filter that can be used to change the color or \ include/exclude the display of a dataset from an individual lab. This is \ helpful when many items are shown in the track display, especially when only \ some are relevant to the current task. The filter is located at the top of \ the track description page, which is accessed via the small button to the \ left of the track's graphical display or through the link on the track's \ control menu. To use the filter:\
\ When you have finished configuring the filter, click the Submit \ button.
\ \\ We would like to thank all of the labs that have contributed to this resource:\
\ Cheung, V.G. et al. \ Integration of cytogenetic landmarks into the draft sequence of \ the human genome, Nature 409, 953-958 (2001).
\ map 1 recombRate Recomb Rate bed 4 + Recombination Rate from deCODE, Marshfield, or Genethon Maps (deCODE default) 0 8 0 0 0 127 127 127 0 0 0\ The recombination rate track represents\ calculated sex-averaged rates of recombination based on either the\ deCODE, Marshfield, or Genethon genetic maps. By default, the deCODE\ map rates are displayed. Female- and male-specific recombination\ rates, as well as rates from the Marshfield and Genethon maps, can\ also be displayed by choosing the appropriate filter option on the track \ description page.
\ \\ The deCODE genetic map was created at \ deCODE Genetics and is \ based on 5,136 microsatellite markers for 146 families with a total\ of 1,257 meiotic events. For more information on this map, see\ Kong, A. et al. (2002) in the References section below.
\\ The Marshfield genetic map was created at the \ Center for Medical Genetics and is based on 8,325 short \ tandem repeat polymorphisms (STRPs) for 8 CEPH families consisting of 134\ individuals with 186 meioses. For more information on this map, see \ Broman, K.W. et al. (1998) in the References section below.
\\ The Genethon genetic map was created at \ Genethon and is based on 5,264 microsatellites for 8 CEPH \ families consisting of 134 individuals with 186 meioses. For more information \ on this map, see \ Dib, C. et al. (1996) in the References section below.
\\ Each base is assigned the recombination rate calculated by\ assuming a linear genetic distance across the immediately flanking\ genetic markers. The recombination rate assigned to each 1 Mb window\ is the average recombination rate of the bases contained within the\ window.
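The two-step calculation described above (a per-base rate from the flanking markers, then an average over each 1 Mb window) might be sketched as follows. The function name, the input layout (a list of `(bp_position, cM_position)` marker pairs), and the choice to average only over bases that lie between two markers are assumptions made for this illustration, not details taken from the track's production pipeline.

```python
def window_rates(markers, window=1_000_000):
    """Average recombination rate (cM/Mb) per window.

    markers: iterable of (bp_position, cM_position) pairs on one chromosome.
    Returns a dict mapping window start (bp) to the mean rate in that window.
    """
    markers = sorted(markers)

    # Every base between two adjacent markers gets the same rate, because
    # genetic distance is assumed to grow linearly between them.
    intervals = []
    for (p1, c1), (p2, c2) in zip(markers, markers[1:]):
        if p2 > p1:
            intervals.append((p1, p2, (c2 - c1) / ((p2 - p1) / 1e6)))

    rates = {}
    if not intervals:
        return rates

    first = intervals[0][0] // window
    last = (intervals[-1][1] - 1) // window
    for w in range(first, last + 1):
        w_start, w_end = w * window, (w + 1) * window
        weighted, covered = 0.0, 0
        for start, end, rate in intervals:
            overlap = min(end, w_end) - max(start, w_start)
            if overlap > 0:
                weighted += overlap * rate
                covered += overlap
        # Assumption: bases without flanking markers are left out of the mean.
        if covered:
            rates[w_start] = weighted / covered
    return rates


if __name__ == "__main__":
    demo = [(0, 0.0), (500_000, 1.0), (2_000_000, 2.5)]
    print(window_rates(demo))
```

In the demo, the first 500 kb has a rate of 2.0 cM/Mb and the remaining 1.5 Mb has a rate of 1.0 cM/Mb, so the first window averages the two rates equally.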
\ \\ This track has a filter that can be used to change the map or\ gender-specific rate displayed. The filter is located at the top of the track \ description page, which is accessed via the small button to the left of \ the track's graphical display or through the link on the track's control menu.\ To view a particular map or gender-specific rate, select the corresponding\ option from the "Map Distances" pulldown list. By default, the \ browser displays the deCODE sex-averaged distances.
\\ When you have finished configuring the filter, click the Submit \ button.
\ \\ This track was produced at UCSC using data that are freely available for\ the Genethon, Marshfield, and deCODE genetic maps (see above links). Thanks\ to all who played a part in the creation of these maps.
\ \\ Broman, K.W., Murray, J.C., Sheffield, V.C., White, R.L. and Weber, J.L.\ Comprehensive human genetic maps: Individual and sex-specific \ variation in recombination, American Journal of Human Genetics\ 63, 861-869 (1998).
\\ Dib, C., Faure, S., Fizames, C., Samson, D., Drouot, N., Vignal, A., \ Millasseau, P., Marc, S., Hazan, J., Seboun, E., Lathrop, M., Gyapay, G., \ Morissette, J., and Weissenbach, J. \ A comprehensive genetic map of the human genome based on 5,264 \ microsatellites, \ Nature 380(6570), 152-154 (1996).
\\ Kong, A., Gudbjartsson, D.F., Sainz, J., Jonsdottir, G.M., Gudjonsson, S.A., \ Richardsson, B., Sigurdardottir, S., Barnard, J., Hallbeck, B., Masson, G., \ Shlien, A., Palsson, S.T., Frigge, M.L., Thorgeirsson, T.E., Gulcher, J.R., \ and Stefansson, K.\ A high-resolution recombination map of the human genome,\ Nature Genetics, 31(3), 241-247 (2002).
\ map 1 exonArrows off\ ctgPos Map Contigs ctgPos Physical Map Contigs 0 9 150 0 0 202 127 127 0 0 0\ This track shows the locations of $organism contigs on the physical map. \ The underlying data is derived from the NCBI seq_contig.md file \ that accompanies this assembly. All contigs are "+" oriented in\ the assembly.
\ \\ For $organism genome reference sequences dated April 2003 and later,\ the individual chromosome sequencing centers are responsible\ for preparing the assembly of their chromosomes in \ AGP format. The\ files provided by these centers are checked and validated at NCBI, and\ form the basis for the seq_contig.md file that defines the physical \ map contigs.
\\ For more information on the human genome assembly process, see \ The NCBI Handbook.
\ map 0 gold Assembly bed 3 + Assembly from Fragments 0 10 150 100 30 230 170 40 0 0 0\ This track shows the draft assembly of the $organism genome.\ This assembly merges contigs from overlapping drafts and\ finished clones into longer sequence contigs. The sequence\ contigs are ordered and oriented when possible by mRNA, EST,\ paired plasmid reads (from the SNP Consortium) and BAC end\ sequence pairs.
\In dense mode, this track depicts the path through the draft and \ finished clones (aka the golden path) used to create the assembled sequence. \ Clone boundaries are distinguished by the use of alternating gold and brown \ coloration. Where gaps\ exist in the path, spaces are shown between the gold and brown\ blocks. If the relative order and orientation of the contigs\ between the two blocks is known, a line is drawn to bridge the\ blocks.
\\ Clone Type Key:\
\ This track depicts gaps in the assembly. Most of these gaps - with the\ exception of intractable heterochromatic, centromeric, telomeric, and short-arm \ gaps - have been closed during the finishing process, although a small number \ still remain. \
\ Gaps are represented as black boxes in this track.\ If the relative order and orientation of the contigs on either side\ of the gap is known, it is a bridged gap. In this case, a white line is \ drawn through the black box representing the gap and the gap is labeled \ "yes". \
\This assembly contains the following types of gaps:\
\ In dense display mode, this track shows the coverage level of \ the genome. Finished regions are depicted in black. Draft regions \ are shown in various shades of gray that correspond\ to the level of coverage. \
\ In full display mode, this track shows the position of each clone that aligns\ to the genome sequence. Finished clones are depicted in black, and unfinished\ clones are colored gray. NOTE: Fragment positions in unfinished clones are no \ longer delineated.\
\ map 0 bacEndPairs BAC End Pairs bed 6 + BAC End Pairs 0 15 0 0 0 127 127 127 0 0 0\ Bacterial artificial chromosomes (BACs) are a key part of many \ large-scale sequencing projects. A BAC typically consists of 50 - 600 kb of\ DNA. During the early phase of a sequencing project, it is common\ to sequence a single read (approximately 500 bases) off each end of\ a large number of BACs. Later on in the project, these BAC end reads\ can be mapped to the genome sequence.
\\ This track shows these mappings\ in cases where both ends could be mapped. These BAC end pairs can\ be useful for validating the assembly over relatively long ranges. In some\ cases, the BACs are useful biological reagents. This track can also be\ used for determining which BAC contains a given gene, useful information\ for certain wet lab experiments.
\\ A valid pair of BAC end sequences must be\ at least 50 kb but no more than 600 kb away from each other. \ The orientation of the first BAC end sequence must be "+" and\ the orientation of the second BAC end sequence must be "-".
\\ The scoring scheme used for this annotation assigns 1000 to an alignment \ when the BAC end pair aligns to only one location in the genome (after \ filtering). When a BAC end pair or clone aligns to multiple locations, the \ score is calculated as 1500/(number of alignments).
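The pairing rule and the scoring scheme above can be written out as a short sketch. The `EndAlignment` record, both function names, the requirement that the two ends lie on the same chromosome, and the way the distance is measured (start of the first end to the end of the second) are all assumptions for illustration; the text specifies only the distance limits, the orientations, and the score formula.

```python
from typing import NamedTuple


class EndAlignment(NamedTuple):
    """One aligned end sequence (hypothetical record layout)."""
    chrom: str
    start: int
    end: int
    strand: str


def is_valid_pair(first, second, min_span=50_000, max_span=600_000):
    """Check the distance and orientation rules for a BAC end pair."""
    if first.chrom != second.chrom:          # assumption: same chromosome
        return False
    if (first.strand, second.strand) != ("+", "-"):
        return False
    span = second.end - first.start          # assumption: outer distance
    return min_span <= span <= max_span


def pair_score(n_alignments):
    """1000 for a unique placement, otherwise 1500 / number of alignments."""
    if n_alignments < 1:
        raise ValueError("n_alignments must be at least 1")
    return 1000.0 if n_alignments == 1 else 1500.0 / n_alignments


if __name__ == "__main__":
    a = EndAlignment("chr1", 1_000, 1_500, "+")
    b = EndAlignment("chr1", 150_000, 150_500, "-")
    print(is_valid_pair(a, b), pair_score(1), pair_score(3))
```

The distance limits are parameters, so the same check can be reused with other insert-size ranges.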
\ \\ BAC end sequences are placed on the assembled sequence using\ Jim Kent's blat program.
\ \\ Additional information about the clone, including how it\ can be obtained, may be found at the \ NCBI Clone Registry. To view the registry entry for a \ specific clone, open the details page for the clone and click on its name at \ the top of the page.
\ map 1 exonArrows off\ fosEndPairs Fosmid End Pairs bed 6 + Fosmid End Pairs 0 18 0 0 0 90 90 90 0 0 0\ A valid pair of fosmid end sequences must be\ at least 30 kb but no more than 50 kb away from each other. \ The orientation of the first fosmid end sequence must be "+" and\ the orientation of the second fosmid end sequence must be "-".
\ \End sequences were trimmed at the NCBI using\ ssahaCLIP written by Jim Mullikin. Trimmed fosmid end sequences were\ placed on the assembled sequence using Jim Kent's \ blat \ program.
\ \Sequencing of the fosmid ends was done at the \ Eli & Edythe L. Broad\ Institute of MIT and Harvard University. Clones are available through the\ BACPAC Resources\ Center at Children's Hospital Oakland Research Institute (CHORI).\
\ map 1 exonArrows off\ gcPercent GC Percent bed 4 + Percentage GC in 20,000-Base Windows 0 23 0 0 0 127 127 127 1 0 0\ The GC percent track shows the percentage of G (guanine) and C (cytosine) bases\ in a 20,000 base window. Windows with high GC content are drawn more darkly \ than windows with low GC content. High GC content is typically associated with \ gene-rich areas.\
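As a rough illustration of the windowed GC calculation described above, the following sketch computes the percentage of G and C bases in consecutive fixed-size windows. The function name is hypothetical, and the treatment of ambiguous bases (counted in the window length but never as G or C) and of the final partial window are assumptions; the text does not say how the track handles either case.

```python
def gc_percent_windows(seq, window=20_000):
    """Yield (window_start, percent_gc) for consecutive windows of seq."""
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window].upper()
        gc = chunk.count("G") + chunk.count("C")
        # Assumption: every character in the window counts toward its length.
        yield start, 100.0 * gc / len(chunk)


if __name__ == "__main__":
    # Small window used here only to keep the demo readable.
    for start, pct in gc_percent_windows("GGCCAATTGCAT", window=4):
        print(start, pct)
```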
\\ This track was generated at UCSC.\ map 1 knownGene Known Genes genePred knownGenePep knownGeneMrna Known Genes Based on SWISS-PROT, TrEMBL, mRNA, and RefSeq 3 34 12 12 120 133 133 187 0 0 0
\ The UCSC Known Genes track shows known protein-coding genes based on \ protein data from SWISS-PROT, TrEMBL, and TrEMBL-NEW and their\ corresponding mRNAs from \ GenBank.
\ \\ This track follows the display conventions for\ gene prediction\ tracks. Black coloring indicates features that have corresponding entries\ in the Protein Databank (PDB). Blue indicates features associated with\ mRNAs from NCBI RefSeq or (dark blue) items having associated proteins in\ the SWISS-PROT database. The variation in blue shading of RefSeq items\ corresponds to the level of review the RefSeq record has undergone:\ predicted (light), provisional (medium), or reviewed (dark).
\\ This track contains an optional codon coloring\ feature that allows users to quickly validate and compare gene predictions.\ To display codon colors, select the genomic codons option from the\ Color track by codons pull-down menu. Click\ here for more\ information about this feature.
\ \\ mRNA sequences were aligned against the $organism genome using blat. When a \ single mRNA aligned in multiple places, only alignments having at least 98% \ base identity with the genomic sequence were kept. This set of mRNA \ alignments was further reduced by keeping only those mRNAs referenced by a \ protein in SWISS-PROT, TrEMBL, or TrEMBL-NEW.
\\ Among multiple mRNAs referenced by a single protein, the best mRNA was \ selected, based on a quality score derived from its length, the level of the\ match between its translation and the protein sequence, and its release date.\ The resulting mRNA and protein pairs were further filtered by removing \ short invalid entries and consolidating entries with identical CDS regions.\
\\ Finally, RefSeq entries derived from DNA sequences instead of \ mRNA sequences were added to produce the final data set shown in this track. \ Disease annotations were obtained from SWISS-PROT.
\ \\ The Known Genes track was produced at UCSC based primarily on cross-references\ between proteins from \ SWISS-PROT \ (including TrEMBL and TrEMBL-NEW) and mRNAs from \ GenBank\ contributed by scientists worldwide. \ NCBI RefSeq \ data were also included in this track.
\ \\ The UniProt data have the following terms of use, UniProt copyright(c) 2002 - \ 2004 UniProt consortium:
\\ For non-commercial use, all databases and documents in the UniProt FTP\ directory may be copied and redistributed freely, without advance\ permission, provided that this copyright statement is reproduced with\ each copy.
\\ For commercial use, all databases and documents in the UniProt FTP\ directory except the files\
\ From January 1, 2005, all databases and documents in the UniProt FTP\ directory may be copied and redistributed freely by all entities,\ without advance permission, provided that this copyright statement is\ reproduced with each copy.
\ \\ Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J,\ Wheeler DL.\ GenBank: update.\ Nucleic Acids Res. 2004 Jan 1;32:D23-6.
\\ Hsu F, Kent WJ, Clawson H, Kuhn RM, Diekhans M, Haussler D.\ The UCSC Known Genes.\ Bioinformatics. 2006 May 1;22(9):1036-46.
\\ Kent WJ.\ BLAT - the BLAST-like alignment tool.\ Genome Res. 2002 Apr;12(4):656-64.
\ genes 1 baseColorDefault genomicCodons\ baseColorUseCds given\ directUrl /cgi-bin/hgGene?hgg_gene=%s&hgg_chrom=%s&hgg_start=%d&hgg_end=%d&hgg_type=%s&db=%s\ hgGene on\ hgsid on\ idXref kgAlias kgID alias\ refGene RefSeq Genes genePred refPep refMrna RefSeq Genes 1 35 12 12 120 133 133 187 0 0 0\ The RefSeq Genes track shows known protein-coding genes taken from \ the NCBI mRNA reference sequences collection (RefSeq). On assemblies in \ which incremental GenBank downloads are supported, the data underlying this \ track are updated nightly.
\ \\ This track follows the display conventions for \ gene prediction \ tracks.\ The color shading indicates the level of review the RefSeq record has \ undergone: predicted (light), provisional (medium), reviewed (dark). \ In some assemblies, non-coding RNA genes are shown in a separate track.
\\ The item labels and display colors of features within this track can be\ configured through the controls at the top of the track description page. \ This page is accessed via the small button to the left of the track's \ graphical display or through the link on the track's control menu. \
\ RefSeq mRNAs were aligned against the $organism genome using blat; those\ with an alignment of less than 15% were discarded. When a single mRNA \ aligned in multiple places, the alignment having the highest base identity \ was identified. Only alignments having a base identity level within 0.1% of \ the best and at least 96% base identity with the genomic sequence were kept.\
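The filtering rules above can be sketched as a single function. The dictionary keys (`coverage`, `identity`), the function name, and the reading of "within 0.1% of the best" as 0.1 percentage points of base identity are assumptions for illustration only.

```python
def filter_alignments(alignments, min_coverage=0.15,
                      min_identity=96.0, near_best=0.1):
    """Keep the near-best, high-identity alignments of one mRNA.

    alignments: list of dicts with 'coverage' (fraction of the mRNA aligned)
    and 'identity' (percent base identity with the genome).
    """
    # Step 1: discard alignments covering less than 15% of the mRNA.
    kept = [a for a in alignments if a["coverage"] >= min_coverage]
    if not kept:
        return []

    # Step 2: find the highest base identity among the survivors.
    best = max(a["identity"] for a in kept)

    # Step 3: keep alignments close to the best and above the identity floor.
    return [
        a for a in kept
        if a["identity"] >= min_identity and best - a["identity"] <= near_best
    ]


if __name__ == "__main__":
    demo = [
        {"id": "aln1", "coverage": 0.95, "identity": 99.50},
        {"id": "aln2", "coverage": 0.90, "identity": 99.45},
        {"id": "aln3", "coverage": 0.80, "identity": 98.00},
        {"id": "aln4", "coverage": 0.10, "identity": 99.90},
    ]
    print([a["id"] for a in filter_alignments(demo)])
```

In the demo, the last alignment is dropped for low coverage before the best identity is determined, so it does not affect which alignments count as near-best.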
\ \ \\ This track was produced at UCSC from mRNA sequence data\ generated by scientists worldwide and curated by the \ NCBI RefSeq project.
\ \\ Kent WJ.\ BLAT - the BLAST-like alignment tool.\ Genome Res. 2002 Apr;12(4):656-64.
\ \Pruitt KD, Tatusova T, Maglott DR. \ NCBI Reference Sequence (RefSeq): a curated non-redundant \ sequence database of genomes, transcripts and proteins. Nucleic Acids \ Res. 2005 Jan 1;33(Database issue):D501-4.\
\ genes 1 baseColorUseCds given\ idXref refLink mrnaAcc name\ vegaGene Vega Genes genePred vegaPep Vega Annotations 0 37 0 100 180 127 177 217 0 0 3 chr14,chr20,chr22, http://vega.sanger.ac.uk/Homo_sapiens/geneview?transcript=$$\ This track shows gene annotations from the Vertebrate Genome Annotation (Vega)\ database.
\\ The following information is excerpted from the\ Vertebrate Genome Annotation\ home page:
\\ "The Vega database\ is designed to be a central repository for high-quality, frequently updated\ manual annotation of different vertebrate finished genome sequence.\ Vega attempts to present consistent high-quality curation of the published\ chromosome sequences. Finished genomic sequence is analysed on a\ clone-by-clone basis using\ a combination of similarity searches against DNA and protein databases\ as well as a series of ab initio gene predictions (GENSCAN, Fgenes).\ The annotation is based on supporting evidence only."
\\ "In addition, comparative analysis using vertebrate datasets such as\ the Riken mouse cDNAs and Genoscope Tetraodon nigroviridis Ecores\ (Evolutionary Conserved Regions) are used for novel gene discovery."
\\ NOTE: VEGA annotations do not appear on every chromosome in this assembly.
\ \\ This track follows the display conventions for\ gene prediction\ tracks using the following color scheme to indicate the status of the gene\ annotation:\
\ The details pages show only the Vega gene type, not the transcript type.\ A single gene can have more than one transcript, and these can belong to\ different classes, so the gene as a whole is classified according to the\ transcript with the "highest" level of classification. Transcript\ type (and other details) may be found by clicking on the transcript\ identifier, which forms the outside link to the Vega transcript details page.\ Further information on the gene and transcript classification may be found\ here.\
\ \\ Thanks to Steve Searle at the\ Sanger Institute \ for providing the GTF and FASTA files for the Vega annotations. Vega gene annotations are \ generated by manual annotation from the following groups:\
\
Chromosome 6:\
\ The HAVANA group, \
\ Wellcome Trust Sanger Institute
\
\ Relevant publication: Mungall AJ et al.,\
The DNA sequence and analysis of human \
\ chromosome 6. \
Nature. 2003 Oct 23;425:805-11.
\
Chromosome 7:\
\ Hillier et al., \
\ University of Washington Genome Center
\
\ Relevant publication: Hillier LW et al., \
The DNA sequence of human \
\ chromosome 7. \
Nature. 2003 Jul 10;424:157-64.
\
Chromosome 9:\
\ The HAVANA group, \
\ Wellcome Trust Sanger Institute
\
Relevant publication: Humphray SJ et al., \
The DNA sequence and analysis of human chromosome 9. \
Nature. 2004 May 27;429:369-74.
\
Chromosome 10:\
\ The HAVANA group, \
\ Wellcome Trust Sanger Institute
\
\ Relevant publication: Deloukas P et al., \
The DNA sequence and comparative analysis of human chromosome 10. \
Nature. 2004 May 27;429:375-81.
\
Chromosome 13:\
\ The HAVANA group, \
\ Wellcome Trust Sanger Institute
\
\ Relevant publication: Dunham A et al., \
The DNA sequence and analysis of human chromosome 13. \
Nature. 2004 Apr 1;428:522-8.
\
Chromosome 14: \
\ \
\ Genoscope
\
\ Relevant publication: Heilig R et al., \
The DNA sequence and analysis of \
\ human chromosome 14. \
Nature. 2003 Feb 6;421:601-7.
\
Chromosome 20: \
\ The HAVANA Group, \
\ Wellcome Trust Sanger Institute
\
\ Relevant publication: Deloukas P et al., \
The DNA sequence and \
\ comparative analysis of human chromosome 20. \
Nature. 2001 Dec 20;414:865-71.
\
Chromosome 22: Chromosome 22 Group,\
\ \
\ Wellcome Trust Sanger Institute
\
\ Relevant publications:
\
\ — Collins JE et al., \
Reevaluating Human Gene Annotation: \
\ A Second-Generation Analysis of Chromosome 22. \
Genome Research. 2003 Jan;13(1):27-36.
\
\ — Dawson E et al., \
A \
\ first-generation linkage disequilibrium map of human chromosome 22. \
Nature. 2002 Aug 1;418:544-8.
\
\ — Dunham I, et al., \
The DNA sequence of human chromosome 22. \
Nature. 1999 Dec 2;402:489-95.
\
Chromosome X: \
\ The HAVANA Group, \
\ Wellcome Trust Sanger Institute
\
\ Relevant publication: Ross MT et al., \
The DNA sequence and \
\ comparative analysis of human chromosome X. \
Nature. 2005 Mar 17;434:325-37.
\ This track shows pseudogene annotations from the Vertebrate Genome Annotation \ (Vega) database.
\\ The following information is excerpted from the\ Vertebrate Genome Annotation\ home page:
\\ "The Vega database\ is designed to be a central repository for high-quality, frequently updated\ manual annotation of different vertebrate finished genome sequence.\ Vega attempts to present consistent high-quality curation of the published\ chromosome sequences. Finished genomic sequence is analysed on a\ clone-by-clone basis using\ a combination of similarity searches against DNA and protein databases\ as well as a series of ab initio gene predictions (GENSCAN, Fgenes).\ The annotation is based on supporting evidence only."
\\ "In addition, comparative analysis using vertebrate datasets such as\ the Riken mouse cDNAs and Genoscope Tetraodon nigroviridis Ecores\ (Evolutionary Conserved Regions) are used for novel gene discovery."
\\ NOTE: VEGA annotations do not appear on every chromosome in this assembly.
\ \\ This track follows the display conventions for\ gene prediction\ tracks using the following color scheme to indicate the status of the gene\ annotation:\
\ The details pages show only the Vega gene type, not the transcript type.\ A single gene can have more than one transcript, and these can belong to\ different classes, so the gene as a whole is classified according to the\ transcript with the "highest" level of classification. Transcript\ type (and other details) may be found by clicking on the transcript\ identifier, which forms the outside link to the Vega transcript details page.\ Further information on the gene and transcript classification may be found\ here.\
\ \\ Thanks to Steve Searle at the\ Sanger Institute \ for providing the GTF and FASTA files for the Vega annotations. Vega gene annotations are \ generated by manual annotation from the following groups:\
\
Chromosome 6:\
\ The HAVANA group, \
\ Wellcome Trust Sanger Institute
\
\ Relevant publication: Mungall AJ et al.,\
The DNA sequence and analysis of human \
\ chromosome 6. \
Nature. 2003 Oct 23;425:805-11.
\
Chromosome 7:\
\ Hillier et al., \
\ University of Washington Genome Center
\
\ Relevant publication: Hillier LW et al., \
The DNA sequence of human \
\ chromosome 7. \
Nature. 2003 Jul 10;424:157-64.
\
Chromosome 9:\
\ The HAVANA group, \
\ Wellcome Trust Sanger Institute
\
Relevant publication: Humphray SJ et al., \
The DNA sequence and analysis of human chromosome 9. \
Nature. 2004 May 27;429:369-74.
\
Chromosome 10:\
\ The HAVANA group, \
\ Wellcome Trust Sanger Institute
\
\ Relevant publication: Deloukas P et al., \
The DNA sequence and comparative analysis of human chromosome 10. \
Nature. 2004 May 27;429:375-81.
\
Chromosome 13:\
\ The HAVANA group, \
\ Wellcome Trust Sanger Institute
\
\ Relevant publication: Dunham A et al., \
The DNA sequence and analysis of human chromosome 13. \
Nature. 2004 Apr 1;428:522-8.
\
Chromosome 14: \
\ \
\ Genoscope
\
\ Relevant publication: Heilig R et al., \
The DNA sequence and analysis of \
\ human chromosome 14. \
Nature. 2003 Feb 6;421:601-7.
\
Chromosome 20: \
\ The HAVANA Group, \
\ Wellcome Trust Sanger Institute
\
\ Relevant publication: Deloukas P et al., \
The DNA sequence and \
\ comparative analysis of human chromosome 20. \
Nature. 2001 Dec 20;414:865-71.
\
Chromosome 22: Chromosome 22 Group,\
\ \
\ Wellcome Trust Sanger Institute
\
\ Relevant publications:
\
\ — Collins JE et al., \
Reevaluating Human Gene Annotation: \
\ A Second-Generation Analysis of Chromosome 22. \
Genome Research. 2003 Jan;13(1):27-36.
\
\ — Dawson E et al., \
A \
\ first-generation linkage disequilibrium map of human chromosome 22. \
Nature. 2002 Aug 1;418:544-8.
\
\ — Dunham I, et al., \
The DNA sequence of human chromosome 22. \
Nature. 1999 Dec 2;402:489-95.
\
Chromosome X: \
\ The HAVANA Group, \
\ Wellcome Trust Sanger Institute
\
\ Relevant publication: Ross MT et al., \
The DNA sequence and \
\ comparative analysis of human chromosome X. \
Nature. 2005 Mar 17;434:325-37.
\ These gene predictions were generated by Ensembl.
\ \\ For a description of the methods used in Ensembl gene prediction, refer to \ Hubbard, T. et al. (2002) in the References section below.
\ \\ Thanks to Ensembl for providing this annotation.
\ \\ Hubbard T, Barker D, Birney E, Cameron G, Chen Y, Clark L, Cox T, Cuff J,\ Curwen V, Down T, et al. \ The Ensembl genome database project.\ Nucleic Acids Res. 2002 Jan 1;30(1):38-41.
\ \ genes 1 acembly Acembly Genes genePred acemblyPep acemblyMrna AceView Gene Models With Alt-Splicing 1 41 155 0 125 205 127 190 0 0 0 http://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/av.cgi?db=hg15&l=$$\ This track shows AceView gene models constructed from\ mRNA, EST and genomic evidence by Danielle and Jean Thierry-Mieg\ and Vahan Simonyan using the \ Acembly program.
\ \\ This track follows the display conventions for \ gene prediction \ tracks. Gene models that fall into the "main" prediction class\ are displayed in purple; "putative" \ genes are displayed in pink.
\\ The track description page offers the following filter and configuration\ options:\
\ AceView attempts to find the best alignment of each mRNA/EST against the\ genome, and clusters the alignments into the least possible number of\ alternatively spliced transcripts. The reconstructed transcripts are then\ clustered into genes by simple transitive contact. To see the evidence that \ supports each transcript, click the "Outside Link" on an individual \ transcript's details page to access the NCBI AceView web site.
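Clustering by transitive contact can be illustrated with a small union-find sketch: if transcript A touches B and B touches C, all three end up in the same gene even when A and C do not touch directly. The definition of "contact" used here (overlapping genomic intervals on one chromosome and strand), the input layout, and the function name are assumptions for illustration; the text does not define contact, and this is not the AceView implementation.

```python
def cluster_by_contact(transcripts):
    """Group transcripts into clusters by transitive interval overlap.

    transcripts: list of (name, start, end) tuples, half-open coordinates,
    all assumed to lie on the same chromosome and strand.
    """
    parent = list(range(len(transcripts)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (_, s1, e1) in enumerate(transcripts):
        for j in range(i + 1, len(transcripts)):
            _, s2, e2 = transcripts[j]
            if s1 < e2 and s2 < e1:          # intervals overlap: "contact"
                parent[find(i)] = find(j)    # merge the two clusters

    clusters = {}
    for i, (name, _, _) in enumerate(transcripts):
        clusters.setdefault(find(i), []).append(name)
    return sorted(sorted(names) for names in clusters.values())


if __name__ == "__main__":
    demo = [("a", 0, 100), ("b", 90, 200), ("c", 190, 300), ("d", 500, 600)]
    print(cluster_by_contact(demo))
```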
\\ Each AceView transcript model has a gene cluster designation\ (alternate name) that is categorized into a prediction class\ of either main or \ putative.
\\
Prediction Class: main \
Class of genes that includes the protein coding genes (defined\
here by CDS > 100 amino acids) and all genes with at least one\
well-defined standard intron, i.e., an intron with a GT-AG or GC-AG\
boundary, supported by at least one clone matching exactly, with\
no ambiguous bases, and the 8 bases on either side of the intron \
identical to the genome. Genes with a CDS smaller than 100 amino acids are\
included in this class if they meet one of the following conditions: they \
have a NCBI RefSeq sequence (NM_#) or an OMIM identifier, or they encode a \
protein with BlastP homology (< 1e-3) to a cDNA-supported nematode AceView \
protein.
\
Prediction Class: putative\
Class of genes that have no standard intron and do not\
encode CDS of more than 100 amino acids, yet may be sufficiently useful to \
justify not disregarding them completely. Putative genes may be of two\
types: either those supported by more than six cDNA clones or those that\
encode a putative protein with an interesting annotation. Examples include\
a PFAM motif, a BlastP hit to a species other than itself (< 1e-3), \
a transmembrane domain or other rare and meaningful domains\
identified by Psort2, or a highly probable localization in a cell\
compartment (excluding cytoplasm and nucleus).
\ Thanks to Danielle and Jean \ Thierry-Mieg at NIH for providing this track.
\ \\ Thierry-Mieg D, Thierry-Mieg J. \ AceView: a comprehensive cDNA-supported gene and transcripts \ annotation.\ Genome Biol. 2006;7 Suppl 1:S12.1-14.
\ genes 1 urlLabel AceView Gene Summary:\ twinscan Twinscan genePred twinscanPep Twinscan Gene Predictions Using Mouse/Human Homology 0 45 0 100 100 0 50 50 0 0 0\ The Twinscan program predicts genes in a manner similar to Genscan, except \ that Twinscan takes advantage of genome comparisons to improve gene prediction\ accuracy. More information and a web server can be found at\ \ http://mblab.wustl.edu/. The Feb. 2002 (mm2) mouse assembly was used to\ create this annotation.
\ \\ This track follows the display conventions for\ gene prediction\ tracks.
\\ The track description page offers the following filter and configuration\ options:\
\ The Twinscan algorithm is described in Korf, I. et al. (2001) in the\ References section below.
\ \\ Thanks to Michael Brent's Computational Genomics Group at Washington\ University St. Louis for providing these data.
\ \\ Korf I, Flicek P, Duan D, Brent MR.\ Integrating genomic homology into gene structure prediction.\ Bioinformatics. 2001 Jun 1;17 Suppl 1:S140-8.
\ genes 1 sgpGene SGP Genes genePred sgpPep SGP Gene Predictions Using Mouse/Human Homology 0 47 0 90 100 127 172 177 0 0 0\ This track shows gene predictions from the SGP program, which is being developed at \ the Grup de Recerca en\ Informàtica Biomèdica (GRIB) at Institut Municipal d'Investigació Mèdica (IMIM) in \ Barcelona. To predict genes in a genomic\ query, SGP combines geneid predictions with tblastx comparisons of the \ genomic query against other genomic sequences. In this particular annotation, \ the Feb. 2003 (mm3) assembly of the mouse genome was used to find homology \ evidence between the two genomes.\
\\ Thanks to GRIB for providing these gene predictions.\
\ \ \ \ genes 1 softberryGene Fgenesh++ Genes genePred softberryPep Fgenesh++ Gene Predictions 0 48 0 100 0 127 177 127 0 0 0\ Fgenesh++ predictions are based on Softberry's gene-finding software.
\ \\ Fgenesh++ uses both hidden Markov models (HMMs) and protein similarity to \ find genes in a completely automated manner. For more information, see \ Solovyev, V.V. (2001) in the References section below.
\ \\ The Fgenesh++ gene predictions were produced by \ Softberry Inc. \ Commercial use of these predictions is restricted to viewing in \ this browser. Please contact Softberry Inc. to make arrangements for further \ commercial access.
\ \\ Solovyev, V.V. \ "Statistical approaches in Eukaryotic gene prediction" in the \ Handbook of Statistical Genetics (ed. Balding, D. et al.), \ 83-127. John Wiley & Sons, Ltd. (2001).
\ genes 1 geneid Geneid Genes genePred geneidPep Geneid Gene Predictions 0 49 0 90 100 127 172 177 0 0 0\ This track shows gene predictions from the geneid program developed at the \ Genome Bioinformatics \ Laboratory (GBL), which is part of the \ Grup de Recerca\ en Informàtica Biomèdica (GRIB) at the Institut Municipal d'Investigació \ Mèdica (IMIM) / Centre de Regulació Genòmica (CRG) in Barcelona.\ \ \
\\ Geneid is a program to predict genes in anonymous genomic sequences designed \ with a hierarchical structure. In the first step, splice sites, start and stop \ codons are predicted and scored along the sequence using Position Weight Arrays \ (PWAs). Next, exons are built from the sites. Exons are scored as the sum of the \ scores of the defining sites, plus the log-likelihood ratio of a \ Markov Model for coding DNA. Finally, from the set of predicted exons, the gene \ structure is assembled, maximizing the sum of the scores of the assembled exons. \
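The exon-scoring and assembly steps described above can be illustrated with a minimal Python sketch. It is not geneid's implementation: the `Exon` class, the numeric scores, and the simple non-overlap rule used for assembly are all hypothetical, and reading-frame compatibility between exons is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    start: int           # genomic start, 0-based inclusive (hypothetical layout)
    end: int             # genomic end, exclusive
    site_scores: tuple   # scores of the two defining sites (made-up values)
    coding_llr: float    # log-likelihood ratio from a coding Markov model (made-up)

    @property
    def score(self) -> float:
        # Exon score = sum of defining-site scores + coding log-likelihood ratio.
        return sum(self.site_scores) + self.coding_llr


def best_assembly(exons):
    """Choose non-overlapping exons that maximize the summed exon score.

    Simplifying assumption: the only assembly constraint is that exons
    must not overlap. Returns (total_score, list_of_exons).
    """
    exons = sorted(exons, key=lambda e: e.end)
    best = []  # best[i] = best (score, chain) using only exons[: i + 1]
    for i, exon in enumerate(exons):
        prev_score, prev_chain = 0.0, []
        # Find the best chain that ends before this exon starts.
        for j in range(i - 1, -1, -1):
            if exons[j].end <= exon.start:
                prev_score, prev_chain = best[j]
                break
        take = (prev_score + exon.score, prev_chain + [exon])
        skip = best[i - 1] if i > 0 else (0.0, [])
        best.append(take if take[0] > skip[0] else skip)
    return best[-1] if best else (0.0, [])
```

For example, `best_assembly` applied to three candidate exons, two of which overlap, keeps the higher-scoring compatible pair.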
\\ Thanks to GBL for providing these data.\
\ genes 1 genscan Genscan Genes genePred genscanPep Genscan Gene Predictions 0 50 170 100 0 212 177 127 0 0 0\ This track shows predictions from the \ Genscan program \ written by Chris Burge.\ The predictions are based on transcriptional, \ translational, and donor/acceptor splicing signals, as well as the length \ and compositional distributions of exons, introns and intergenic regions.
\ \\ This track follows the display conventions for \ gene prediction \ tracks. \
\ The track description page offers the following filter and configuration\ options:\
\ For a description of the Genscan program and the model that underlies it, \ refer to Burge and Karlin (1997) in the References section below. \ The splice site models used are described in more detail in Burge (1998)\ below.
\ \\ Burge C. \ Modeling Dependencies in Pre-mRNA Splicing Signals. \ In Salzberg S, Searls D, Kasif S, eds. \ Computational Methods in Molecular Biology, \ Elsevier Science, Amsterdam. 1998;127-163.
\\ Burge C, Karlin S. \ Prediction of Complete Gene Structures in Human Genomic DNA.\ J. Mol. Biol. 1997 Apr 25;268(1):78-94.
\ genes 1 rnaGene RNA Genes bed 6 + Non-coding RNA Genes (dark) and Pseudogenes (light) 0 52 170 80 0 230 180 130 0 0 0\ This track shows the location of non-protein coding RNA genes and\ pseudogenes. \
\ Feature types include:\
\ \
\
Eddy-tRNAscanSE (tRNA genes, Sean Eddy):
\
tRNAscan-SE 1.23 with default parameters.\
Score field contains tRNAscan-SE bit score; >20 is good, >50 is great.
\
Eddy-BLAST-tRNAlib (tRNA pseudogenes, Sean Eddy):
\
Wublast 2.0, with options "-kap wordmask=seg B=50000 \
W=8 cpus=1".\
Score field contains % identity in blast-aligned region.\
Used each of 602 tRNAs and pseudogenes predicted by tRNAscan-SE\
in the human oo27 assembly as queries. Kept all nonoverlapping\
regions that hit one or more of these with P <= 0.001.
\
Eddy-BLAST-snornalib (known snoRNAs and snoRNA pseudogenes, Steve Johnson):
\
Wublastn 2.0, with options "-V=25 -hspmax=5000 -kap wordmask=seg \
B=5000 W=8 cpus=1".\
Score field contains blast score.\
Used each of 104 unique snoRNAs in snorna.lib as a query.\
Any hit >=95% full length and >=90% identity is annotated as a\
"true gene".\
Any other hit with P <= 0.001 is annotated as a "related \
sequence" and interpreted as a putative pseudogene.
\
Eddy-BLAST-otherrnalib \
(non-tRNA, non-snoRNA noncoding RNAs with GenBank entries\
for the human gene.):
\
Wublastn 2.0 [15 Apr 2002]\
with options: "-kap -cpus=1 -wordmask=seg -W=8 -E=0.01 -hspmax=0\
-B=50000 -Z=3000000000". Exceptions to this are:\
\ The score field contains the blastn score.\ 41 unique miRNAs and 29 other ncRNAs were used as queries.\ Any hit >=95% full length and >=95% identity is annotated as a\ "true gene".\ Any other hit with P <= 0.001 and >= 65% identity is annotated\ as a "related sequence". There is an exception to this:\ all miRNAs consist of 16-26 bp sequences in GenBank\ and are annotated only if they are 100% full length and have\ 100% identity. The set of miRNAs used consists of Let-7 from\ Pasquinelli et al. (2000) and 40 miRNAs from Mourelatos et al. (2002),\ as mentioned in the references section below.\ \
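The hit-classification rules listed above for the snoRNA and other-ncRNA libraries can be summarized in a short Python sketch. The function name and argument layout are hypothetical; only the thresholds come from the text, and the special handling of miRNAs (100% length and identity) is not modeled.

```python
def classify_hit(frac_length, pct_identity, p_value,
                 gene_identity, related_identity=0.0,
                 gene_length=0.95, max_p=0.001):
    """Classify a BLAST hit against a query ncRNA.

    frac_length   -- fraction of the query length covered by the hit (0..1)
    pct_identity  -- percent identity of the aligned region
    p_value       -- BLAST P value of the hit
    gene_identity -- minimum percent identity for a "true gene"
    related_identity -- minimum percent identity for a "related sequence"
    Returns "true gene", "related sequence", or None (hit not annotated).
    """
    if frac_length >= gene_length and pct_identity >= gene_identity:
        return "true gene"
    if p_value <= max_p and pct_identity >= related_identity:
        return "related sequence"
    return None


def classify_snorna(frac_length, pct_identity, p_value):
    # snoRNA library: >=95% full length and >=90% identity is a true gene.
    return classify_hit(frac_length, pct_identity, p_value, gene_identity=90.0)


def classify_other_ncrna(frac_length, pct_identity, p_value):
    # Other ncRNAs: >=95% length and >=95% identity; related needs >=65% identity.
    return classify_hit(frac_length, pct_identity, p_value,
                        gene_identity=95.0, related_identity=65.0)
```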
\ These data were kindly provided by Sean Eddy at Washington University.
\ \\ Pasquinelli AE, Reinhart BJ, Slack F, Martindale MQ, Kuroda MI, Maller B,\ Hayward DC, Ball EE, Degnan B, Müller P, et al.\ \ Conservation of the sequence and temporal expression of let-7 \ heterochronic regulatory RNA. Nature.\ 2000 Nov 2;408(6808):86-9.
\\ Mourelatos Z, Dostie J, Paushkin S, Sharma A, Charroux B, Abel L,\ Rappsilber J, Mann M, Dreyfuss G.\ \ miRNPs: a novel class of ribonucleoproteins containing numerous microRNAs.\ Genes Dev. 2002 Mar 15;16(6):720-8.
\ \ genes 1 mrna $Organism mRNAs psl . $Organism mRNAs from GenBank 3 54 0 0 0 127 127 127 1 0 0\ The mRNA track shows alignments between $organism mRNAs\ in GenBank and the genome.
\\ NOTE: As of April, 2008, we are again including GenBank sequences \ that contain the following URL as part of the record:\
\ http://fulllength.invitrogen.com\\ Some transcripts from this library are suspect. Some of these entries are the\ result of alignment to pseudogenes, followed by a "correction" of\ the mRNA to match the genomic sequence. It is therefore not the sequence of\ the actual mRNA and makes it appear that the mRNA is transcribed. Invitrogen\ no longer sells the clones. This collection also contains a large amount of \ very good transcript evidence that can be quite useful. We \ have created a table, gbWarn, which lists problematic mRNAs and ESTs \ by accession. In a future release, we will be providing a better indication \ of the problematic transcripts in the genome browser display.\ \ \
\ This track follows the display conventions for \ PSL alignment tracks. In dense display mode, the items that\ are more darkly shaded indicate matches of better quality.
\\ The description page for this track has a filter that can be used to change \ the display mode, alter the color, and include/exclude a subset of items \ within the track. This may be helpful when many items are shown in the track \ display, especially when only some are relevant to the current task.
\\ To use the filter:\
\ This track may also be configured to display codon coloring, a feature that\ allows the user to quickly compare mRNAs against the genomic sequence. For more \ information about this option, click \ here.\ Several types of alignment gap may also be colored; \ for more information, click \ here.\
\ \\ GenBank $organism mRNAs were aligned against the genome using the \ blat program. When a single mRNA aligned in multiple places, \ the alignment having the highest base identity was found. \ Only alignments having a base identity level within 0.5% of\ the best and at least 96% base identity with the genomic sequence were kept.\
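The near-best filtering rule described above can be sketched in a few lines of Python. This is an illustration only, not the UCSC pipeline: the function name and the `(name, percent_identity)` tuple layout are hypothetical, and "within 0.5% of the best" is assumed to mean a difference of at most 0.5 percentage points.

```python
def filter_alignments(alignments, near_best=0.5, min_identity=96.0):
    """Keep alignments of one query that are close to the best one.

    alignments -- list of (name, percent_identity) for a single query sequence
    near_best  -- maximum allowed drop (percentage points) from the best identity
    min_identity -- minimum percent identity with the genomic sequence
    """
    if not alignments:
        return []
    best = max(identity for _, identity in alignments)
    return [
        (name, identity)
        for name, identity in alignments
        if identity >= min_identity and best - identity <= near_best
    ]
```

The same function covers the other GenBank alignment tracks by changing the two thresholds, for instance `near_best=1.0, min_identity=25.0` for the thresholds quoted for the non-$organism mRNA track.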
\ \\ The mRNA track was produced at UCSC from mRNA sequence data\ submitted to the international public sequence databases by \ scientists worldwide.
\ \\ Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J,\ Wheeler DL.\ GenBank: update. Nucleic Acids Res.\ 2004 Jan 1;32(Database issue):D23-6.
\\ Kent WJ.\ BLAT - the BLAST-like alignment tool.\ Genome Res. 2002 Apr;12(4):656-64.
\ rna 1 baseColorDefault diffCodons\ baseColorUseCds genbank\ baseColorUseSequence genbank\ indelDoubleInsert on\ indelPolyA on\ indelQueryInsert on\ showDiffBasesAllScales .\ intronEst Spliced ESTs psl est $Organism ESTs That Have Been Spliced 1 56 0 0 0 127 127 127 1 0 0\ This track shows alignments between $organism expressed sequence tags \ (ESTs) in GenBank and the genome that show signs of splicing when\ aligned against the genome. ESTs are single-read sequences, typically about \ 500 bases in length, that usually represent fragments of transcribed genes.\
\\ To be considered spliced, an EST must show \ evidence of at least one canonical intron, i.e. the genomic \ sequence between EST alignment blocks must be at least 32 bases in \ length and have GT/AG ends. By requiring splicing, the level \ of contamination in the EST databases is drastically reduced\ at the expense of eliminating many genuine 3' ESTs.\ For a display of all ESTs (including unspliced), see the \ $organism EST track.
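The splicing criterion above can be expressed as a small Python check. This is a sketch under stated assumptions: block coordinates are 0-based, half-open, sorted, and on the plus strand; the function name is hypothetical; a minus-strand alignment would need the reverse-complemented ends (CT/AC), which is not handled here.

```python
def has_canonical_intron(genome, blocks, min_intron=32):
    """Return True if any gap between consecutive alignment blocks looks
    like a canonical intron: at least min_intron bases with GT/AG ends.

    genome -- genomic sequence as a string
    blocks -- sorted list of (start, end) genomic block coordinates
    """
    for (_, prev_end), (next_start, _) in zip(blocks, blocks[1:]):
        gap = genome[prev_end:next_start].upper()
        if len(gap) >= min_intron and gap.startswith("GT") and gap.endswith("AG"):
            return True
    return False
```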
\ \\ This track follows the display conventions for \ PSL alignment tracks. In dense display mode, darker shading\ indicates a larger number of aligned ESTs.
\\ The strand information (+/-) indicates the\ direction of the match between the EST and the matching\ genomic sequence. It bears no relationship to the direction\ of transcription of the RNA with which it might be associated.
\\ The description page for this track has a filter that can be used to change \ the display mode, alter the color, and include/exclude a subset of items \ within the track. This may be helpful when many items are shown in the track \ display, especially when only some are relevant to the current task.
\\ To use the filter:\
\ This track may also be configured to display base labeling, a feature that\ allows the user to display all bases in the aligning sequence or only those \ that differ from the genomic sequence. For more information about this option,\ click \ here.\ Several types of alignment gap may also be colored; \ for more information, click \ here.\
\ \\ To make an EST, RNA is isolated from cells and reverse\ transcribed into cDNA. Typically, the cDNA is cloned\ into a plasmid vector and a read is taken from the 5'\ and/or 3' primer. For most — but not all — ESTs, the\ reverse transcription is primed by an oligo-dT, which\ hybridizes with the poly-A tail of mature mRNA. The\ reverse transcriptase may or may not make it to the 5'\ end of the mRNA, which may or may not be degraded.
\\ In general, the 3' ESTs mark the end of transcription\ reasonably well, but the 5' ESTs may end at any point\ within the transcript. Some of the newer cap-selected\ libraries cover transcription start reasonably well. Before the \ cap-selection techniques\ emerged, some projects used random rather than poly-A\ priming in an attempt to retrieve sequence distant from the\ 3' end. These projects were successful at this, but as\ a side effect also deposited sequences from unprocessed\ mRNA and perhaps even genomic sequences into the EST databases.\ Even outside of the random-primed projects, there is a\ degree of non-mRNA contamination. Because of this, a\ single unspliced EST should be viewed with considerable\ skepticism.
\\ To generate this track, $organism ESTs from GenBank were aligned \ against the genome using blat. Note that the maximum intron length\ allowed by blat is 750,000 bases, which may eliminate some ESTs with very \ long introns that might otherwise align. When a single \ EST aligned in multiple places, the alignment having the \ highest base identity was identified. Only alignments having\ a base identity level within 0.5% of the best and at least 96% base identity \ with the genomic sequence are displayed in this track.
\ \\ This track was produced at UCSC from EST sequence data\ submitted to the international public sequence databases by \ scientists worldwide.
\ \\ Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, \ Wheeler DL. \ GenBank: update. Nucleic Acids Res.\ 2004 Jan 1;32(Database issue):D23-6.
\\ Kent WJ.\ BLAT - the BLAST-like alignment tool.\ Genome Res. 2002 Apr;12(4):656-64.
\ rna 1 baseColorUseSequence genbank\ indelDoubleInsert on\ indelQueryInsert on\ intronGap 30\ maxItems 300\ showDiffBasesAllScales .\ est $Organism ESTs psl est $Organism ESTs Including Unspliced 0 57 0 0 0 127 127 127 1 0 0\ This track shows alignments between $organism expressed sequence tags \ (ESTs) in GenBank and the genome. ESTs are single-read sequences, \ typically about 500 bases in length, that usually represent fragments of \ transcribed genes.
\\ NOTE: As of April, 2007, we no longer include GenBank sequences \ that contain the following URL as part of the record:\
\ http://fulllength.invitrogen.com\\ Some of these entries are the result of alignment to pseudogenes,\ followed by "correction" of the EST to match the genomic sequence. \ It is therefore not the sequence of the actual EST and makes it appear that \ the EST is transcribed. Invitrogen no longer sells the clones.\ \ \
\ This track follows the display conventions for \ PSL alignment tracks. In dense display mode, the items that\ are more darkly shaded indicate matches of better quality.
\\ The strand information (+/-) indicates the\ direction of the match between the EST and the matching\ genomic sequence. It bears no relationship to the direction\ of transcription of the RNA with which it might be associated.
\\ The description page for this track has a filter that can be used to change \ the display mode, alter the color, and include/exclude a subset of items \ within the track. This may be helpful when many items are shown in the track \ display, especially when only some are relevant to the current task.
\\ To use the filter:\
\ This track may also be configured to display base labeling, a feature that\ allows the user to display all bases in the aligning sequence or only those \ that differ from the genomic sequence. For more information about this option,\ click \ here.\ Several types of alignment gap may also be colored; \ for more information, click \ here.\
\ \\ To make an EST, RNA is isolated from cells and reverse\ transcribed into cDNA. Typically, the cDNA is cloned\ into a plasmid vector and a read is taken from the 5'\ and/or 3' primer. For most — but not all — ESTs, the\ reverse transcription is primed by an oligo-dT, which\ hybridizes with the poly-A tail of mature mRNA. The\ reverse transcriptase may or may not make it to the 5'\ end of the mRNA, which may or may not be degraded.
\\ In general, the 3' ESTs mark the end of transcription\ reasonably well, but the 5' ESTs may end at any point\ within the transcript. Some of the newer cap-selected\ libraries cover transcription start reasonably well. Before the \ cap-selection techniques\ emerged, some projects used random rather than poly-A\ priming in an attempt to retrieve sequence distant from the\ 3' end. These projects were successful at this, but as\ a side effect also deposited sequences from unprocessed\ mRNA and perhaps even genomic sequences into the EST databases.\ Even outside of the random-primed projects, there is a\ degree of non-mRNA contamination. Because of this, a\ single unspliced EST should be viewed with considerable\ skepticism.
\\ To generate this track, $organism ESTs from GenBank were aligned \ against the genome using blat. Note that the maximum intron length\ allowed by blat is 750,000 bases, which may eliminate some ESTs with very \ long introns that might otherwise align. When a single \ EST aligned in multiple places, the alignment having the \ highest base identity was identified. Only alignments having\ a base identity level within 0.5% of the best and at least 96% base identity \ with the genomic sequence were kept.
\ \\ This track was produced at UCSC from EST sequence data\ submitted to the international public sequence databases by \ scientists worldwide.
\ \\ Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J,\ Wheeler DL.\ GenBank: update. Nucleic Acids Res.\ 2004 Jan 1;32(Database issue):D23-6.
\\ Kent WJ.\ BLAT - the BLAST-like alignment tool.\ Genome Res. 2002 Apr;12(4):656-64.
\ rna 1 baseColorUseSequence genbank\ indelDoubleInsert on\ indelQueryInsert on\ intronGap 30\ maxItems 300\ xenoMrna Other mRNAs psl xeno Non-$Organism mRNAs from GenBank 0 63 0 0 0 127 127 127 1 0 0\ This track displays translated blat alignments of vertebrate and\ invertebrate mRNA in \ GenBank from organisms other than $organism.\ \
\ This track follows the display conventions for \ PSL alignment tracks. In dense display mode, the items that\ are more darkly shaded indicate matches of better quality.
\\ The strand information (+/-) for this track is in two parts. The\ first + indicates the orientation of the query sequence whose\ translated protein produced the match (here always 5' to 3', hence +).\ The second + or - indicates the orientation of the matching \ translated genomic sequence. Because the two orientations of a DNA \ sequence give different predicted protein sequences, there are four \ combinations. ++ is not the same as --, nor is +- the same as -+.
\\ The description page for this track has a filter that can be used to change \ the display mode, alter the color, and include/exclude a subset of items \ within the track. This may be helpful when many items are shown in the track \ display, especially when only some are relevant to the current task.
\\ To use the filter:\
\ This track may also be configured to display codon coloring, a feature that\ allows the user to quickly compare mRNAs against the genomic sequence. For more \ information about this option, click \ here.\ Several types of alignment gap may also be colored; \ for more information, click \ here.\
\ \\ The mRNAs were aligned against the $organism genome using translated blat. \ When a single mRNA aligned in multiple places, the alignment having the \ highest base identity was found. Only those alignments having a base \ identity level within 1% of the best and at least 25% base identity with the \ genomic sequence were kept.
\ \\ The mRNA track was produced at UCSC from mRNA sequence data\ submitted to the international public sequence databases by \ scientists worldwide.
\ \\ Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, \ Wheeler DL. \ GenBank: update. Nucleic Acids Res.\ 2004 Jan 1;32(Database issue):D23-6.
\\ Kent WJ.\ BLAT - the BLAST-like alignment tool.\ Genome Res. 2002 Apr;12(4):656-64.
\ rna 1 baseColorUseCds genbank\ baseColorUseSequence genbank\ indelDoubleInsert on\ indelQueryInsert on\ showDiffBasesAllScales .\ xenoEst Other ESTs psl xeno Non-$Organism ESTs from GenBank 0 65 0 0 0 127 127 127 1 0 0 http://www.ncbi.nlm.nih.gov/htbin-post/Entrez/query?form=4&db=n&term=$$\ This track displays translated blat alignments of expressed sequence tags \ (ESTs) in GenBank from organisms other than $organism.\ ESTs are single-read sequences, typically about 500 bases in length, that \ usually represent fragments of transcribed genes.
\ \\ This track follows the display conventions for \ PSL alignment tracks. In dense display mode, the items that\ are more darkly shaded indicate matches of better quality.
\\ The strand information (+/-) for this track is in two parts. The\ first + or - indicates the orientation of the query sequence whose\ translated protein produced the match. The second + or - indicates the\ orientation of the matching translated genomic sequence. Because the two\ orientations of a DNA sequence give different predicted protein sequences,\ there are four combinations. ++ is not the same as --, nor is +- the same\ as -+.
\\ The description page for this track has a filter that can be used to change \ the display mode, alter the color, and include/exclude a subset of items \ within the track. This may be helpful when many items are shown in the track \ display, especially when only some are relevant to the current task.
\\ To use the filter:\
\ This track may also be configured to display base labeling, a feature that\ allows the user to display all bases in the aligning sequence or only those \ that differ from the genomic sequence. For more information about this option,\ click \ here.\ Several types of alignment gap may also be colored; \ for more information, click \ here.\
\ \\ To generate this track, the ESTs were aligned against the genome using \ blat. When a single EST aligned in multiple places, the \ alignment having the highest base identity was found. Only alignments \ having a base identity level within 0.5% of the best and at least 96% base \ identity with the genomic sequence were kept.
\ \\ This track was produced at UCSC from EST sequence data submitted to the \ international public sequence databases by scientists worldwide.
\ \\ Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, \ Wheeler DL. \ GenBank: update. Nucleic Acids Res.\ 2004 Jan 1;32(Database issue):D23-6.
\\ Kent WJ.\ BLAT - the BLAST-like alignment tool.\ Genome Res. 2002 Apr;12(4):656-64.
\ \ rna 1 baseColorUseSequence genbank\ indelDoubleInsert on\ indelQueryInsert on\ uniGene_2 UniGene bed 12 . UniGene Hs 160 Alignments and SAGEmap Info 0 69 0 0 0 127 127 127 1 0 0\ Serial analysis of gene expression (SAGE)\ is a quantitative measurement of gene expression. Data are presented for every\ cluster contained in the browser window and the selected cluster name is \ highlighted in the table. All data are from the repository at the \ SageMap project \ built on UniGene version Hs 160. Click on a UniGene cluster name on the track\ details page to display SageMap's page for that cluster. Please note that data \ are not available for every cluster. There is no data available for clusters\ that lie entirely within the bounds of larger clusters.
\ \\ SAGE counts are produced by sequencing small "tags" of DNA believed \ to be associated with a gene. These tags were generated by attaching \ poly-A RNA to oligo-dT beads. After synthesis of double-stranded cDNA, \ transcripts were cleaved by an anchoring enzyme (usually NlaIII). Then, small \ tags were produced by ligation with a linker containing a type IIS restriction\ enzyme site and cleavage with the tagging enzyme (usually BsmFI). The \ tags were concatenated together and sequenced. The frequency of each \ tag was counted and used to infer expression level of transcripts that could\ be matched to that tag.
\ \\ All SAGE data presented here were mapped to UniGene transcripts by the \ SageMap project at NCBI.
\ \\ This track shows the boundaries of genes and the direction of\ transcription as deduced from clustering spliced ESTs and mRNAs\ against the genome. When many spliced variants of the same gene exist, \ this track shows the variant that spans the greatest distance in the \ genome.
\ \\ ESTs and mRNAs from \ GenBank were aligned against the genome using blat.\ Alignments with less than 97.5% base identity within the aligning blocks \ were filtered out. When multiple alignments occurred, only those\ alignments with a percentage identity within 0.2% of the\ best alignment were kept. The following alignments were also discarded: \ ESTs that aligned without any introns, blocks smaller than 10 bases, and \ blocks smaller than 130 bases that were not located next to an intron. \ The orientations of the ESTs and mRNAs were deduced from the GT/AG splice \ sites at the introns; ESTs and mRNAs with overlapping blocks\ on the same strand were merged into clusters. Only the\ extent and orientation of the clusters are shown in this track.
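The final merging step, in which overlapping alignments on the same strand become one cluster, might look like the following Python sketch. It is a simplification and not the UCSC code: each alignment is reduced to a single `(strand, start, end)` span, whereas the method above compares individual blocks, and the function name is hypothetical.

```python
def cluster_bounds(alignments):
    """Merge overlapping same-strand spans into clusters.

    alignments -- iterable of (strand, start, end), 0-based half-open
    Returns a list of (strand, start, end) giving each cluster's extent.
    """
    alignments = list(alignments)
    clusters = []
    for strand in ("+", "-"):
        spans = sorted((s, e) for st, s, e in alignments if st == strand)
        current = None
        for start, end in spans:
            if current is not None and start < current[1]:
                # Overlaps the open cluster: extend it.
                current[1] = max(current[1], end)
            else:
                if current is not None:
                    clusters.append((strand, current[0], current[1]))
                current = [start, end]
        if current is not None:
            clusters.append((strand, current[0], current[1]))
    return clusters
```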
\\ Scores for individual gene boundaries were assigned based on the number of \ cDNA alignments used:\
\ This track, which was originally developed by Jim Kent,\ was generated at UCSC and uses data submitted to GenBank by \ scientists worldwide.
\ \\ Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. \ GenBank: update. Nucleic Acids Res. \ 2004 Jan 1;32(Database issue):D23-6.
\\ Kent WJ.\ BLAT - the BLAST-like alignment tool.\ Genome Res. 2002 Apr;12(4):656-64.
\ rna 1 nci60 NCI60 expRatio Microarray Experiments for NCI 60 Cell Lines 0 84 0 0 0 127 127 127 0 0 0 \Expression data from "Systematic variation in gene expression \ patterns in human cancer cell lines" \ [pubmed], Ross et al., Nature Genetics 2000 Mar; 24(3):227-35. \ cDNA microarrays were\ used to explore the variation in expression of approximately 8,000\ unique genes among the 60 cell lines used in the National Cancer\ Institute's screen for anti-cancer drugs. The authors have provided a\ web supplement \ where more data and experimental description can be obtained. cDNA\ probes were placed on the draft human genome using GenBank sequences\ referenced by the IMAGE clone IDs. \ \
The data are shown in a tabular format in which each column of\ colored boxes represents the variation in transcript levels for a\ given cDNA across all of the array experiments, and each row\ represents the measured transcript levels for all genes in a single\ sample. The variation in transcript levels for each gene is\ represented by a color scale, in which red indicates an increase in\ transcript levels, and green indicates a decrease in transcript\ levels, relative to the reference sample. The saturation of the color\ corresponds to the magnitude of transcript variation. A black color\ indicates an undetectable change in expression, while a gray box\ indicates missing data.\ \
Combine Arrays: This option is only valid when the track is \ displayed in full. It determines how the experiments are displayed. The\ options are:\
\ This track shows expression data from GNF (The Genomics Institute of the Novartis Research \ Foundation) using Affymetrix GeneChips. The chip types, chip IDs or tissue \ averages associated with experiments can be displayed by selecting the \ appropriate option from the Experiment Display menu on the track \ description page. For more information, see the Track Configuration section.\
\ \\ For detailed information about the experiments, see Su et al. 2002 \ in the References section below. Alignments displayed on the track correspond \ to the target sequences used by Affymetrix to choose probes.
\\ In dense display mode, the track color denotes the average signal over all\ experiments on a log base 2 scale. Lighter colors correspond to lower \ signals and darker colors correspond to higher signals. In full display\ mode, the color of each item represents the log base 2 ratio of the signal \ of that particular experiment to the median signal of all experiments for \ that probe.
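The full-mode coloring value described above, the log base 2 ratio of one experiment's signal to the median over all experiments for a probe, can be computed as in this minimal Python sketch. The function name and the example signal values are hypothetical, and signals are assumed to be positive.

```python
import math
from statistics import median


def log2_ratios(signals):
    """Return log2(signal / median signal) for each experiment of one probe.

    signals -- list of positive signal values, one per experiment
    """
    med = median(signals)
    return [math.log2(s / med) for s in signals]
```

A value of 0 means the experiment matches the probe's median signal; +1 and -1 correspond to a two-fold increase and decrease, respectively.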
\\ More information about individual probes and probe sets is available on the\ Affymetrix website.
\ \\ This track may be configured to change the display mode and colors or \ vary the type of experiment information shown. The configuration controls are\ located at the top of the track description page, which is accessed via \ the small button to the left of the track's graphical display or the link \ on the track's control menu. \
\ When you have finished making changes, click the Submit button to\ commit your changes and return to the Genome Browser tracks display.
\ \Thanks to GNF for providing these data.
\ \\ Su, A.I., Cooke, M.P., Ching, K.A., Hakak, Y., Walker, J.R., Wiltshire, T., \ Orth, A.P., Vega, R.G., Sapinoso, L.M., Moqrich, A. et al. \ Large-scale analysis of the human and mouse transcriptomes. \ Proc Natl Acad Sci USA 99(7), 4465-70 (2002).
\ regulation 1 expDrawExons on\ expScale 3.0\ expStep 0.5\ expTable affyExps\ groupings affyRatioGroups\ affyU95 Affy U95 psl . Alignments of Affymetrix Consensus/Exemplars from HG-U95 0 89.2 0 0 0 127 127 127 0 0 0\ This track shows the location of the consensus and exemplar sequences used \ for the selection of probes on the Affymetrix HG-U95Av2 chip. For this chip, \ probes are predominantly designed from consensus sequences.
\ \\ Consensus and exemplar sequences were downloaded from the\ Affymetrix Product Support\ and mapped to the genome using blat followed by pslReps with the \ parameters:
-minCover=0.3 -minAli=0.95 -nearTop=0.005\\ \
\ Thanks to Affymetrix \ for the data underlying this track.
\ regulation 1 cpgIsland CpG Islands bed 4 + CpG Islands (Islands < 300 Bases are Light Green) 0 90 0 100 0 128 228 128 0 0 0\ CpG islands are associated with genes, particularly housekeeping\ genes, in vertebrates. CpG islands are typically common near\ transcription start sites, and may be associated with promoter\ regions. Normally a C (cytosine) base followed immediately by a \ G (guanine) base (a CpG) is rare in\ vertebrate DNA because the Cs in such an arrangement tend to be\ methylated. This methylation helps distinguish the newly synthesized\ DNA strand from the parent strand, which aids in the final stages of\ DNA proofreading after duplication. However, over evolutionary time\ methylated Cs tend to turn into Ts because of spontaneous\ deamination. The result is that CpGs are relatively rare unless\ there is selective pressure to keep them or a region is not methylated\ for some reason, perhaps having to do with the regulation of gene\ expression. CpG islands are regions where CpGs are present at\ significantly higher levels than is typical for the genome as a whole.\
\ \\ CpG islands are predicted by searching the sequence one base at a\ time, scoring each dinucleotide (+17 for CG and -1 for others) and\ identifying maximally scoring segments. Each segment is then\ evaluated for the following criteria:\
\ The CpG count is the number of CG dinucleotides in the island. \ The Percentage CpG is the ratio of CpG nucleotide bases\ (twice the CpG count) to the length.
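The dinucleotide scoring and the two statistics defined above can be sketched in Python as follows. This is an illustration, not the program used to build the track: `best_segment` returns only a single highest-scoring segment rather than all maximal segments, applies no acceptance criteria, and both function names are hypothetical.

```python
def cpg_stats(seq):
    """Return (CpG count, percentage CpG) for a candidate island.

    Percentage CpG = 100 * (2 * CpG count) / length, as defined in the text.
    """
    seq = seq.upper()
    count = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")
    pct = 100.0 * 2 * count / len(seq) if seq else 0.0
    return count, pct


def best_segment(seq, cg_score=17, other_score=-1):
    """Scan one base at a time, scoring each dinucleotide (+17 for CG,
    -1 otherwise), and return (start, end, score) of one highest-scoring
    segment. Coordinates are 0-based, half-open."""
    seq = seq.upper()
    best = (0, 0, 0)
    run_start, run_score = 0, 0
    for i in range(len(seq) - 1):
        if run_score <= 0:
            # A non-positive running score cannot help: restart here.
            run_start, run_score = i, 0
        run_score += cg_score if seq[i:i + 2] == "CG" else other_score
        if run_score > best[2]:
            best = (run_start, i + 2, run_score)
    return best
```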
\ \\ This track was generated using a modification of a program developed by \ G. Miklem and L. Hillier.
\ \ regulation 1 firstEF FirstEF bed 6 . FirstEF: First-Exon and Promoter Prediction 0 90.1 0 0 0 127 127 127 1 0 0 http://rulai.cshl.org/tools/FirstEF/Readme/README.html This track shows predictions from\ the FirstEF\ (First Exon Finder) program.
\ \
Three types of predictions are displayed: exon, promoter and CpG window. \ If two consecutive predictions are separated by less than 1000 bp, \ FirstEF treats them as one cluster of alternative first exons that may \ belong to the same gene. The cluster number is displayed in the parentheses \ of each item. For example, "exon(405-)" \ represents the exon prediction in cluster number 405 on the minus strand. \ The exon, promoter and CpG-window are interconnected by this cluster number. \ Alternative predictions within the same cluster are denoted by "#N" \ where "N" is the serial number of an alternative prediction in the \ cluster.
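The clustering rule above (consecutive predictions closer than 1000 bp share a cluster) can be sketched in Python. This is not FirstEF's own code: the function name is hypothetical, predictions are assumed to be `(start, end)` pairs on one strand, and the separation is assumed to be measured from the end of the current cluster to the start of the next prediction.

```python
def cluster_predictions(predictions, max_gap=1000):
    """Group predictions into clusters of alternative first exons.

    predictions -- list of (start, end) coordinates on a single strand
    max_gap     -- predictions separated by less than this join a cluster
    Returns a list of clusters, each a list of (start, end) pairs.
    """
    clusters = []
    cluster_end = None
    for start, end in sorted(predictions):
        if clusters and start - cluster_end < max_gap:
            clusters[-1].append((start, end))
            cluster_end = max(cluster_end, end)
        else:
            clusters.append([(start, end)])
            cluster_end = end
    return clusters
```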
\ \Each predicted exon is either CpG-related or non-CpG-related, based on\ a score of the frequency of CpG dinucleotides.\ An exon is classified as CpG-related if the CpG score is greater \ than a threshold value, and non-CpG-related if less than the threshold. If an \ exon is CpG-related, \ its associated CpG-window is displayed. The browser displays features with higher\ scores in darker shades of gray/black.
\ \FirstEF is a 5' terminal exon and promoter\ prediction program. It consists of different discriminant functions structured\ as a decision tree. The probabilistic models are optimized to find potential\ first donor sites and CpG-related and non-CpG-related promoter regions based on\ discriminant analysis. For every potential first donor site (GT) and an upstream\ promoter region, FirstEF decides whether or not the intermediate region can be\ a potential first exon, based on a set of quadratic discriminant functions.\ FirstEF calculates the a posteriori probabilities of exon, donor, and\ promoter for a given GT and an upstream window of length 570 bp.
\ \For a description of the FirstEF program and the underlying classification \ models, refer to Davuluri et al., 2001. \ \
The predictions for this track are produced by Ramana V.\ Davuluri of Ohio State University and Ivo Grosse and\ Michael Q. Zhang of Cold Spring Harbor Lab.\ \
\ Davuluri RV, Grosse I, Zhang MQ.\ Computational identification of promoters and first exons in the \ human genome. \ Nat Genet. 2001 Dec;29(4):412-7.
\ regulation 1 scoreMax 1000\ scoreMin 500\ tfbsCons TFBS Conserved bed 6 + HMR Conserved Transcription Factor Binding Sites 0 94 0 0 0 127 127 127 1 0 0 http://www.gene-regulation.com/cgi-bin/pub/databases/transfac/getTF.cgi?AC=$$\ This track contains the location and score of transcription factor\ binding sites conserved in the human/mouse/rat alignment. A binding\ site is considered to be conserved across the alignment if its score\ meets the threshold score for that binding site in all 3 species.\ The score and threshold are computed with the Transfac Matrix Database (v4.0) created by\ Biobase. \ The data are purely computational, and as such not all binding sites\ listed here are biologically functional binding sites.
\\ In the graphical display, each box represents one conserved tfbs. The\ darker the box, the better the match of the binding site. Clicking on\ a box brings up detailed information on the binding site, namely its\ Transfac I.D., a link to its Transfac Matrix (free registration with Transfac\ required), its location in the human genome (chromosome, start, end,\ and strand), its length in bases, and its score.
\\ All binding factors that are known to bind to the particular binding site\ are listed along with their species, SwissProt ID, and a link to that\ factor's page on the UCSC Protein Browser if such an entry exists.
\ \\ A binding site is considered to be conserved across the alignment if\ its score meets the threshold score for that binding site at exactly the\ same position in the alignment in all 3 species. If there is no orthologous \ sequence in the mouse or the rat, no prediction is made.\ The following is a brief discussion of the scoring and threshold system\ used for these data.
\\ The Transfac Matrix Database contains position-weight matrices for \ 336 transcription factor binding sites, as characterized through\ experimental results in the scientific literature. A typical (in this\ case fictitious) matrix will look something like:
\\ A C G T\ 01 15 15 15 15 N\ 02 20 10 15 15 N\ 03 0 0 60 0 G\ 04 60 0 0 0 A\ 05 0 0 0 60 T\\\ The above matrix specifies the results of 60 experiments (the sum of each row).\ In the experiments, the first position of the binding site\ was A 15 times, C 15 times, G 15 times, and T 15 times (and so on for\ each position). The consensus sequence of the above binding site as\ characterized by the matrix is NNGAT. The format of the consensus sequence\ is the deduced consensus in the IUPAC 15-letter code.\
\ The score of a segment of DNA is computed in relation to a matrix as \ follows:\
\\ score = SUM over each position in the matrix of\ matrix[position][nucleotide_in_segment_at_this_position].\\ \ For example, the sequence "CCGAT" would have a score of\ 15 + 10 + 60 + 60 + 60 = 205 for the above matrix.\ \ A score in relation to a matrix of length n can be computed for every \ DNA segment of length n.\
\ The threshold for a binding site is computed from its Transfac Matrix\ Database entry as follows:\ \
\\ St = Smin + ((Smax - Smin) * C)\ \ where St is the target threshold score\ Smin is the minimum possible score\ Smax is the maximum possible score\ C is the cutoff value used by the scoring function\\ \ For example, the above matrix has a minimum score of \ 15 + 10 + 0 + 0 + 0 = 25 and a maximum score of 15 + 20 + 60 + 60 + 60 = 215.\ Using a cutoff value of 0.85 (the value used for this track), the threshold \ for the above matrix is:
\\ 25 + ((215 - 25) * 0.85) = 186.5\\ \ As such, the sequence "CCGAT" from above would be recorded as a hit with a \ cutoff value of 0.85, since its score (205) exceeds the threshold for this \ particular binding site (186.5).\
\ The final score reported is the highest cutoff value at which the position would \ still have been recorded as a hit (multiplied by 1000). The final score of the \ above example is therefore:\ \
\\ ((Score - Smin) / (Smax - Smin)) * 1000 = ((205 - 25) / (215 - 25)) * 1000 = 0.947 * 1000 = 947\\ Therefore, the final score for the sequence "CCGAT" would be 947.\ Although the scores of all three species in the alignment must exceed the\ threshold, the only final score that is reported for this track is the \ final score of the binding site in the human sequence.
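The scoring, threshold, and final-score arithmetic worked through above can be reproduced with a short Python sketch. It is not the tfloc implementation: the function names and the dictionary layout of the matrix are hypothetical, and truncating the final score to an integer is an assumption (rounding gives the same result for this example).

```python
# The fictitious 5-position matrix from the text, one dict per position.
MATRIX = [
    {"A": 15, "C": 15, "G": 15, "T": 15},
    {"A": 20, "C": 10, "G": 15, "T": 15},
    {"A": 0, "C": 0, "G": 60, "T": 0},
    {"A": 60, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 60},
]

def score(matrix, segment):
    """Sum matrix[position][nucleotide] over a segment of matching length."""
    if len(segment) != len(matrix):
        raise ValueError("segment length must equal matrix length")
    return sum(row[base] for row, base in zip(matrix, segment))

def score_bounds(matrix):
    """Return (Smin, Smax), the lowest and highest possible scores."""
    smin = sum(min(row.values()) for row in matrix)
    smax = sum(max(row.values()) for row in matrix)
    return smin, smax

def threshold(matrix, cutoff=0.85):
    """St = Smin + ((Smax - Smin) * C)."""
    smin, smax = score_bounds(matrix)
    return smin + (smax - smin) * cutoff

def final_score(matrix, segment):
    """((Score - Smin) / (Smax - Smin)) * 1000, truncated (an assumption)."""
    smin, smax = score_bounds(matrix)
    return int((score(matrix, segment) - smin) / (smax - smin) * 1000)

def is_conserved(matrix, aligned_segments, cutoff=0.85):
    """True if every species' segment at this position meets the threshold."""
    limit = threshold(matrix, cutoff)
    return all(score(matrix, seg) >= limit for seg in aligned_segments)

if __name__ == "__main__":
    print(score(MATRIX, "CCGAT"))        # score of the example segment
    print(threshold(MATRIX))             # threshold at cutoff 0.85
    print(final_score(MATRIX, "CCGAT"))  # reported final score
```

For "CCGAT" the sketch yields a score of 205, a threshold of 186.5, and a final score of 947, matching the worked example.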
\ It should be noted that the positions of many of these conserved binding\ sites coincide with known exons and other highly conserved regions.\ Regions such as these are more likely to contain false positive matches,\ as the high sequence identity across the alignment increases the likelihood of\ a short motif that looks like a binding site to be conserved. Conversely,\ matches found in introns and intergenic regions are more likely to be real\ binding sites, since these regions are mostly poorly conserved.\
\\ These data were obtained by running the program tfloc (Transcription\ Factor binding site LOCater) on multiz humor25 alignments of the Feb. 2003 mouse\ draft assembly (mm3) and the Jan. 2003 rat assembly (rn2) to the Apr. 2003 human \ genome assembly (hg15). Tfloc was run on the subset of the Transfac Matrix\ Database containing human, mouse, and rat related binding sites (164 total).\ Transcription factor information was culled from the Transfac Factor database.
\ \\
\ These data were generated using the Transfac Matrix and Factor databases created by\ Biobase.\
\ The tfloc program was developed at The Pennsylvania State University \ by Matt Weirauch.
\\ This track was created by Matt Weirauch and Brian Raney at The\ University of California at Santa Cruz.
\ regulation 1 scoreMax 1000\ scoreMin 830\ urlLabel Transfac matrix link:\ affyTxnPhase2 Affy Txn Phase2 wig 0 1000 Affymetrix Transcriptome Project Phase 2 0 98.9 0 0 0 127 127 127 0 0 0\ This track displays transcriptome data from tiling GeneChips produced\ by Affymetrix. For the ten chromosomes 6, 7, 13, 14,\ 19, 20, 21, 22, X, and Y, more than 74 million probes were tiled every\ 5 bp in non-repeat-masked areas and hybridized to mRNA from 11\ different cell lines (some cell lines were female and contain no data\ for chrY). For HepG2, some samples were depleted\ of polyA transcripts rather than enriched. For experimental details\ and results, see Cheng et al. in the References section\ below.
\\ This annotation follows the display conventions for composite \ tracks. The subtracks within this annotation may be configured in a variety of \ ways to highlight different aspects of the displayed data. The graphical \ configuration options are shown at the top of the track description page, \ followed by a list of subtracks. For more information about the \ graphical configuration options, click the \ Graph configuration \ help link. To display only selected subtracks, uncheck the boxes next to \ the tracks you wish to hide.
\\ Each subtrack is colored blue in areas that are thought to be transcribed\ at a statistically significant level as described in the accompanying\ transfrags (transcribed fragments) track. Transfrags that have a\ significant blat hit elsewhere in the genome are displayed in a\ lighter shade of blue, and transfrags that overlap putative\ pseudogenes are colored an even lighter shade of blue. All other\ regions of the track are colored brown. While the raw data are based\ on perfect match minus mismatch (PM - MM) probe values and may contain\ negative values, the track has a minimum value of zero for visualization\ purposes.
\ \\ For each data point, probes within 30 bp on either side were used to\ improve the estimate of expression level for a particular probe. This\ helped to smooth the data and produce a more robust estimate of the\ transcription level at a particular genomic location. The following\ analysis method was used:\
\ Data generation and analysis were performed by the transcriptome group at \ Affymetrix:\ Bekiranov, S., Brubaker, S., Cheng, J., Dike, S., Drenkow, J., Ghosh, S., \ Gingeras, T., Helt, G., Kampa, D., Kapranov, P., Long, J., Madhavan, G., \ Manak, J., Patel, S., Piccolboni, A., Sementchenko, V. and Tammana, H.
\ \Questions or comments about this annotation? Email Chuck Sugnet.\ \
\ Cheng et al. \ Transcriptional Maps of 10 Human Chromosomes at 5-Nucleotide \ Resolution. Science 308(5725), 1149-54 (2005).
\ regulation 0 autoScaleDefault Off\ centerLabelsDense on\ compositeTrack on smart\ defaultViewLimits 0:150\ maxHeightPixels 100:30:10\ A375CytosolicPolyAPlusTxn A375 Txn wig A375 Cytosolic polyA+, Affy Transcriptome 0 99.01 175 150 128 255 128 0 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 0 autoScaleDefault Off\ centerLabelsDense on\ graphTypeDefault Bar\ gridDefault OFF\ noInherit on\ subTrack affyTxnPhase2\ wigColorBy A375CytosolicPolyAPlusTnFg\ A375CytosolicPolyAPlusTnFg A375 TnFg bed 4 + A375 Cytosolic polyA+, Affy Transfrags 3 99.02 35 35 175 160 160 188 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 1 centerLabelsDense on\ noInherit on\ subTrack affyTxnPhase2\ FHs738LuCytosolicPolyAPlusTxn FHs738Lu Txn wig FHs738Lu Cytosolic polyA+, Affy Transcriptome 0 99.03 175 150 128 255 128 0 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 0 autoScaleDefault Off\ centerLabelsDense on\ graphTypeDefault Bar\ gridDefault OFF\ noInherit on\ subTrack affyTxnPhase2\ wigColorBy FHs738LuCytosolicPolyAPlusTnFg\ FHs738LuCytosolicPolyAPlusTnFg FHs738Lu TnFg bed 4 + FHs738Lu Cytosolic polyA+, Affy Transfrags 3 99.04 35 35 175 160 160 188 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 1 centerLabelsDense on\ noInherit on\ subTrack affyTxnPhase2\ HepG2CytosolicPolyAPlusTxn HepG2+ Cyto Txn wig HepG2 Cytosolic polyA+, Affy Transcriptome 0 99.05 175 150 128 255 128 0 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 0 autoScaleDefault Off\ centerLabelsDense on\ graphTypeDefault Bar\ gridDefault OFF\ noInherit on\ subTrack affyTxnPhase2\ wigColorBy HepG2CytosolicPolyAPlusTnFg\ HepG2CytosolicPolyAPlusTnFg HepG2+ Cyto TnFg bed 4 + HepG2 Cytosolic polyA+, Affy Transfrags 3 99.06 35 35 175 160 160 188 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 1 centerLabelsDense on\ noInherit on\ subTrack affyTxnPhase2\ 
HepG2NuclearPolyAPlusTxn HepG2+ Nuc Txn wig HepG2 Nuclear polyA+, Affy Transcriptome 0 99.061 175 150 128 255 128 0 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 0 autoScaleDefault Off\ centerLabelsDense on\ graphTypeDefault Bar\ gridDefault OFF\ noInherit on\ subTrack affyTxnPhase2\ wigColorBy HepG2NuclearPolyAPlusTnFg\ HepG2NuclearPolyAPlusTnFg HepG2+ Nuc TnFg bed 4 + HepG2 Nuclear polyA+, Affy Transfrags 3 99.062 35 35 175 160 160 188 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 1 centerLabelsDense on\ noInherit on\ subTrack affyTxnPhase2\ HepG2CytosolicPolyAMinusTxn HepG2- Cyto Txn wig HepG2 Cytosolic polyA-, Affy Transcriptome 0 99.063 175 150 128 255 128 0 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 0 autoScaleDefault Off\ centerLabelsDense on\ graphTypeDefault Bar\ gridDefault OFF\ noInherit on\ subTrack affyTxnPhase2\ wigColorBy HepG2CytosolicPolyAMinusTnFg\ HepG2CytosolicPolyAMinusTnFg HepG2- Cyto TnFg bed 4 + HepG2 Cytosolic polyA-, Affy Transfrags 3 99.064 35 35 175 160 160 188 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 1 centerLabelsDense on\ noInherit on\ subTrack affyTxnPhase2\ HepG2NuclearPolyAMinusTxn HepG2- Nuc Txn wig HepG2 Nuclear polyA-, Affy Transcriptome 0 99.065 175 150 128 255 128 0 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 0 autoScaleDefault Off\ centerLabelsDense on\ graphTypeDefault Bar\ gridDefault OFF\ noInherit on\ subTrack affyTxnPhase2\ wigColorBy HepG2NuclearPolyAMinusTnFg\ HepG2NuclearPolyAMinusTnFg HepG2- Nuc TnFg bed 4 + HepG2 Nuclear polyA-, Affy Transfrags 3 99.067 35 35 175 160 160 188 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 1 centerLabelsDense on\ noInherit on\ subTrack affyTxnPhase2\ JurkatCytosolicPolyAPlusTxn Jurkat Txn wig Jurkat Cytosolic polyA+, Affy Transcriptome 0 99.07 175 150 128 255 128 0 0 0 10 
chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 0 autoScaleDefault Off\ centerLabelsDense on\ graphTypeDefault Bar\ gridDefault OFF\ noInherit on\ subTrack affyTxnPhase2\ wigColorBy JurkatCytosolicPolyAPlusTnFg\ JurkatCytosolicPolyAPlusTnFg Jurkat TnFg bed 4 + Jurkat Cytosolic polyA+, Affy Transfrags 3 99.08 35 35 175 160 160 188 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 1 centerLabelsDense on\ noInherit on\ subTrack affyTxnPhase2\ NCCITCytosolicPolyAPlusTxn NCCIT Txn wig NCCIT Cytosolic polyA+, Affy Transcriptome 0 99.09 175 150 128 255 128 0 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 0 autoScaleDefault Off\ centerLabelsDense on\ graphTypeDefault Bar\ gridDefault OFF\ noInherit on\ subTrack affyTxnPhase2\ wigColorBy NCCITCytosolicPolyAPlusTnFg\ NCCITCytosolicPolyAPlusTnFg NCCIT TnFg bed 4 + NCCIT Cytosolic polyA+, Affy Transfrags 3 99.1 35 35 175 160 160 188 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 1 centerLabelsDense on\ noInherit on\ subTrack affyTxnPhase2\ PC3CytosolicPolyAPlusTxn PC3 Txn wig PC3 Cytosolic polyA+, Affy Transcriptome 0 99.11 175 150 128 255 128 0 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 0 autoScaleDefault Off\ centerLabelsDense on\ graphTypeDefault Bar\ gridDefault OFF\ noInherit on\ subTrack affyTxnPhase2\ wigColorBy PC3CytosolicPolyAPlusTnFg\ PC3CytosolicPolyAPlusTnFg PC3 TnFg bed 4 + PC3 Cytosolic polyA+, Affy Transfrags 3 99.12 35 35 175 160 160 188 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 1 centerLabelsDense on\ noInherit on\ subTrack affyTxnPhase2\ SKNASCytosolicPolyAPlusTxn SK-N-AS Txn wig SK-N-AS Cytosolic polyA+, Affy Transcriptome 0 99.13 175 150 128 255 128 0 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 0 autoScaleDefault Off\ centerLabelsDense on\ graphTypeDefault Bar\ gridDefault OFF\ noInherit on\ subTrack 
affyTxnPhase2\ wigColorBy SKNASCytosolicPolyAPlusTnFg\ SKNASCytosolicPolyAPlusTnFg SK-N-AS TnFg bed 4 + SK-N-AS Cytosolic polyA+, Affy Transfrags 3 99.14 35 35 175 160 160 188 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 1 centerLabelsDense on\ noInherit on\ subTrack affyTxnPhase2\ U87CytosolicPolyAPlusTxn U87 Txn wig U87 Cytosolic polyA+, Affy Transcriptome 0 99.15 175 150 128 255 128 0 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 0 autoScaleDefault Off\ centerLabelsDense on\ graphTypeDefault Bar\ gridDefault OFF\ maxHeightPixels 128:36:16\ noInherit on\ subTrack affyTxnPhase2\ wigColorBy U87CytosolicPolyAPlusTnFg\ U87CytosolicPolyAPlusTnFg U87 TnFg bed 4 + U87 Cytosolic polyA+, Affy Transfrags 3 99.16 35 35 175 160 160 188 0 0 10 chr6,chr7,chr13,chr14,chr19,chr20,chr21,chr22,chrX,chrY, regulation 1 centerLabelsDense on\ noInherit on\ subTrack affyTxnPhase2\ snpMap SNPs bed 4 . Simple Nucleotide Polymorphisms (SNPs) 0 100 0 0 0 127 127 127 0 0 0\ This track consolidates all the Simple Nucleotide Polymorphisms\ into a single track. It is the union of the Overlap SNPs,\ Random SNPs, Affymetrix 120K SNP, and Affymetrix 10K SNP tracks that\ previously existed in the Genome Browser.
\ \ \ Variant Sources\\ The SNPs in this track include all known polymorphisms that\ can be mapped against the current assembly. These include known point\ mutations (Single Nucleotide Polymorphisms), insertions, deletions,\ and segmental mutations from the current build of \ dbSnp, which is \ shown in the Genome Browser release log.\
\\ There are three major cases that are not mapped and/or annotated:\
The heuristics for the non-SNP variations (i.e. named elements and\ STRs) are quite conservative; therefore, some of these are probably lost. This\ approach was chosen to avoid false annotation of variation in\ inappropriate locations.
\ \\ Positional information can be found in the annotations section\ of the Genome Browser \ downloads page, \ which is organized by species and assembly. Non-positional information\ can be found in the \ shared\ data section of the same page, where it is split into tables by\ organism: \ dbSnpRsHg for Human, \ dbSnpRsMm for Mouse, and \ dbSnpRsRn for Rat.\ \
\ Thanks to the SNP\ Consortium and NIH for providing the public data, which are available from \ dbSnp at \ NCBI.
\\ Thanks to Perlegen Sciences, \ Inc. for providing additional SNPs from their database.\ Additional information about the Perlegen SNP discovery process can be\ found in Patil, N. et al. (2001) \ \ Blocks of Limited Haplotype Diversity Revealed by High-Resolution\ Scanning of Human Chromosome 21. Science 294:1719-1723.\
\\ Thanks to Affymetrix, Inc. for developing the genotyping\ arrays. For more details on this genotyping assay, please see the\ supplemental information on the \ Affymetrix 10K SNP and \ Affymetrix 120K SNP products. Additional information, \ including genotyping data, is available from the details pages for the \ Affymetrix 120K SNP and Affymetrix 10K SNP tracks.
\ \Please see the Terms and Conditions page on the Affymetrix website for \ restrictions on the use of their data. \ \ \ \ varRep 1 perlegen Perlegen Haplotypes bed 12 . Perlegen Common High-Resolution Haplotype Blocks 0 145.21 0 0 0 127 127 127 1 0 1 chr21,
\ Haplotype blocks derived from common single nucleotide polymorphisms (SNPs) \ on chromosome 21 by\ Perlegen Sciences, as \ described in Patil, N. et al. \ Blocks of limited haplotype diversity revealed by \ high-resolution scanning.\ Science 294, 1719-1723 (2001).
\\ The location of each haplotype block is represented by\ a blue horizontal line with tall vertical blue bars at the first and\ last SNPs of the block. Blocks are displayed as starting at the first\ SNP and ending at the last SNP of the block. This is slightly\ different from the representation on the Perlegen web site in which blocks are\ stretched until they abut each other. The shade of the blue indicates the \ minimum number of SNPs required to discriminate between haplotype patterns\ that account for at least 80% of genotyped chromosomes. Darker colors\ indicate that fewer SNPs are necessary. Individual SNPs are denoted by\ smaller black vertical bars. At multi-megabase resolution in dense\ display mode, clusters of tall blue bars may indicate hotspots for\ recombination.
\\ For more information on a particular block, click "Outside Link" \ on the item's details page. General information on the\ blocks is available from Perlegen's\ Chromosome 21 \ Haplotype Browser.
\\ NOTE: Perlegen annotations appear only on chromosome 21.
\ \\ Thanks to Perlegen Sciences for making these data available.
\ varRep 1 haplotype Haplotype Blocks bed 12 . Common Haplotype Blocks 0 145.22 0 0 0 127 127 127 1 0 1 chr22,\ Haplotype blocks on chromosome 22 from \ The University \ of Oxford and \ The Wellcome Trust Sanger \ Institute, as described in Dawson, E. et al. \ A first-generation linkage disequilibrium map of human \ chromosome 22. Nature 418, 544-8 (2002).
\\ The location of each haplotype block is represented by\ a blue horizontal line with tall vertical blue bars at the first and\ last SNPs of the block. Blocks are displayed as starting at the first\ SNP and ending at the last SNP of the block. Individual SNPs are denoted by\ smaller black vertical bars. At multi-megabase resolution in dense\ display mode, clusters of tall blue bars may indicate hotspots for\ recombination.
\\ NOTE: Haplotype block annotations appear only on chromosome 22.
\ \\ Thanks to The University of Oxford and the Sanger Institute for providing these data.\ varRep 1 genomicSuperDups Segmental Dups bed 6 . Duplications of >1000 Bases of Non-RepeatMasked Sequence 0 146 0 0 0 127 127 127 0 0 0
This region was detected as a putative genomic duplication within the golden path.\ Orange, yellow, and dark-to-light gray represent similarities of >99%, 99-98% and 98-90%, \ respectively. Duplications of greater than 98% similarity that lack sufficient SDD \ evidence (likely missed overlaps) are shown in red. Cut-off values were at least \ 1 kb of total sequence aligned (containing at least 500 bp of non-RepeatMasked sequence) \ and at least 90% sequence identity. \ \
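The cut-off values and color bands described above can be expressed as a small Python sketch. It is illustrative only: the function names and the has_sdd_evidence flag are hypothetical, and the handling of values falling exactly on 98% or 99% is an assumption, since the text does not say which band owns the boundary.

```python
def passes_cutoffs(aligned_bp, non_masked_bp, identity):
    """Apply the stated cut-offs: >= 1 kb aligned, >= 500 bp of
    non-RepeatMasked sequence, and >= 90% sequence identity."""
    return aligned_bp >= 1000 and non_masked_bp >= 500 and identity >= 0.90

def display_color(identity, has_sdd_evidence=True):
    """Map fractional identity (0.0 to 1.0) to the display color.

    Assumption: 99% exactly is treated as yellow and 98% exactly as yellow.
    Returns None for identities below the 90% cut-off.
    """
    if identity < 0.90:
        return None
    if identity > 0.98 and not has_sdd_evidence:
        return "red"
    if identity > 0.99:
        return "orange"
    if identity >= 0.98:
        return "yellow"
    return "gray"

if __name__ == "__main__":
    for value in (0.995, 0.985, 0.95):
        print(value, display_color(value))
```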
\ This track was created by using Arian Smit's RepeatMasker program, which screens DNA sequences \ for interspersed repeats and low complexity DNA sequences. The program\ outputs a detailed annotation of the repeats that are present in the \ query sequence, as well as a modified version of the query sequence \ in which all the annotated repeats have been masked. RepeatMasker uses \ the RepBase library of repeats from the \ Genetic \ Information Research Institute (GIRI). \ RepBase is described in Jurka, J. (2000) in the References section below.
\ \\ In full display mode, this track displays up to ten different classes of repeats:\
\ The level of color shading in the graphical display reflects the amount of \ base mismatch, base deletion, and base insertion associated with a repeat \ element. The higher the combined number of these, the lighter the shading.
\ \\ UCSC has used the most current versions of the RepeatMasker software \ and repeat libraries available to generate these data. Note that these \ versions may be newer than those that are publicly available on the Internet. \
\\ Data are generated using the RepeatMasker -s flag. Additional flags\ may be used for certain organisms. Repeats are soft-masked. Alignments may \ extend through repeats, but are not permitted to initiate in them. \ See the \ FAQ for \ more information.
\ \\ Thanks to Arian Smit and GIRI\ for providing the tools and repeat libraries used to generate this track.
\ \\ Smit, AFA, Hubley, R and Green, P. RepeatMasker Open-3.0.\ http://www.repeatmasker.org. 1996-2007.\
\\ RepBase is described in \ Jurka J. \ Repbase update: a database and an electronic journal of \ repetitive elements. \ Trends Genet. 2000 Sep;16(9):418-420.
\\ For a discussion of repeats in mammalian genomes, see: \
\ Smit AF. Interspersed repeats and other mementos of transposable \ elements in mammalian genomes. Curr Opin Genet Dev. 1999 Dec;9(6):\ 657-63.
\\ Smit AF. The origin of interspersed repeats in the human genome. \ Curr Opin Genet Dev. 1996 Dec;6(6):743-8.\
\ varRep 0 simpleRepeat Simple Repeats bed 4 + Simple Tandem Repeats by TRF 0 149.3 0 0 0 127 127 127 0 0 0\ This track displays simple tandem repeats (possibly imperfect) located\ by Tandem Repeats\ Finder (TRF), which is specialized for this purpose. These repeats can\ occur within coding regions of genes and may be quite\ polymorphic. Repeat expansions are sometimes associated with specific\ diseases.
\ \\ For more information about the TRF program, see Benson (1999).\
\ \\ TRF was written by \ Gary Benson.
\ \\ Benson G. \ Tandem repeats finder: a program to analyze DNA sequences.\ Nucleic Acids Res. 1999 Jan 15;27(2):573-80.
\ varRep 1 blatFugu Fugu Blat psl xeno Takifugu rubripes Translated Blat Alignments 0 150 0 60 120 200 220 255 1 0 0\ The Fugu v.3.0 whole genome shotgun assembly was provided by the\ US DOE Joint \ Genome Institute (JGI). The assembly was constructed with the JGI\ assembler, JAZZ, from paired end sequencing reads produced at JGI, Myriad \ Genetics, and Celera Genomics, resulting in a sequence coverage of 5.7X. All \ reads are plasmid, cosmid, or BAC end-sequences, with the predominant coverage\ derived from 2 Kb insert plasmids. This assembly contains 20,379\ scaffolds totaling 319 million base pairs. The largest 679 scaffolds\ total 160 million base pairs.
\\ The strand information (+/-) for this track is in two parts. The\ first + or - indicates the orientation of the query sequence whose\ translated protein produced the match. The second + or - indicates the\ orientation of the matching translated genomic sequence. Because the two\ orientations of a DNA sequence give different predicted protein sequences,\ there are four combinations. ++ is not the same as --; nor is +- the same\ as -+.
\ \\ The alignments were made with blat in translated protein mode requiring two\ nearby 4-mer matches to trigger a detailed alignment. The $organism\ genome was masked with RepeatMasker and Tandem Repeats Finder before \ running blat.
\ \\ The 3.0 draft from the\ \ JGI Fugu rubripes website was used in the\ UCSC Genome Browser Fugu blat alignments. These data were freely provided \ by the JGI for use in this publication only.
\ \\ Kent, W.J.\ BLAT - the BLAST-like alignment tool.\ Genome Res. 12(4), 656-664 (2002).
\ \ compGeno 1 blastzTightRn2 Tight Rat psl xeno rn2 $o_Organism ($o_date/$o_db) Blastz Tight Subset of Best Alignments 0 174 100 50 0 255 240 200 1 0 0\ This track displays blastz alignments of the $o_organism assembly \ ($o_db, $o_date) to the $organism genome, filtered by axtBest and \ subsetAxt with very stringent constraints as described below.\ The track has an optional feature that color codes alignments to indicate \ the chromosomes from which they are derived in the aligning assembly. To \ activate the color feature, click the on button next to \ "Color track based on chromosome" on the track description page.
\\ Each item in the display is identified by the chromosome, strand, and \ location of the match (in thousands).
\ \\ For blastz, 12 of 19 seeds were used and then scored using:\
\ A C G T\ A 91 -114 -31 -123\ C -114 100 -125 -31\ G -31 -125 100 -114\ T -123 -31 -114 91\ \ O = 400, E = 30, K = 3000, L = 3000, M = 50\\
\ A second pass was made at reduced stringency (7mer seeds and\ MSP threshold of K=2200) to attempt to fill in gaps of up to about 10K bp.\ Lineage-specific repeats were abridged during this alignment.
\ AxtBest was used to select only the best alignment for any given region\ of the genome. SubsetAxt was then run on axtBest-filtered alignments \ with this matrix:\\ A C G T\ A 100 -200 -100 -200\ C -200 100 -200 -100\ G -100 -200 100 -200\ T -200 -100 -200 100\\ with a gap open penalty of 2000 and a gap extension penalty of 50. \ The minimum score threshold was 3400.\ \
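To illustrate how the subsetAxt matrix, gap penalties, and minimum score interact, here is a minimal Python sketch that scores an alignment given as two gapped strings. It is not the subsetAxt source: the function names are hypothetical, and charging each run of gap columns as gap open plus gap extension times its length is an assumption about the convention.

```python
BASES = "ACGT"
ROWS = [
    [100, -200, -100, -200],   # A
    [-200, 100, -200, -100],   # C
    [-100, -200, 100, -200],   # G
    [-200, -100, -200, 100],   # T
]
MATRIX = {(a, b): ROWS[i][j]
          for i, a in enumerate(BASES)
          for j, b in enumerate(BASES)}
GAP_OPEN = 2000
GAP_EXTEND = 50
MIN_SCORE = 3400

def alignment_score(query, target):
    """Score two equal-length aligned strings in which '-' marks a gap.

    Assumption: a run of gap columns costs GAP_OPEN + GAP_EXTEND * length.
    Only A, C, G and T are handled in this sketch.
    """
    if len(query) != len(target):
        raise ValueError("aligned strings must have equal length")
    total = 0
    in_gap = False
    for q, t in zip(query.upper(), target.upper()):
        if q == "-" or t == "-":
            if not in_gap:
                total -= GAP_OPEN
            total -= GAP_EXTEND
            in_gap = True
        else:
            total += MATRIX[(q, t)]
            in_gap = False
    return total

def passes_threshold(query, target):
    """True if the alignment reaches the minimum score of 3400."""
    return alignment_score(query, target) >= MIN_SCORE

if __name__ == "__main__":
    perfect = "ACGT" * 10
    print(alignment_score(perfect, perfect), passes_threshold(perfect, perfect))
```

Under these assumptions a gap-free alignment needs at least 34 matching bases to reach the threshold, and a single gap costs as much as 20 or more matches.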
\ This track has a filter that can be used to change the display mode,\ turn on the chromosome color track, or filter the display output by\ chromosome. The filter is located at the top of the track description page,\ which is accessed via the small button to the left of the track's graphical\ display or through the link on the track's control menu.\
\ When you have finished configuring the filter, click the Submit\ button.
\ \\ The alignments were contributed by Scott Schwartz from the \ Penn State Bioinformatics \ Group. The best-in-genome filtering is done by UCSC's \ axtBest and subsetAxt programs.
\ \\ Chiaromonte, F., Yap, V.B., Miller, W.\ Scoring pairwise genomic sequence alignments.\ Pac Symp Biocomput 2002, 115-26 (2002).
\\ Schwartz, S., Kent, W.J., Smit, A., Zhang, Z., Baertsch, R., Hardison, R.,\ Haussler, D., and Miller, W.\ Human-mouse alignments with BLASTZ.\ Genome Res. 13(1), 103-107 (2003).
\ \ compGeno 1 otherDb rn2\ syntenyRat Rat Synteny bed 4 + $o_Organism ($o_date/$o_db) Synteny Using Blastz Single Coverage (100k window) 0 175 0 100 0 255 240 200 0 0 0\ This track shows syntenous (corresponding) regions between $organism and \ $o_organism chromosomes. The $o_date ($o_db) assembly of the $o_organism genome \ was used to produce this annotation.
\ \\ We passed a 100 kb non-overlapping window over the genome and, using the blastz \ best-in-$o_organism genome alignments, looked for high-scoring regions with at \ least 40% of the bases aligning with the same region in $o_organism. 100 kb \ segments were joined together if they agreed in direction and were within 500 kb \ of each other in the $organism genome and within 4 Mb of each other in the \ $o_organism genome. Gaps were joined between syntenic anchors if the bases between two \ flanking regions agreed with synteny (direction and $o_organism location). \ Finally, we extended the syntenic block to include those areas.
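The joining step described above can be sketched in Python as follows. This is a simplified illustration rather than the code used to build the track: the Segment record, its field names, and the requirement that joined segments map to the same chromosome in the other genome are assumptions, and all input segments are taken to lie on one chromosome of the reference genome.

```python
from collections import namedtuple

# Hypothetical record: coordinates in the reference genome (start, end) and
# in the other genome (o_chrom, o_start, o_end), plus relative direction.
Segment = namedtuple("Segment", "start end o_chrom o_start o_end strand")

def join_segments(segments, max_gap=500_000, max_o_gap=4_000_000):
    """Merge neighbouring segments that agree in direction and are close
    in both genomes (500 kb in the reference, 4 Mb in the other genome)."""
    blocks = []
    for seg in sorted(segments, key=lambda s: s.start):
        if blocks:
            last = blocks[-1]
            same_direction = (last.o_chrom == seg.o_chrom
                              and last.strand == seg.strand)
            gap = seg.start - last.end
            # Distance between the two intervals in the other genome;
            # negative when they overlap.
            o_gap = max(seg.o_start - last.o_end, last.o_start - seg.o_end)
            if same_direction and gap <= max_gap and o_gap <= max_o_gap:
                blocks[-1] = last._replace(
                    end=max(last.end, seg.end),
                    o_start=min(last.o_start, seg.o_start),
                    o_end=max(last.o_end, seg.o_end),
                )
                continue
        blocks.append(seg)
    return blocks

if __name__ == "__main__":
    example = [
        Segment(0, 100_000, "chr5", 1_000_000, 1_100_000, "+"),
        Segment(300_000, 400_000, "chr5", 1_350_000, 1_450_000, "+"),
        Segment(2_000_000, 2_100_000, "chr5", 3_000_000, 3_100_000, "+"),
    ]
    for block in join_segments(example):
        print(block)
```

In the example the first two segments merge into one block, while the third is more than 500 kb away in the reference genome and starts a new block.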
\ \\ Contact Robert \ Baertsch at UCSC for more information about this track.\ Thanks to the Rat Genome Sequencing Consortium for providing the rat sequence \ data. For more information, see the Baylor College Human Genome Sequencing Center\ Rat Genome \ Project page.
\ compGeno 1 otherDb rn2\ syntenyCow Cow Synteny bed 6 . Cow Synteny Using RH Mapping 0 178 0 100 0 255 240 200 0 0 0\ This track depicts human-cattle synteny segments as defined on the basis\ of a cattle-human comparative map containing 3,200 BAC-end sequences and\ EST markers with a single significant hit (E-value less than 0.00001) \ in the human genome sequence (hg15) as defined by the TimeLogic \ Tera-BLASTn program (Everts-van der Wind et al., 2005). \ The synteny blocks were defined according to the rules described in \ Murphy et al. (2005). \ \
\\ Thanks to Harris Lewin, Denis Larkin, and Annelie Everts-van der Wind,\ University of Illinois at Urbana-Champaign, for providing these data.\
\Everts-van der Wind, A., Larkin, D., Green, C., Elliott, J., Olmstead, C., \ Chiu, R., Schein, J., Marra, M., Womack, J., and Lewin, H.\ \ A high-resolution whole-genome cattle-human comparative map reveals details \ of mammalian chromosome evolution.\ Proc Natl Acad Sci 102(51) 18526-18531 (2005).\
\\ Murphy, W., Larkin, D., Everts-van der Wind, A., Bourque, G., Tesler, G., \ Auvil, L.,\ Beever, J., Chowdhary, B., Galibert, F., Gatzke, L., Hitte, C., Meyers, S.,\ Milan, D., Ostrander, E., Pape, G., Parker, H., Raudsepp, T., Rogatcheva, M.,\ Schook, L., Skow, L., Welge, M., Womack, J., O'Brien, S., Pevzner, P., and\ Lewin, H.\ \ Dynamics of Mammalian Chromosome Evolution Inferred from Multispecies Comparative Maps.\ Science 309(5734) 613-617 (2005).\
\ compGeno 1 bacendsCow Cow BAC Ends bed 6 . Cow BAC Ends (BLASTn) 0 179 0 100 0 255 240 200 0 0 0\ \ The track shows BLASTn results of approximately 300,000 cattle BAC-ends\ from the CHORI-240 library against the human genome sequence (hg15).\ The track displays approximately 54,000 unique BLASTn hits (E-value less than 1e-5) in\ the human genome.\
\\ Thanks to Harris Lewin and Denis Larkin, \ University of Illinois at Urbana-Champaign,\ for providing these data.\
\ compGeno 1