This directory contains the Build 36 "essentially finished" mouse genome 
(UCSC mm8, February 2006) from the Mouse Genome Sequencing Consortium. This 
assembly was produced at NCBI.  

Files included in this directory:

chromAgp.tar.gz - Description of how the assembly was generated from
     fragments, unpacking to one file per chromosome.  

chromFa.tar.gz - The assembly sequence in one file per chromosome.
    Repeats from RepeatMasker and Tandem Repeats Finder (with period
    of 12 or less) are shown in lower case; non-repeating sequence is
    shown in upper case.  Masking used RepeatMasker version 2006-01-20
    (open-3-1-3) with RepBase libraries (RM database version 20060120).

chromFaMasked.tar.gz - The assembly sequence in one file per 
    chromosome. Repeats are masked by capital Ns; non-repeating 
    sequence is shown in upper case.  

chromOut.tar.gz - RepeatMasker .out files for the chromosomes. These were 
    created by RepeatMasker at the -s sensitive setting.

chromTrf.tar.gz - Tandem Repeats Finder locations, filtered to keep 
    repeats with period of less than or equal to 12, and translated 
    into one .bed file per chromosome.  

mm8.2bit - Contains the complete mm8 mouse genome in the 2bit format.
    A utility program, twoBitToFa (available from our src tree), can be
    used to extract .fa file(s) from this file.  See also the pages on
    CVS access to the source tree and on building the utilities.

md5sum.txt - MD5 checksums of these files, used to verify correct transmission.

upstream1000.fa.gz - Sequences 1000 bases upstream of annotated
    transcription starts for RefSeq genes with annotated 5' UTRs.  
    This file is updated weekly, so it may be slightly out of sync
    with the RefSeq data, which is updated daily for most assemblies.

upstream2000.fa.gz - Same as upstream1000, but 2000 bases.

upstream5000.fa.gz - Same as upstream1000, but 5000 bases.

mm8.chrom.sizes - Two-column tab-separated text file containing assembly
    sequence names and sizes.
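
Because mm8.chrom.sizes is a plain two-column table, simple summaries can
be computed with standard tools. The following is a minimal sketch, not a
UCSC utility: the function names total_bases and longest_sequence are
made up for illustration, and the input is assumed to have exactly the
two tab-separated columns described above.

```shell
# total_bases: sum the second column (sequence size) of a chrom.sizes-style
# file. Hypothetical helper name. "%.0f" is used instead of "%d" because
# some awk implementations truncate integers above 2^31 - 1.
total_bases() {
    awk -F'\t' '{ sum += $2 } END { printf "%.0f\n", sum }' "$1"
}

# longest_sequence: print the name and size of the largest sequence by
# sorting numerically, in descending order, on the second column.
longest_sequence() {
    sort -t "$(printf '\t')" -k2,2nr "$1" | head -n 1
}
```

Possible usage, once the file has been downloaded:
    total_bases mm8.chrom.sizes
    longest_sequence mm8.chrom.sizes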

If you plan to download a large file or multiple files from this directory, 
we recommend you use rsync, wget, or ftp rather than downloading the
files via our website. To use ftp, connect to the download server by
anonymous ftp, then go to the directory goldenPath/mm8/bigZips/.

To download multiple files via ftp, use the "mget" command:
    mget <filename1> <filename2> ...
    - or -
    mget -a (to download all the files in the directory) 

The rsync command to download the entire directory:
    rsync -avzP rsync:// .
For a single file, e.g. chromFa.tar.gz
    rsync -avzP \
	rsync:// .

Or with wget, all files:
    wget --timestamping \
With wget, a single file:
    wget --timestamping \
	'' \
	-O chromFa.tar.gz
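
After a download completes, the checksums in md5sum.txt can be used to
confirm that a file arrived intact. The sketch below is one possible way
to do this. It assumes GNU md5sum is installed and that md5sum.txt uses
the usual "<hash>  <filename>" layout. Neither assumption is confirmed by
this README, and verify_download is a made-up helper name.

```shell
# verify_download FILE [CHECKSUM_LIST]
# Hypothetical helper: succeeds only if FILE is listed in CHECKSUM_LIST
# (default: md5sum.txt) and its MD5 digest matches the listed value.
# Assumes lines of the form "<hash>  <filename>" or "<hash> *<filename>".
verify_download() {
    file=$1
    list=${2:-md5sum.txt}
    # Look up the expected digest for this file name.
    expected=$(awk -v f="$file" '$2 == f || $2 == "*" f { print $1 }' "$list")
    # Compute the digest of the local copy.
    actual=$(md5sum "$file" | awk '{ print $1 }')
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

Possible usage, run from the directory holding the downloaded files:
    verify_download chromFa.tar.gz && echo "chromFa.tar.gz OK"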

To unpack the *.tar.gz files:
    tar xvzf <file>.tar.gz
To unpack the fa.gz files:
    gunzip <file>.fa.gz
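
Once chromFa.tar.gz is unpacked, the lower-case (soft-masked) repeats can
be inspected or converted with standard tools. The sketch below assumes
that replacing every lower-case base with N gives sequence comparable to
chromFaMasked.tar.gz. The file descriptions above suggest this but do not
state it, so treat the result as an approximation. The function names
hard_mask and masked_fraction are made up for illustration.

```shell
# hard_mask [FILE...]
# Hypothetical helper: replace lower-case bases with N, leaving FASTA
# header lines (those starting with ">") unchanged. LC_ALL=C keeps the
# [a-z] range from matching upper-case letters in some locales.
hard_mask() {
    LC_ALL=C awk '/^>/ { print; next } { gsub(/[a-z]/, "N"); print }' "$@"
}

# masked_fraction [FILE...]
# Hypothetical helper: report the fraction of bases that are lower case.
# The line length is recorded before gsub strips the lower-case letters;
# gsub returns the number of characters it replaced.
masked_fraction() {
    LC_ALL=C awk '!/^>/ { total += length($0); masked += gsub(/[a-z]/, "") }
                  END   { if (total > 0) printf "%.4f\n", masked / total }' "$@"
}
```

Possible usage, with chr1.fa standing in for any unpacked chromosome file:
    hard_mask chr1.fa > chr1.hardmasked.fa
    masked_fraction chr1.fa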

All the tables in this directory are freely usable for any purpose. 

This file last updated: 2006-02-16 - 16 February 2006

Name                     Last modified      Size
chromAgp.tar.gz          2006-02-16 10:02   421K
chromFa.tar.gz           2006-02-16 11:48   803M
chromFaMasked.tar.gz     2006-02-16 11:57   483M
chromOut.tar.gz          2006-02-16 12:01   141M
chromTrf.tar.gz          2006-02-16 12:02   17M
est.fa.gz                2020-02-28 11:26   788M
est.fa.gz.md5            2020-02-28 11:26   44
md5sum.txt               2008-10-16 11:43   297
mm8.2bit                 2006-02-16 11:21   664M
mm8.chrom.sizes          2006-02-14 13:42   564
mrna.fa.gz               2020-02-28 11:10   262M
mrna.fa.gz.md5           2020-02-28 11:10   45
refMrna.fa.gz            2020-02-28 11:26   44M
refMrna.fa.gz.md5        2020-02-28 11:26   48
upstream1000.fa.gz       2020-02-28 11:27   7.8M
upstream1000.fa.gz.md5   2020-02-28 11:27   53
upstream2000.fa.gz       2020-02-28 11:27   15M
upstream2000.fa.gz.md5   2020-02-28 11:27   53
upstream5000.fa.gz       2020-02-28 11:28   37M
upstream5000.fa.gz.md5   2020-02-28 11:28   53
xenoMrna.fa.gz           2020-02-28 11:19   6.6G
xenoMrna.fa.gz.md5       2020-02-28 11:19   49
xenoRefMrna.fa.gz        2020-02-28 11:26   292M
xenoRefMrna.fa.gz.md5    2020-02-28 11:26   52