Supplemental Information for

Evolution’s Cauldron: Duplication, Deletion, and Rearrangement in the Mouse and Human Genomes.

Breaks  Position                   aligned  span     breakPerMb  Genes in region
759     chr19:52283288-55023011    355617   2739723  277.04      Sialic acid binding (Ig-like cluster) and Zinc finger cluster
307     chr19:11912709-12980044    123180   1067335  287.63      KRAB C2H2 Zinc finger cluster
282     chr19:57924332-58994674    163649   1070342  263.47      KRAB C2H2 Zinc finger cluster
228     chr2:129802976-130894216   284128   1091240  208.94      Ptpn18, cryptic (gene poor)
215     chr22:19036955-20094010    132206   1057055  203.4       Contains Ig lambda light chain locus and Rhabdoid tumor deletion region
212     chr19:9294092-10142345     100712   848253   249.93      KRAB C2H2 Zinc finger cluster
209     chr8:6948129-8102614       205566   1154485  181.03      Ig region (defensin cluster)
199     chr6:28560342-29380273     99588    819931   242.7       Olfactory receptor cluster
164     chr12:11050382-11876400    124173   826018   198.54      6 Taste receptors and 5 salivary gland genes
160     chr3:19755611-22829219     94791    3073608  52.06       HMGB1, mostly gene desert
158     chr1:153813433-154249175   48153    435742   362.6       CD1, immune response cluster
140     chr11:90897985-91367162    75731    469177   298.39      RNF18, gene poor region
124     chr17:21859306-22267186    116716   407880   304.01      TL132 protein, gene poor region
120     chr19:15050464-15453575    38600    403111   297.68      KRAB C2H2 Zinc finger, olfactory cluster
115     chr18:13919892-15073531    145639   1153639  99.68       2 melanocortin GPCRs

Table S1 – Long runs of short chains in the top level of the net. "Breaks" gives the number of short chains in the region, "Position" the location in the human genome (November 2002 freeze), "aligned" the number of aligned bases in the region, "span" the total length of the region (in human genome coordinates), "breakPerMb" the average number of breaks between chains per megabase in the region, and "Genes in region" the properties or accessions of human RefSeq genes from the region. This table lists only the top few "hotspots" with the most consecutive short chains, excluding pericentromeric and telomeric regions, matches to human chromosome Y (since little sequence is available from mouse chromosome Y), and one region with extensive simple repeats of longer period than we masked. Hotspots of this type cover 5-10% of the human genome. Some are associated with clusters of mobile genes in the olfactory or zinc finger families (see text). Others are associated with immune genes. One of the largest of these regions (5th entry in the table) is on chromosome 22 near the Ig lambda light chain locus. This locus rearranges somatically during B-cell maturation. Possibly the RAG proteins responsible for this somatic rearrangement are expressed at a low level in germ cells and occasionally cause recombination in germ cell lines in this region, in mammals as well as other vertebrates. There are a number of RAG ESTs isolated from cells not in the B-cell lineage, including GenBank accession N92955.
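
As a check on how the break density column relates to the other columns, the following minimal Python sketch recomputes it from "Breaks" and "Position". It assumes that "span" equals end minus start of the position string, which agrees with the values listed in Table S1. The function names parse_position and breaks_per_mb are hypothetical and introduced here only for illustration; they are not part of the original analysis pipeline.

```python
def parse_position(position):
    """Split a 'chrN:start-end' string into (chrom, start, end)."""
    chrom, coords = position.split(":")
    start, end = coords.split("-")
    return chrom, int(start), int(end)


def breaks_per_mb(breaks, position):
    """Average number of breaks per megabase over the region.

    Assumption: span = end - start, as in the 'span' column of Table S1.
    """
    _, start, end = parse_position(position)
    span = end - start
    return breaks / span * 1_000_000


# Three rows taken from Table S1: (Breaks, Position)
rows = [
    (759, "chr19:52283288-55023011"),
    (215, "chr22:19036955-20094010"),
    (160, "chr3:19755611-22829219"),
]

for n_breaks, pos in rows:
    print(pos, round(breaks_per_mb(n_breaks, pos), 2))
```

For these three rows the recomputed densities round to 277.04, 203.4, and 52.06 breaks per megabase, matching the tabulated values.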