Description

CHM13 PRO-seq (Precision Run-On sequencing) Bowtie2 alignments to CHM13v2.0 (minus chrY) and unique genome-wide 21mer filtering (stranded)

PRO-seq detects nascent transcription by RNA polymerases, including from non-coding regions, at nucleotide resolution and genome-wide scale.

Methods

The PRO-seq experiment was performed on CHM13 cells in duplicate (replicates A and B) and sequenced from the 3' end as 75 bp single-end reads. Reads were adapter trimmed, quality filtered (-q 20), and length filtered (-m 20) with Cutadapt (v2.7). Trimmed reads were then reverse complemented because they were sequenced in the 3'-->5' direction. D. melanogaster spike-in reads were removed by aligning with Bowtie2 (v2.3.5.1) and retaining only the unmapped reads with samtools view -f 4 (v1.9). The remaining reads were then mapped with Bowtie2 (v2.3.5.1) using either default settings or -k 100 (reporting up to 100 alignments per read, so that multi-mappers are retained). Unique genome-wide 21mers were generated with Meryl (https://github.com/marbl/meryl). The reads mapped with -k 100 were filtered with these unique genome-wide 21mers by one of two methods:

  1. Locus-specific unique genome-wide 21mer filtering (overlapSelect -overlapBases=21; UCSC tools (GenomeBrowser/20180626))
  2. Read- and locus-specific unique genome-wide 21mer filtering (https://github.com/arangrhie/T2T-Polish/tree/master/marker_assisted, overlapSelect -overlapBases=21)
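
As an illustration of two steps described above (reverse complementing the trimmed reads, and keeping only alignments that overlap a unique 21mer by at least 21 bases), here is a minimal Python sketch. It is not the pipeline that was actually run, which used Cutadapt, Bowtie2, Meryl, and overlapSelect. The function names, the tuple layouts, and the 0-based half-open coordinates are assumptions made for this example, and the per-interval overlap test is only an approximation of what overlapSelect -overlapBases=21 does.

```python
from typing import List, Tuple

# Hypothetical record layouts for this sketch (0-based, half-open coordinates).
Alignment = Tuple[str, int, int, str]   # (chrom, start, end, strand)
KmerInterval = Tuple[str, int, int]     # (chrom, start, end) of a unique 21mer

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def overlaps_unique_kmer(aln: Alignment,
                         kmers: List[KmerInterval],
                         min_overlap: int = 21) -> bool:
    """True if the alignment shares at least min_overlap bases with one interval."""
    chrom, start, end, _strand = aln
    for k_chrom, k_start, k_end in kmers:
        if k_chrom != chrom:
            continue
        shared = min(end, k_end) - max(start, k_start)
        if shared >= min_overlap:
            return True
    return False


def filter_alignments(alns: List[Alignment],
                      kmers: List[KmerInterval],
                      min_overlap: int = 21) -> List[Alignment]:
    """Keep only alignments that fully cover at least one unique 21mer."""
    return [a for a in alns if overlaps_unique_kmer(a, kmers, min_overlap)]


# Toy data, invented for this example.
alignments = [
    ("chr1", 100, 150, "+"),  # contains the 21mer at 120-141: kept
    ("chr1", 300, 350, "-"),  # shares only 10 bases with 340-361: dropped
    ("chr2", 100, 150, "+"),  # no unique 21mer on chr2 in this toy set: dropped
]
unique_21mers = [("chr1", 120, 141), ("chr1", 340, 361)]
kept = filter_alignments(alignments, unique_21mers)
```

The linear scan over intervals is only suitable for toy inputs; at genome scale an interval index or a sorted sweep would be needed, which is what dedicated tools such as overlapSelect provide.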

Display Conventions and Configuration

The tracks labeled as "neg" are negated for viewing.
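
For readers who want to reproduce this display convention on their own signal files, the sketch below negates the value column of bedGraph-style lines. The source does not state which file format or tool was used to produce the "neg" tracks, so the four-column bedGraph layout and the function name negate_bedgraph_line are assumptions made for illustration.

```python
def negate_bedgraph_line(line: str) -> str:
    """Negate the signal value of one bedGraph data line.

    Header, comment, and blank lines are returned with trailing
    whitespace stripped and are otherwise unchanged.
    """
    stripped = line.rstrip()
    if not stripped or stripped.startswith(("track", "browser", "#")):
        return stripped
    chrom, start, end, value = stripped.split()[:4]
    number = float(value)
    # Avoid printing "-0" for zero-valued bins.
    negated = -number if number != 0 else 0.0
    return f"{chrom}\t{start}\t{end}\t{negated:g}"


# Toy input, invented for this example.
example = negate_bedgraph_line("chr1\t100\t101\t3")
```

Plotting one strand below the axis in this way lets both strands share a single vertical scale in the browser.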

Data access

Raw PRO-seq data are filed under BioProject PRJNA559484.

Release history

  1. CHM13v2.0 assembly (minus chrY)

Credits

References

For generation of unique genome-wide 21mers:

For the PRO-seq experimental, mapping, and filtering methods: