RepeatModeler Version 2.0.4
===========================
Using output directory = /data/tmp/rModeler.gPYcEs/RM_1974205.SunJul201217142025
Search Engine = rmblast 2.13.0+
Threads = 32
Dependencies: TRF 4.09, RECON , RepeatScout 1.0.6, RepeatMasker 4.1.4
LTR Structural Analysis: Disabled [use -LTRStruct to enable]
Random Number Seed: 1753039032
Database = /data/tmp/rModeler.gPYcEs/GCA_964264895.2_mNycLei1.hap2.2
  - Sequences = 673
  - Bases = 2021249169
  - N50 = 91995108
  - Contig Histogram:
  Size(bp)                                                                Count
  -----------------------------------------------------------------------
  211435344-226537798 |                                                   [ 1 ]
  196332891-211435344 |                                                   [ 2 ]
  181230438-196332891 |                                                   [ ]
  166127985-181230438 |                                                   [ ]
  151025532-166127985 |                                                   [ ]
  135923078-151025531 |                                                   [ ]
  120820625-135923078 |                                                   [ ]
  105718172-120820625 |                                                   [ 2 ]
  90615719-105718172  |                                                   [ 2 ]
  75513266-90615719   |                                                   [ 3 ]
  60410812-75513265   |                                                   [ 2 ]
  45308359-60410812   |                                                   [ 5 ]
  30205906-45308359   |                                                   [ 1 ]
  15103453-30205906   |                                                   [ 2 ]
  1000-15103453       |************************************************** [ 653 ]
Storage Throughput = excellent ( 1951.29 MB/s )

RepeatModeler Round # 1
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 40000000 bp
   - Final Sample Size = 40021314 bp ( 40020914 non ambiguous )
   - Num Contigs Represented = 114
   - Sequence extraction : 00:00:59 (hh:mm:ss) Elapsed Time
 -- Running RepeatScout on the sequences...
   - RepeatScout: 00:10:30 (hh:mm:ss) Elapsed Time
Round Time: 00:15:39 (hh:mm:ss) Elapsed Time : 360 families discovered.

RepeatModeler Round # 2
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 10000000 bp
   - Sequence extraction : 00:00:14 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:00:58 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       14031 repeats masked totaling 2680549 bp(s).
   - TE Masking time 00:00:09 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 10018914 bp
       Num Contigs Represented = 52
       Non ambiguous bp:
             Initial: 10018714 bp
             After Masking: 6269802 bp
             Masked: 37.42 %
 -- Input Database Coverage: 10018914 bp out of 2021249169 bp ( 0.50 % )
Sampling Time: 00:01:21 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 31626
Comparison Time: 00:03:32 (hh:mm:ss) Elapsed Time, 19340 HSPs Collected
Number of families returned by RECON: 603
Round Time: 00:05:02 (hh:mm:ss) Elapsed Time : 10 families discovered.

RepeatModeler Round # 3
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 30000000 bp
   - Sequence extraction : 00:00:42 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:03:14 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       46278 repeats masked totaling 8534110 bp(s).
   - TE Masking time 00:00:25 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 30002398 bp
       Num Contigs Represented = 91
       Non ambiguous bp:
             Initial: 30002198 bp
             After Masking: 18781855 bp
             Masked: 37.40 %
 -- Input Database Coverage: 40021312 bp out of 2021249169 bp ( 1.98 % )
Sampling Time: 00:04:23 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 282376
Comparison Time: 00:15:17 (hh:mm:ss) Elapsed Time, 20659 HSPs Collected
Number of families returned by RECON: 2160
Round Time: 00:20:01 (hh:mm:ss) Elapsed Time : 42 families discovered.

RepeatModeler Round # 4
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 90000000 bp
   - Sequence extraction : 00:02:15 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:11:02 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       145006 repeats masked totaling 26736015 bp(s).
   - TE Masking time 00:01:22 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 90011291 bp
       Num Contigs Represented = 196
       Non ambiguous bp:
             Initial: 90006491 bp
             After Masking: 54967447 bp
             Masked: 38.93 %
 -- Input Database Coverage: 130032603 bp out of 2021249169 bp ( 6.43 % )
Sampling Time: 00:14:43 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 2561716
Comparison Time: 01:22:06 (hh:mm:ss) Elapsed Time, 105318 HSPs Collected
Number of families returned by RECON: 7417
Round Time: 01:41:37 (hh:mm:ss) Elapsed Time : 201 families discovered.

RepeatModeler Round # 5
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 270000000 bp
   - Sequence extraction : 00:06:25 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:29:28 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       466867 repeats masked totaling 87174086 bp(s).
   - TE Masking time 00:05:12 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 270040607 bp
       Num Contigs Represented = 321
       Non ambiguous bp:
             Initial: 270035007 bp
             After Masking: 157707996 bp
             Masked: 41.60 %
 -- Input Database Coverage: 400073210 bp out of 2021249169 bp ( 19.79 % )
Sampling Time: 00:41:15 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 23082615
Comparison Time: 09:09:24 (hh:mm:ss) Elapsed Time, 317722 HSPs Collected
Number of families returned by RECON: 29737
Round Time: 10:00:14 (hh:mm:ss) Elapsed Time : 444 families discovered.

RepeatScout/RECON discovery complete: 1057 families found
Classification Time: 00:33:52 (hh:mm:ss) Elapsed Time
Program Time: 12:56:25 (hh:mm:ss) Elapsed Time