RepeatModeler Version 2.0.4
===========================
Using output directory = /scratch/tmp/rModeler.7qtrud/RM_2220928.SunDec80354142024
Search Engine = rmblast 2.13.0+
Threads = 32
Dependencies: TRF 4.09, RECON , RepeatScout 1.0.6, RepeatMasker 4.1.4
LTR Structural Analysis: Disabled [use -LTRStruct to enable]
Random Number Seed: 1733658853
Database = /scratch/tmp/rModeler.7qtrud/GCA_964145185.1_mMinSch1.hap2.1
  - Sequences = 168
  - Bases = 1693881420
  - N50 = 91780633
  - Contig Histogram:
  Size(bp)                Count
  -----------------------------------------------------------------------
  189659728-203206780 | [ 1 ]
  176112676-189659728 | [ 1 ]
  162565624-176112676 | [ ]
  149018572-162565624 | [ ]
  135471520-149018572 | [ ]
  121924468-135471520 | [ ]
  108377416-121924468 | [ ]
  94830364-108377416  |* [ 3 ]
  81283312-94830364   |* [ 5 ]
  67736260-81283312   | [ 2 ]
  54189208-67736260   | [ 2 ]
  40642156-54189208   |* [ 3 ]
  27095104-40642156   |* [ 3 ]
  13548052-27095104   | [ 1 ]
  1000-13548052       |************************************************** [ 147 ]
Storage Throughput = excellent ( 1189.66 MB/s )

RepeatModeler Round # 1
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 40000000 bp
   - Final Sample Size = 40023870 bp ( 40018070 non ambiguous )
   - Num Contigs Represented = 37
   - Sequence extraction : 00:01:03 (hh:mm:ss) Elapsed Time
 -- Running RepeatScout on the sequences...
   - RepeatScout: 00:07:46 (hh:mm:ss) Elapsed Time
Round Time: 00:11:36 (hh:mm:ss) Elapsed Time : 182 families discovered.

RepeatModeler Round # 2
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 10000000 bp
   - Sequence extraction : 00:00:17 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:00:23 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       8456 repeats masked totaling 1579392 bp(s).
   - TE Masking time 00:00:03 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 10014080 bp
       Num Contigs Represented = 26
       Non ambiguous bp:
             Initial: 10012280 bp
             After Masking: 8223111 bp
             Masked: 17.87 %
 -- Input Database Coverage: 10014080 bp out of 1693881420 bp ( 0.59 % )
Sampling Time: 00:00:43 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 31375
Comparison Time: 00:03:09 (hh:mm:ss) Elapsed Time, 7735 HSPs Collected
Number of families returned by RECON: 1014
Round Time: 00:04:05 (hh:mm:ss) Elapsed Time : 23 families discovered.

RepeatModeler Round # 3
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 30000000 bp
   - Sequence extraction : 00:00:46 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:01:59 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       29858 repeats masked totaling 5538732 bp(s).
   - TE Masking time 00:00:06 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 30009710 bp
       Num Contigs Represented = 33
       Non ambiguous bp:
             Initial: 30005710 bp
             After Masking: 23256953 bp
             Masked: 22.49 %
 -- Input Database Coverage: 40023790 bp out of 1693881420 bp ( 2.36 % )
Sampling Time: 00:02:52 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 280875
Comparison Time: 00:14:59 (hh:mm:ss) Elapsed Time, 37109 HSPs Collected
Number of families returned by RECON: 2365
Round Time: 00:18:17 (hh:mm:ss) Elapsed Time : 62 families discovered.

RepeatModeler Round # 4
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 90000000 bp
   - Sequence extraction : 00:02:19 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:04:50 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       98732 repeats masked totaling 18898319 bp(s).
   - TE Masking time 00:00:22 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 90031155 bp
       Num Contigs Represented = 47
       Non ambiguous bp:
             Initial: 90020914 bp
             After Masking: 68298614 bp
             Masked: 24.13 %
 -- Input Database Coverage: 130054945 bp out of 1693881420 bp ( 7.68 % )
Sampling Time: 00:07:34 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 2539131
Comparison Time: 01:27:07 (hh:mm:ss) Elapsed Time, 277497 HSPs Collected
Number of families returned by RECON: 9531
Round Time: 01:36:44 (hh:mm:ss) Elapsed Time : 161 families discovered.

RepeatModeler Round # 5
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 270000000 bp
   - Sequence extraction : 00:06:49 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:13:18 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       327883 repeats masked totaling 63971487 bp(s).
   - TE Masking time 00:01:28 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 270065957 bp
       Num Contigs Represented = 81
       Non ambiguous bp:
             Initial: 270026646 bp
             After Masking: 199044157 bp
             Masked: 26.29 %
 -- Input Database Coverage: 400120902 bp out of 1693881420 bp ( 23.62 % )
Sampling Time: 00:21:45 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 22899528
Comparison Time: 09:35:02 (hh:mm:ss) Elapsed Time, 1154444 HSPs Collected
Number of families returned by RECON: 44011
Round Time: 10:11:05 (hh:mm:ss) Elapsed Time : 353 families discovered.

RepeatScout/RECON discovery complete: 781 families found
Classification Time: 00:12:40 (hh:mm:ss) Elapsed Time
Program Time: 12:34:27 (hh:mm:ss) Elapsed Time