RepeatModeler Version 2.0.7
===========================
Using output directory = /dev/shm/rModeler.vo1X2S/RM_526487.ThuSep41255172025
Search Engine = rmblast 2.14.1+
Threads = 32
Dependencies: TRF 4.09, RECON , RepeatScout 1.0.7, RepeatMasker 4.2.1, RepeatAfterMe 0.0.7
LTR Structural Analysis: Disabled [use -LTRStruct to enable]
Random Number Seed: 1757015716
Database = /dev/shm/rModeler.vo1X2S/GCA_043727955.1_ASM4372795v1
  - Sequences = 250
  - Bases = 2380748055
  - N50 = 183812396
  - Contig Histogram:
  Size(bp)                                                          Count
  -----------------------------------------------------------------------
  208028781-222887611 |                                                   [ 3 ]
  193169951-208028780 |                                                   [ 1 ]
  178311121-193169950 |                                                   [ 1 ]
  163452291-178311120 |                                                   [ 1 ]
  148593461-163452290 |                                                   [ ]
  133734631-148593460 |                                                   [ 2 ]
  118875801-133734630 |                                                   [ 1 ]
  104016971-118875800 |                                                   [ 1 ]
  89158141-104016970  |                                                   [ 1 ]
  74299311-89158140   |                                                   [ 3 ]
  59440481-74299310   |                                                   [ 1 ]
  44581651-59440480   |                                                   [ 1 ]
  29722821-44581650   |                                                   [ ]
  14863991-29722820   |                                                   [ ]
  5162-14863991       |************************************************** [ 234 ]
Storage Throughput = excellent ( 1124.04 MB/s )

RepeatModeler Round # 1
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 40000000 bp
   - Final Sample Size = 40024531 bp ( 40020791 non ambiguous )
   - Num Contigs Represented = 71
   - Sequence extraction : 00:02:43 (hh:mm:ss) Elapsed Time
 -- Running RepeatScout on the sequences...
   - RepeatScout: Running build_lmer_table ( l = 14, min = 10 )..
   - RepeatScout: Running RepeatScout.. : 176 raw families identified
   - RepeatScout: Running filtering stage.. 159 families remaining
   - RepeatScout: 00:09:39 (hh:mm:ss) Elapsed Time
   - Collecting repeat instances...
   - Refining 155 families... 06:40:22 (hh:mm:ss) Elapsed Time
   - Redundant Families and Large Satellite Filtering.. : 10 satellite(s), 53 contained, found in 00:01:33 (hh:mm:ss) Elapsed Time
Family Refinement: 00:01:33 (hh:mm:ss) Elapsed Time
Round Time: 06:54:24 (hh:mm:ss) Elapsed Time : 92 families discovered.

RepeatModeler Round # 2
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 10000000 bp
   - Sequence extraction : 00:00:40 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:03:05 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds... 6558 repeats masked totaling 1809716 bp(s).
   - TE Masking time 00:00:09 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 10002398 bp
       Num Contigs Represented = 32
       Non ambiguous bp:
             Initial: 10001898 bp
             After Masking: 7711787 bp
             Masked: 22.90 %
 -- Input Database Coverage: 10002398 bp out of 2380748055 bp ( 0.42 % )
Sampling Time: 00:03:56 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
   - Total Comparisons = 31375
Comparison Time: 00:08:32 (hh:mm:ss) Elapsed Time, 141771 HSPs Collected
Number of families returned by RECON: 1245
Round Time: 00:22:36 (hh:mm:ss) Elapsed Time : 26 families discovered.

RepeatModeler Round # 3
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 30000000 bp
   - Sequence extraction : 00:02:03 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:06:38 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds... 27699 repeats masked totaling 7476974 bp(s).
   - TE Masking time 00:00:28 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 30022053 bp
       Num Contigs Represented = 64
       Non ambiguous bp:
             Initial: 30018813 bp
             After Masking: 21335415 bp
             Masked: 28.93 %
 -- Input Database Coverage: 40024451 bp out of 2380748055 bp ( 1.68 % )
Sampling Time: 00:09:13 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
   - Total Comparisons = 282376
Comparison Time: 00:45:00 (hh:mm:ss) Elapsed Time, 263813 HSPs Collected
Number of families returned by RECON: 2017
Round Time: 00:55:07 (hh:mm:ss) Elapsed Time : 59 families discovered.

RepeatModeler Round # 4
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 90000000 bp
   - Sequence extraction : 00:06:04 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:17:26 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds... 100348 repeats masked totaling 26848280 bp(s).
   - TE Masking time 00:01:50 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 90042916 bp
       Num Contigs Represented = 127
       Non ambiguous bp:
             Initial: 90039476 bp
             After Masking: 59957466 bp
             Masked: 33.41 %
 -- Input Database Coverage: 130067367 bp out of 2380748055 bp ( 5.46 % )
Sampling Time: 00:25:31 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
   - Total Comparisons = 2545896
Comparison Time: 03:55:39 (hh:mm:ss) Elapsed Time, 1257504 HSPs Collected
Number of families returned by RECON: 6829
Round Time: 04:31:47 (hh:mm:ss) Elapsed Time : 138 families discovered.

RepeatModeler Round # 5
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 270000000 bp
   - Sequence extraction : 00:18:35 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:50:31 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds... 338748 repeats masked totaling 89993562 bp(s).
   - TE Masking time 00:12:08 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 270028567 bp
       Num Contigs Represented = 184
       Non ambiguous bp:
             Initial: 270019227 bp
             After Masking: 171262947 bp
             Masked: 36.57 %
 -- Input Database Coverage: 400095934 bp out of 2380748055 bp ( 16.81 % )
Sampling Time: 01:21:46 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
   - Total Comparisons = 22865703
Comparison Time: 30:20:52 (hh:mm:ss) Elapsed Time, 22397111 HSPs Collected
Number of families returned by RECON: 28843
Round Time: 32:27:15 (hh:mm:ss) Elapsed Time : 363 families discovered.

RepeatScout/RECON discovery complete: 678 families found
#
# RepeatClassifier
#
# Version 2.0.7
# Threads: 32
# Current Working Directory: /dev/shm/rModeler.vo1X2S/RM_526487.ThuSep41255172025
# Protein Library: /hive/data/outside/RepeatMasker/RepeatMasker-4.2.1/Libraries/RepeatPeps.lib
#   - 18011 proteins
# Consensi Library: /hive/data/outside/RepeatMasker/RepeatMasker-4.2.1/Libraries/RepeatMasker.lib
#   - 26292 consensus sequences
  - Looking for simple/tandem and low complexity sequences..
  - Looking for similarity to known repeat proteins..
  - Looking for similarity to known repeat consensi..
Classification Time: 00:05:37 (hh:mm:ss) Elapsed Time
Program Time: 45:16:46 (hh:mm:ss) Elapsed Time