RepeatModeler Version 2.0.7
===========================
Using output directory = /dev/shm/rModeler.bL0vrQ/RM_1163731.TueSep21230072025
Search Engine = rmblast 2.14.1+
Threads = 32
Dependencies: TRF 4.09, RECON , RepeatScout 1.0.7, RepeatMasker 4.2.1, RepeatAfterMe 0.0.7
LTR Structural Analysis: Disabled [use -LTRStruct to enable]
Random Number Seed: 1756841406
Database = /dev/shm/rModeler.bL0vrQ/GCA_018852605.3_hg002v1.1.pat
  - Sequences = 23
  - Bases = 2947895164
Storage Throughput = excellent ( 1110.70 MB/s )

RepeatModeler Round # 1
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 40000000 bp
   - Final Sample Size = 40000976 bp ( 40000976 non ambiguous )
   - Num Contigs Represented = 23
   - Sequence extraction : 00:00:56 (hh:mm:ss) Elapsed Time
 -- Running RepeatScout on the sequences...
   - RepeatScout: Running build_lmer_table ( l = 14, min = 10 )..
   - RepeatScout: Running RepeatScout.. : 187 raw families identified
   - RepeatScout: Running filtering stage.. 170 families remaining
   - RepeatScout: 00:08:09 (hh:mm:ss) Elapsed Time
   - Collecting repeat instances...
   - Refining 165 families... 00:03:09 (hh:mm:ss) Elapsed Time
   - Redundant Families and Large Satellite Filtering.. : 3 satellite(s), 71 contained, found in 00:00:02 (hh:mm:ss) Elapsed Time
Family Refinement: 00:00:03 (hh:mm:ss) Elapsed Time
Round Time: 00:12:23 (hh:mm:ss) Elapsed Time : 91 families discovered.

RepeatModeler Round # 2
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 10000000 bp
   - Sequence extraction : 00:00:14 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:02:13 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       10886 repeats masked totaling 2698478 bp(s).
   - TE Masking time 00:00:08 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 10000159 bp
       Num Contigs Represented = 22
       Non ambiguous bp:
             Initial: 10000159 bp
             After Masking: 6519936 bp
             Masked: 34.80 %
 -- Input Database Coverage: 10000159 bp out of 2947895164 bp ( 0.34 % )
Sampling Time: 00:02:37 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 31125
Comparison Time: 00:07:07 (hh:mm:ss) Elapsed Time, 20735 HSPs Collected
Number of families returned by RECON: 1268
Round Time: 00:10:26 (hh:mm:ss) Elapsed Time : 21 families discovered.

RepeatModeler Round # 3
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 30000000 bp
   - Sequence extraction : 00:00:42 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:05:40 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       48825 repeats masked totaling 9123304 bp(s).
   - TE Masking time 00:00:25 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 30000737 bp
       Num Contigs Represented = 23
       Non ambiguous bp:
             Initial: 30000737 bp
             After Masking: 18511341 bp
             Masked: 38.30 %
 -- Input Database Coverage: 40000896 bp out of 2947895164 bp ( 1.36 % )
Sampling Time: 00:06:51 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 280875
Comparison Time: 00:33:49 (hh:mm:ss) Elapsed Time, 44738 HSPs Collected
Number of families returned by RECON: 2237
Round Time: 01:13:41 (hh:mm:ss) Elapsed Time : 87 families discovered.

RepeatModeler Round # 4
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 90000000 bp
   - Sequence extraction : 00:02:08 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:16:29 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       133037 repeats masked totaling 32725909 bp(s).
   - TE Masking time 00:02:01 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 90322193 bp
       Num Contigs Represented = 23
       Non ambiguous bp:
             Initial: 90002171 bp
             After Masking: 50597095 bp
             Masked: 43.78 %
 -- Input Database Coverage: 130323089 bp out of 2947895164 bp ( 4.42 % )
Sampling Time: 00:20:49 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 2548153
Comparison Time: 03:01:22 (hh:mm:ss) Elapsed Time, 83594 HSPs Collected
Number of families returned by RECON: 5394
Round Time: 03:24:59 (hh:mm:ss) Elapsed Time : 150 families discovered.

RepeatModeler Round # 5
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 270000000 bp
   - Sequence extraction : 00:05:39 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:48:38 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       434797 repeats masked totaling 105909650 bp(s).
   - TE Masking time 00:08:59 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 270835613 bp
       Num Contigs Represented = 23
       Non ambiguous bp:
             Initial: 270035554 bp
             After Masking: 144543952 bp
             Masked: 46.47 %
 -- Input Database Coverage: 401158702 bp out of 2947895164 bp ( 13.61 % )
Sampling Time: 01:03:50 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 22919835
Comparison Time: 23:09:17 (hh:mm:ss) Elapsed Time, 220261 HSPs Collected
Number of families returned by RECON: 22453
Round Time: 24:32:36 (hh:mm:ss) Elapsed Time : 328 families discovered.

RepeatScout/RECON discovery complete: 677 families found
#
# RepeatClassifier
#
# Version 2.0.7
# Threads: 32
# Current Working Directory: /dev/shm/rModeler.bL0vrQ/RM_1163731.TueSep21230072025
# Protein Library: /hive/data/outside/RepeatMasker/RepeatMasker-4.2.1/Libraries/RepeatPeps.lib
#   - 18011 proteins
# Consensi Library: /hive/data/outside/RepeatMasker/RepeatMasker-4.2.1/Libraries/RepeatMasker.lib
#   - 26292 consensus sequences
  - Looking for simple/tandem and low complexity sequences..
  - Looking for similarity to known repeat proteins..
  - Looking for similarity to known repeat consensi..
Classification Time: 00:04:20 (hh:mm:ss) Elapsed Time
Program Time: 29:38:25 (hh:mm:ss) Elapsed Time