RepeatModeler Version 2.0.7
===========================
Using output directory = /dev/shm/rModeler.4GeImV/RM_1518432.FriSep122043142025
Search Engine = rmblast 2.14.1+
Threads = 32
Dependencies: TRF 4.09, RECON , RepeatScout 1.0.7, RepeatMasker 4.2.1, RepeatAfterMe 0.0.7
LTR Structural Analysis: Disabled [use -LTRStruct to enable]
Random Number Seed: 1757734994
Database = /dev/shm/rModeler.4GeImV/GCA_051397795.1_GA15_6
  - Sequences = 947
  - Bases = 40614788
  - N50 = 122852
  - Contig Histogram:
  Size(bp)                                                           Count
  -----------------------------------------------------------------------
  407308-436366 |                                                   [ 3 ]
  378250-407307 |                                                   [ 2 ]
  349193-378250 |                                                   [ 1 ]
  320135-349192 |                                                   [ 1 ]
  291078-320135 |                                                   [ 1 ]
  262020-291077 |                                                   [ 3 ]
  232962-262019 |                                                   [ 5 ]
  203905-232962 |*                                                  [ 15 ]
  174847-203904 |*                                                  [ 16 ]
  145790-174847 |**                                                 [ 33 ]
  116732-145789 |**                                                 [ 34 ]
  87674-116731  |****                                               [ 47 ]
  58617-87674   |******                                             [ 79 ]
  29559-58616   |**********                                         [ 126 ]
  502-29559     |************************************************** [ 581 ]
Storage Throughput = excellent ( 1209.00 MB/s )

RepeatModeler Round # 1
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 40000000 bp
   - Final Sample Size = 40039111 bp ( 40037255 non ambiguous )
   - Num Contigs Represented = 943
   - Sequence extraction : 00:00:02 (hh:mm:ss) Elapsed Time
 -- Running RepeatScout on the sequences...
   - RepeatScout: Running build_lmer_table ( l = 14, min = 10 )..
   - RepeatScout: Running RepeatScout.. : 44 raw families identified
   - RepeatScout: Running filtering stage.. 43 families remaining
   - RepeatScout: 00:00:45 (hh:mm:ss) Elapsed Time
   - Collecting repeat instances...
   - Refining 41 families...
     00:00:17 (hh:mm:ss) Elapsed Time
   - Redundant Families and Large Satellite Filtering.. : 0 satellite(s), 8 contained, found in 00:00:01 (hh:mm:ss) Elapsed Time
Family Refinement: 00:00:01 (hh:mm:ss) Elapsed Time
Round Time: 00:01:06 (hh:mm:ss) Elapsed Time : 33 families discovered.
RepeatModeler Round # 2
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 10000000 bp
   - Sequence extraction : 00:00:01 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:00:05 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       419 repeats masked totaling 123020 bp(s).
   - TE Masking time 00:00:02 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 10028347 bp
       Num Contigs Represented = 323
       Non ambiguous bp:
             Initial: 10027903 bp
             After Masking: 9879840 bp
             Masked: 1.48 %
 -- Input Database Coverage: 10028347 bp out of 40614788 bp ( 24.69 % )
Sampling Time: 00:00:09 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 86320
Comparison Time: 00:04:31 (hh:mm:ss) Elapsed Time, 1911 HSPs Collected
Number of families returned by RECON: 785
Round Time: 00:04:47 (hh:mm:ss) Elapsed Time : 1 families discovered.

RepeatModeler Round # 3
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 30000000 bp
   - Sequence extraction : 00:00:01 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:00:16 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       1204 repeats masked totaling 389770 bp(s).
   - TE Masking time 00:00:05 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 30010694 bp
       Num Contigs Represented = 785
       Non ambiguous bp:
             Initial: 30009282 bp
             After Masking: 29533359 bp
             Masked: 1.59 %
 -- Input Database Coverage: 40039041 bp out of 40614788 bp ( 98.58 % )
Sampling Time: 00:00:23 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 769420
Comparison Time: 00:18:04 (hh:mm:ss) Elapsed Time, 18217 HSPs Collected
Number of families returned by RECON: 4156
Round Time: 00:18:56 (hh:mm:ss) Elapsed Time : 20 families discovered.
  - Increasing sample size to include end piece now = 90586441

RepeatModeler Round # 4
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 90586441 bp
   - Sequence extraction : 00:00:01 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:00:00 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       20 repeats masked totaling 2471 bp(s).
   - TE Masking time 00:00:01 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 575540 bp
       Num Contigs Represented = 18
       Non ambiguous bp:
             Initial: 575493 bp
             After Masking: 572419 bp
             Masked: 0.53 %
 -- Input Database Coverage: 40614581 bp out of 40614788 bp ( 100.00 % )
Sampling Time: 00:00:02 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 171
Comparison Time: 00:00:12 (hh:mm:ss) Elapsed Time, 2 HSPs Collected
Round Time: 00:00:14 (hh:mm:ss) Elapsed Time : 0 families discovered.

RepeatScout/RECON discovery complete: 54 families found

#
# RepeatClassifier
#
# Version 2.0.7
# Threads: 32
# Current Working Directory: /dev/shm/rModeler.4GeImV/RM_1518432.FriSep122043142025
# Protein Library: /hive/data/outside/RepeatMasker/RepeatMasker-4.2.1/Libraries/RepeatPeps.lib
#   - 18011 proteins
# Consensi Library: /hive/data/outside/RepeatMasker/RepeatMasker-4.2.1/Libraries/RepeatMasker.lib
#   - 26292 consensus sequences
  - Looking for simple/tandem and low complexity sequences..
  - Looking for similarity to known repeat proteins..
  - Looking for similarity to known repeat consensi..
Classification Time: 00:00:10 (hh:mm:ss) Elapsed Time
Program Time: 00:25:13 (hh:mm:ss) Elapsed Time