RepeatModeler Version 2.0.4
===========================
Using output directory = /dev/shm/rModeler.xWGQmq/RM_16981.MonAug121347302024
Search Engine = rmblast 2.13.0+
Threads = 32
Dependencies: TRF 4.09, RECON , RepeatScout 1.0.6, RepeatMasker 4.1.4
LTR Structural Analysis: Disabled [use -LTRStruct to enable]
Random Number Seed: 1723495649
Database = /dev/shm/rModeler.xWGQmq/GCA_000180835.1_ASM18083v1
  - Sequences = 41786
  - Bases = 13270809
  - N50 = 442
  - Contig Histogram:
  Size(bp)                                                       Count
  -----------------------------------------------------------------------
  7459-7985 |                                                   [ 1 ]
  6933-7458 |                                                   [  ]
  6408-6933 |                                                   [  ]
  5882-6407 |                                                   [  ]
  5356-5881 |                                                   [ 3 ]
  4831-5356 |                                                   [  ]
  4305-4830 |                                                   [ 2 ]
  3779-4304 |                                                   [ 10 ]
  3254-3779 |                                                   [ 17 ]
  2728-3253 |                                                   [ 42 ]
  2202-2727 |                                                   [ 113 ]
  1677-2202 |                                                   [ 304 ]
  1151-1676 |*                                                  [ 917 ]
  625-1150  |****                                               [ 3212 ]
  100-625   |************************************************** [ 37165 ]

WARN: The N50 for this assembly is low ( <10,000 ). The de novo methods
      employed by RepeatModeler are intended for use with long contiguous
      sequences and may not perform well with an over-abundance of short
      contigs in the database.

Storage Throughput = good ( 916.87 MB/s )

RepeatModeler Round # 1
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 40000000 bp
   - Final Sample Size = 13270676 bp ( 13270676 non ambiguous )
   - Num Contigs Represented = 41785
   - Sequence extraction : 00:00:05 (hh:mm:ss) Elapsed Time
 -- Running RepeatScout on the sequences...
   - RepeatScout: 00:10:57 (hh:mm:ss) Elapsed Time
Round Time: 00:12:25 (hh:mm:ss) Elapsed Time : 44 families discovered.

RepeatModeler Round # 2
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 10000000 bp
   - Sequence extraction : 00:00:04 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:00:27 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       5640 repeats masked totaling 622145 bp(s).
   - TE Masking time 00:00:07 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 10000147 bp
       Num Contigs Represented = 31327
       Non ambiguous bp:
             Initial: 10000147 bp
             After Masking: 9363876 bp
             Masked: 6.36 %
 -- Input Database Coverage: 10000147 bp out of 13270809 bp ( 75.35 % )
Sampling Time: 00:00:41 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
   - Total Comparisons = 490674801
Comparison Time: 01:48:41 (hh:mm:ss) Elapsed Time, 32931 HSPs Collected
Number of families returned by RECON: 2761
Round Time: 01:52:32 (hh:mm:ss) Elapsed Time : 73 families discovered.
 - Increasing sample size to include end piece now = 33270809

RepeatModeler Round # 3
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 33270809 bp
   - Sequence extraction : 00:00:01 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:00:09 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       3143 repeats masked totaling 563415 bp(s).
   - TE Masking time 00:00:06 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 3270528 bp
       Num Contigs Represented = 10458
       Non ambiguous bp:
             Initial: 3270528 bp
             After Masking: 2701576 bp
             Masked: 17.40 %
 -- Input Database Coverage: 13270675 bp out of 13270809 bp ( 100.00 % )
Sampling Time: 00:00:17 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
   - Total Comparisons = 54679653
Comparison Time: 00:32:34 (hh:mm:ss) Elapsed Time, 719 HSPs Collected
Number of families returned by RECON: 397
Round Time: 00:32:53 (hh:mm:ss) Elapsed Time : 0 families discovered.

RepeatScout/RECON discovery complete: 117 families found
Classification Time: 00:04:43 (hh:mm:ss) Elapsed Time
Program Time: 02:42:34 (hh:mm:ss) Elapsed Time