RepeatModeler Version 2.0.4
===========================
Using output directory = /dev/shm/rModeler.gwjrmo/RM_29819.MonSep161947282024
Search Engine = rmblast 2.13.0+
Threads = 32
Dependencies: TRF 4.09, RECON , RepeatScout 1.0.6, RepeatMasker 4.1.4
LTR Structural Analysis: Disabled [use -LTRStruct to enable]
Random Number Seed: 1726541247
Database = /dev/shm/rModeler.gwjrmo/GCA_000365475.1_EHA_CA_v1
  - Sequences = 1685
  - Bases = 12292336
  - N50 = 9683
  - Contig Histogram:
  Size(bp)                                                        Count
  -----------------------------------------------------------------------
  84096-90019 |                                                   [ 1 ]
  78173-84095 |                                                   [ ]
  72250-78172 |                                                   [ ]
  66328-72250 |                                                   [ ]
  60405-66327 |                                                   [ ]
  54482-60404 |                                                   [ 1 ]
  48559-54481 |                                                   [ ]
  42637-48559 |                                                   [ 2 ]
  36714-42636 |                                                   [ 2 ]
  30791-36713 |                                                   [ 12 ]
  24868-30790 |*                                                  [ 24 ]
  18946-24868 |**                                                 [ 49 ]
  13023-18945 |******                                             [ 131 ]
  7100-13022  |*****************                                  [ 374 ]
  1178-7100   |************************************************** [ 1089 ]

WARN: The N50 for this assembly is low ( <10,000 ). The de novo methods
      employed by RepeatModeler are intended for use with long contiguous
      sequences and may not perform well with an over-abundance of short
      contigs in the database.

Storage Throughput = excellent ( 1086.15 MB/s )

RepeatModeler Round # 1
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 40000000 bp
   - Final Sample Size = 12292282 bp ( 12285355 non ambiguous )
   - Num Contigs Represented = 1685
   - Sequence extraction : 00:00:02 (hh:mm:ss) Elapsed Time
 -- Running RepeatScout on the sequences...
   - RepeatScout: 00:10:36 (hh:mm:ss) Elapsed Time
Round Time: 00:11:26 (hh:mm:ss) Elapsed Time : 18 families discovered.

RepeatModeler Round # 2
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 10000000 bp
   - Sequence extraction : 00:00:01 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:00:48 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       623 repeats masked totaling 107306 bp(s).
   - TE Masking time 00:00:03 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 10013889 bp
       Num Contigs Represented = 1370
       Non ambiguous bp:
             Initial: 10006962 bp
             After Masking: 9873954 bp
             Masked: 1.33 %
 -- Input Database Coverage: 10013889 bp out of 12292336 bp ( 81.46 % )
Sampling Time: 00:00:54 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 940506
Comparison Time: 00:28:09 (hh:mm:ss) Elapsed Time, 5629 HSPs Collected
Number of families returned by RECON: 1053
Round Time: 00:29:50 (hh:mm:ss) Elapsed Time : 5 families discovered.
  - Increasing sample size to include end piece now = 32292336

RepeatModeler Round # 3
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 32292336 bp
   - Sequence extraction : 00:00:00 (hh:mm:ss) Elapsed Time
 -- Running TRFMask on the sequence...
   - TRFMask time 00:00:11 (hh:mm:ss) Elapsed Time
 -- Masking repeats from the previous rounds...
       230 repeats masked totaling 51691 bp(s).
   - TE Masking time 00:00:06 (hh:mm:ss) Elapsed Time
 -- Sample Stats:
       Sample Size 2278380 bp
       Num Contigs Represented = 318
       Non ambiguous bp:
             Initial: 2278380 bp
             After Masking: 2221257 bp
             Masked: 2.51 %
 -- Input Database Coverage: 12292269 bp out of 12292336 bp ( 100.00 % )
Sampling Time: 00:00:17 (hh:mm:ss) Elapsed Time
Running all-by-other comparisons...
  - Total Comparisons = 50403
Comparison Time: 00:02:52 (hh:mm:ss) Elapsed Time, 146 HSPs Collected
Number of families returned by RECON: 97
Round Time: 00:03:09 (hh:mm:ss) Elapsed Time : 0 families discovered.

RepeatScout/RECON discovery complete: 23 families found
Classification Time: 00:00:59 (hh:mm:ss) Elapsed Time
Program Time: 00:45:24 (hh:mm:ss) Elapsed Time