Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 20
Rules claiming more threads will be scaled down.
Job counts:
    count   jobs
    1       all
    1       annotate_all_sjs
    2       annotate_study_sjs
    2       collect_merged_sjs
    1       collect_per_study_sjs
    1       collect_study_merged_sjs
    2       extract_motifs_for_sjs
    2       filter_sjs
    1       final_recount_output
    1       find_sjs
    1       merge_all_sjs
    1       merge_per_study_sjs
    4       merge_sjs
    2       merge_study_sjs
    2       mmformat_sjs
    24

[Mon Dec 14 21:03:23 2020]
rule find_sjs:
    input: links, ids.tsv
    output: unified_jxs/sj.groups.manifest
    jobid: 23

/storage2/cwilks/recount-unify/scripts/find_new_files.sh links ids.tsv unified_jxs sj "*.zst" per_study
[Mon Dec 14 21:03:33 2020]
Finished job 23.
1 of 24 steps (4%) done

[Mon Dec 14 21:03:33 2020]
rule filter_sjs:
    input: unified_jxs/sj.groups.manifest
    output: unified_jxs/sj.lieber_phase2_hippo.00.manifest.filtered
    jobid: 21
    wildcards: study=lieber_phase2_hippo, run_group_num=00

/storage2/cwilks/recount-unify/scripts/filter_new_sjs.sh unified_jxs/sj.lieber_phase2_hippo.00.manifest -1

[Mon Dec 14 21:03:33 2020]
rule filter_sjs:
    input: unified_jxs/sj.groups.manifest
    output: unified_jxs/sj.lieber_phase2_hippo.08.manifest.filtered
    jobid: 22
    wildcards: study=lieber_phase2_hippo, run_group_num=08

/storage2/cwilks/recount-unify/scripts/filter_new_sjs.sh unified_jxs/sj.lieber_phase2_hippo.08.manifest -1
[Mon Dec 14 21:03:42 2020]
Finished job 22.
2 of 24 steps (8%) done

[Mon Dec 14 21:03:42 2020]
rule merge_sjs:
    input: unified_jxs/sj.lieber_phase2_hippo.08.manifest.filtered
    output: unified_jxs/sj.lieber_phase2_hippo.08.unique.merged
    jobid: 20
    wildcards: study=lieber_phase2_hippo, run_group_num=08, type=unique
    threads: 8

pypy /storage2/cwilks/recount-unify/merge/merge.py --list-file unified_jxs/sj.lieber_phase2_hippo.08.manifest --coverage-col 3 > unified_jxs/sj.lieber_phase2_hippo.08.unique.merged

[Mon Dec 14 21:03:42 2020]
rule merge_sjs:
    input: unified_jxs/sj.lieber_phase2_hippo.08.manifest.filtered
    output: unified_jxs/sj.lieber_phase2_hippo.08.all.merged
    jobid: 18
    wildcards: study=lieber_phase2_hippo, run_group_num=08, type=all
    threads: 8

pypy /storage2/cwilks/recount-unify/merge/merge.py --list-file unified_jxs/sj.lieber_phase2_hippo.08.manifest --coverage-col 4 > unified_jxs/sj.lieber_phase2_hippo.08.all.merged
[Mon Dec 14 21:03:50 2020]
Finished job 20.
3 of 24 steps (12%) done
[Mon Dec 14 21:03:50 2020]
Finished job 18.
4 of 24 steps (17%) done
[Mon Dec 14 21:14:34 2020]
Finished job 21.
5 of 24 steps (21%) done

[Mon Dec 14 21:14:34 2020]
rule merge_sjs:
    input: unified_jxs/sj.lieber_phase2_hippo.00.manifest.filtered
    output: unified_jxs/sj.lieber_phase2_hippo.00.all.merged
    jobid: 17
    wildcards: study=lieber_phase2_hippo, run_group_num=00, type=all
    threads: 8

pypy /storage2/cwilks/recount-unify/merge/merge.py --list-file unified_jxs/sj.lieber_phase2_hippo.00.manifest --coverage-col 4 > unified_jxs/sj.lieber_phase2_hippo.00.all.merged

[Mon Dec 14 21:14:34 2020]
rule merge_sjs:
    input: unified_jxs/sj.lieber_phase2_hippo.00.manifest.filtered
    output: unified_jxs/sj.lieber_phase2_hippo.00.unique.merged
    jobid: 19
    wildcards: study=lieber_phase2_hippo, run_group_num=00, type=unique
    threads: 8

pypy /storage2/cwilks/recount-unify/merge/merge.py --list-file unified_jxs/sj.lieber_phase2_hippo.00.manifest --coverage-col 3 > unified_jxs/sj.lieber_phase2_hippo.00.unique.merged
[Mon Dec 14 21:38:15 2020]
Finished job 19.
6 of 24 steps (25%) done

[Mon Dec 14 21:38:15 2020]
rule collect_merged_sjs:
    input: unified_jxs/sj.lieber_phase2_hippo.00.unique.merged, unified_jxs/sj.lieber_phase2_hippo.08.unique.merged
    output: unified_jxs/sj.lieber_phase2_hippo.unique.groups.merged.files.list
    jobid: 16
    wildcards: study=lieber_phase2_hippo, type=unique

ls unified_jxs/sj.lieber_phase2_hippo.??.unique.merged > unified_jxs/sj.lieber_phase2_hippo.unique.groups.merged.files.list
[Mon Dec 14 21:38:21 2020]
Finished job 16.
7 of 24 steps (29%) done

[Mon Dec 14 21:38:21 2020]
rule merge_study_sjs:
    input: unified_jxs/sj.lieber_phase2_hippo.unique.groups.merged.files.list
    output: unified_jxs/all.lieber_phase2_hippo.unique.sjs.merged
    jobid: 14
    wildcards: study=lieber_phase2_hippo, type=unique
    threads: 8

pypy /storage2/cwilks/recount-unify/merge/merge.py --list-file unified_jxs/sj.lieber_phase2_hippo.unique.groups.merged.files.list --append-samples > unified_jxs/all.lieber_phase2_hippo.unique.sjs.merged
[Mon Dec 14 21:38:35 2020]
Finished job 14.
8 of 24 steps (33%) done

[Mon Dec 14 21:38:35 2020]
rule extract_motifs_for_sjs:
    input: unified_jxs/all.lieber_phase2_hippo.unique.sjs.merged, /storage2/cwilks/recount-pump-refs/hg38_unify/recount_pump.chr_sizes.tsv, /storage2/cwilks/recount-pump-refs/hg38_unify/recount_pump.fa
    output: unified_jxs/all.lieber_phase2_hippo.unique.sjs.merged.motifs
    jobid: 12
    wildcards: study=lieber_phase2_hippo, type=unique

cat unified_jxs/all.lieber_phase2_hippo.unique.sjs.merged | /storage2/cwilks/recount-unify/merge/perbase -c /storage2/cwilks/recount-pump-refs/hg38_unify/recount_pump.chr_sizes.tsv -g /storage2/cwilks/recount-pump-refs/hg38_unify/recount_pump.fa -f /storage2/cwilks/recount-pump-refs/hg38_unify/recount_pump.chr_sizes.tsv > unified_jxs/all.lieber_phase2_hippo.unique.sjs.merged.motifs 2>unified_jxs/all.lieber_phase2_hippo.unique.sjs.merged.motifs.errs
[Mon Dec 14 21:38:53 2020]
Finished job 12.
9 of 24 steps (38%) done

[Mon Dec 14 21:38:53 2020]
rule annotate_study_sjs:
    input: unified_jxs/all.lieber_phase2_hippo.unique.sjs.merged.motifs
    output: junction_counts_per_study/lieber_phase2_hippo.unique.sj.merged.motifs.annotated
    jobid: 9
    wildcards: study=lieber_phase2_hippo, type=unique

cat unified_jxs/all.lieber_phase2_hippo.unique.sjs.merged.motifs | pypy /storage2/cwilks/recount-unify/annotate/annotate_sjs.py --compiled-annotations /storage2/cwilks/recount-pump-refs/hg38_unify/annotated_junctions.tsv.gz --compilation-id 0 | cut -f 2- > junction_counts_per_study/lieber_phase2_hippo.unique.sj.merged.motifs.annotated
[Mon Dec 14 21:40:01 2020]
Finished job 9.
10 of 24 steps (42%) done

[Mon Dec 14 21:40:01 2020]
rule mmformat_sjs:
    input: junction_counts_per_study/lieber_phase2_hippo.unique.sj.merged.motifs.annotated
    output: junction_counts_per_study/sra.junctions.lieber_phase2_hippo.unique.MM.gz, junction_counts_per_study/sra.junctions.lieber_phase2_hippo.unique.RR.gz, junction_counts_per_study/sra.junctions.lieber_phase2_hippo.unique.ID.gz
    jobid: 6
    wildcards: study=lieber_phase2_hippo, type=unique
    threads: 2

if [[ -s junction_counts_per_study/lieber_phase2_hippo.unique.sj.merged.motifs.annotated ]]; then
    cut -f 11 junction_counts_per_study/lieber_phase2_hippo.unique.sj.merged.motifs.annotated | tr , \\n | fgrep ':' | cut -d':' -f1 | sort -nu > junction_counts_per_study/lieber_phase2_hippo.unique.sj.merged.motifs.annotated.sids
    num_samples=`cat junction_counts_per_study/lieber_phase2_hippo.unique.sj.merged.motifs.annotated.sids | wc -l`
    cat junction_counts_per_study/lieber_phase2_hippo.unique.sj.merged.motifs.annotated | /storage2/cwilks/recount-unify/scripts/mmformat -n ${num_samples} -p "lieber_phase2_hippo.unique" -s junction_counts_per_study/lieber_phase2_hippo.unique.sj.merged.motifs.annotated.sids > lieber_phase2_hippo.unique.RR 2> junction_counts_per_study/sra.junctions.lieber_phase2_hippo.unique.MM.gz.run
    mv lieber_phase2_hippo.unique.mm lieber_phase2_hippo.unique.MM
    cat lieber_phase2_hippo.unique.MM | pigz --fast -p 2 > junction_counts_per_study/sra.junctions.lieber_phase2_hippo.unique.MM.gz
    cat lieber_phase2_hippo.unique.RR | pigz --fast -p 2 > junction_counts_per_study/sra.junctions.lieber_phase2_hippo.unique.RR.gz
    cat <(echo "rail_id") junction_counts_per_study/lieber_phase2_hippo.unique.sj.merged.motifs.annotated.sids | pigz --fast -p 2 > junction_counts_per_study/sra.junctions.lieber_phase2_hippo.unique.ID.gz
else
    touch junction_counts_per_study/sra.junctions.lieber_phase2_hippo.unique.MM.gz junction_counts_per_study/sra.junctions.lieber_phase2_hippo.unique.RR.gz
fi
[Mon Dec 14 21:40:27 2020]
Finished job 17.
11 of 24 steps (46%) done

[Mon Dec 14 21:40:27 2020]
rule collect_merged_sjs:
    input: unified_jxs/sj.lieber_phase2_hippo.00.all.merged, unified_jxs/sj.lieber_phase2_hippo.08.all.merged
    output: unified_jxs/sj.lieber_phase2_hippo.all.groups.merged.files.list
    jobid: 15
    wildcards: study=lieber_phase2_hippo, type=all

ls unified_jxs/sj.lieber_phase2_hippo.??.all.merged > unified_jxs/sj.lieber_phase2_hippo.all.groups.merged.files.list
[Mon Dec 14 21:40:27 2020]
Finished job 15.
12 of 24 steps (50%) done

[Mon Dec 14 21:40:27 2020]
rule merge_study_sjs:
    input: unified_jxs/sj.lieber_phase2_hippo.all.groups.merged.files.list
    output: unified_jxs/all.lieber_phase2_hippo.all.sjs.merged
    jobid: 13
    wildcards: study=lieber_phase2_hippo, type=all
    threads: 8

pypy /storage2/cwilks/recount-unify/merge/merge.py --list-file unified_jxs/sj.lieber_phase2_hippo.all.groups.merged.files.list --append-samples > unified_jxs/all.lieber_phase2_hippo.all.sjs.merged
[Mon Dec 14 21:40:40 2020]
Finished job 13.
13 of 24 steps (54%) done

[Mon Dec 14 21:40:40 2020]
rule extract_motifs_for_sjs:
    input: unified_jxs/all.lieber_phase2_hippo.all.sjs.merged, /storage2/cwilks/recount-pump-refs/hg38_unify/recount_pump.chr_sizes.tsv, /storage2/cwilks/recount-pump-refs/hg38_unify/recount_pump.fa
    output: unified_jxs/all.lieber_phase2_hippo.all.sjs.merged.motifs
    jobid: 11
    wildcards: study=lieber_phase2_hippo, type=all

cat unified_jxs/all.lieber_phase2_hippo.all.sjs.merged | /storage2/cwilks/recount-unify/merge/perbase -c /storage2/cwilks/recount-pump-refs/hg38_unify/recount_pump.chr_sizes.tsv -g /storage2/cwilks/recount-pump-refs/hg38_unify/recount_pump.fa -f /storage2/cwilks/recount-pump-refs/hg38_unify/recount_pump.chr_sizes.tsv > unified_jxs/all.lieber_phase2_hippo.all.sjs.merged.motifs 2>unified_jxs/all.lieber_phase2_hippo.all.sjs.merged.motifs.errs
[Mon Dec 14 21:41:00 2020]
Finished job 11.
14 of 24 steps (58%) done

[Mon Dec 14 21:41:00 2020]
rule collect_per_study_sjs:
    input: unified_jxs/all.lieber_phase2_hippo.all.sjs.merged.motifs
    output: unified_jxs/sj.po.manifest
    jobid: 10
    wildcards: study_low_order=po

ls unified_jxs/all.*po.all.sjs.merged.motifs > unified_jxs/sj.po.manifest

[Mon Dec 14 21:41:00 2020]
rule annotate_study_sjs:
    input: unified_jxs/all.lieber_phase2_hippo.all.sjs.merged.motifs
    output: junction_counts_per_study/lieber_phase2_hippo.all.sj.merged.motifs.annotated
    jobid: 8
    wildcards: study=lieber_phase2_hippo, type=all

cat unified_jxs/all.lieber_phase2_hippo.all.sjs.merged.motifs | pypy /storage2/cwilks/recount-unify/annotate/annotate_sjs.py --compiled-annotations /storage2/cwilks/recount-pump-refs/hg38_unify/annotated_junctions.tsv.gz --compilation-id 0 | cut -f 2- > junction_counts_per_study/lieber_phase2_hippo.all.sj.merged.motifs.annotated
[Mon Dec 14 21:41:00 2020]
Finished job 10.
15 of 24 steps (62%) done

[Mon Dec 14 21:41:00 2020]
rule merge_per_study_sjs:
    input: unified_jxs/sj.po.manifest
    output: unified_jxs/sj.po.motifs.merged.tsv
    jobid: 7
    wildcards: study_low_order=po
    threads: 8

pypy /storage2/cwilks/recount-unify/merge/merge.py --list-file unified_jxs/sj.po.manifest --motif-correction 6 --append-samples > unified_jxs/sj.po.motifs.merged.tsv
[Mon Dec 14 21:41:16 2020]
Finished job 7.
16 of 24 steps (67%) done

[Mon Dec 14 21:41:16 2020]
rule collect_study_merged_sjs:
    input: unified_jxs/sj.po.motifs.merged.tsv
    output: unified_jxs/sj.all.merged.files.list
    jobid: 4

ls unified_jxs/sj.??.motifs.merged.tsv > unified_jxs/sj.all.merged.files.list
[Mon Dec 14 21:41:17 2020]
Finished job 4.
17 of 24 steps (71%) done

[Mon Dec 14 21:41:17 2020]
rule merge_all_sjs:
    input: unified_jxs/sj.all.merged.files.list
    output: all.sjs.motifs.merged.tsv
    jobid: 1

pypy /storage2/cwilks/recount-unify/merge/merge.py --list-file unified_jxs/sj.all.merged.files.list --append-samples --existing-sj-db "" > all.sjs.motifs.merged.tsv
[Mon Dec 14 21:41:31 2020]
Finished job 1.
18 of 24 steps (75%) done

[Mon Dec 14 21:41:31 2020]
rule annotate_all_sjs:
    input: all.sjs.motifs.merged.tsv
    output: junctions.bgz, junctions.bgz.tbi, junctions.sqlite, samples.tsv, lucene_indexed_numeric_types.tsv, lucene_full_standard, lucene_full_ws, samples.fields.tsv
    jobid: 2
    threads: 7

if [[ "True" == "True" ]]; then
    rm -f jx_sqlite_import
    mkfifo jx_sqlite_import
    sqlite3 junctions.sqlite < /storage2/cwilks/recount-unify/snaptron/deploy/snaptron_schema.sql
    cat all.sjs.motifs.merged.tsv | pypy /storage2/cwilks/recount-unify/annotate/annotate_sjs.py --compiled-annotations /storage2/cwilks/recount-pump-refs/hg38_unify/annotated_junctions.tsv.gz --motif-correct --compilation-id 102 | tee jx_sqlite_import | bgzip -@ 7 > junctions.bgz &
    sqlite3 junctions.sqlite -cmd '.separator " "' ".import ./jx_sqlite_import intron"
    sqlite3 junctions.sqlite < /storage2/cwilks/recount-unify/snaptron/deploy/snaptron_schema_index.sql
else
    cat all.sjs.motifs.merged.tsv | pypy /storage2/cwilks/recount-unify/annotate/annotate_sjs.py --compiled-annotations /storage2/cwilks/recount-pump-refs/hg38_unify/annotated_junctions.tsv.gz --motif-correct --compilation-id 102 | bgzip -@ 7 > junctions.bgz
    touch junctions.sqlite
fi
tabix -s2 -b3 -e4 junctions.bgz
if [[ "False" == "True" ]]; then
    /storage2/cwilks/recount-unify/scripts/join_railID_to_sample_metadata.sh ids.tsv > samples.tsv
    /storage2/cwilks/recount-unify/snaptron/deploy/build_lucene_indexes.sh samples.tsv all
else
    touch samples.tsv lucene_indexed_numeric_types.tsv samples.fields.tsv
    mkdir -p lucene_full_standard lucene_full_ws
fi
[Mon Dec 14 21:42:10 2020]
Finished job 8.
19 of 24 steps (79%) done

[Mon Dec 14 21:42:10 2020]
rule mmformat_sjs:
    input: junction_counts_per_study/lieber_phase2_hippo.all.sj.merged.motifs.annotated
    output: junction_counts_per_study/sra.junctions.lieber_phase2_hippo.all.MM.gz, junction_counts_per_study/sra.junctions.lieber_phase2_hippo.all.RR.gz, junction_counts_per_study/sra.junctions.lieber_phase2_hippo.all.ID.gz
    jobid: 5
    wildcards: study=lieber_phase2_hippo, type=all
    threads: 2

if [[ -s junction_counts_per_study/lieber_phase2_hippo.all.sj.merged.motifs.annotated ]]; then
    cut -f 11 junction_counts_per_study/lieber_phase2_hippo.all.sj.merged.motifs.annotated | tr , \\n | fgrep ':' | cut -d':' -f1 | sort -nu > junction_counts_per_study/lieber_phase2_hippo.all.sj.merged.motifs.annotated.sids
    num_samples=`cat junction_counts_per_study/lieber_phase2_hippo.all.sj.merged.motifs.annotated.sids | wc -l`
    cat junction_counts_per_study/lieber_phase2_hippo.all.sj.merged.motifs.annotated | /storage2/cwilks/recount-unify/scripts/mmformat -n ${num_samples} -p "lieber_phase2_hippo.all" -s junction_counts_per_study/lieber_phase2_hippo.all.sj.merged.motifs.annotated.sids > lieber_phase2_hippo.all.RR 2> junction_counts_per_study/sra.junctions.lieber_phase2_hippo.all.MM.gz.run
    mv lieber_phase2_hippo.all.mm lieber_phase2_hippo.all.MM
    cat lieber_phase2_hippo.all.MM | pigz --fast -p 2 > junction_counts_per_study/sra.junctions.lieber_phase2_hippo.all.MM.gz
    cat lieber_phase2_hippo.all.RR | pigz --fast -p 2 > junction_counts_per_study/sra.junctions.lieber_phase2_hippo.all.RR.gz
    cat <(echo "rail_id") junction_counts_per_study/lieber_phase2_hippo.all.sj.merged.motifs.annotated.sids | pigz --fast -p 2 > junction_counts_per_study/sra.junctions.lieber_phase2_hippo.all.ID.gz
else
    touch junction_counts_per_study/sra.junctions.lieber_phase2_hippo.all.MM.gz junction_counts_per_study/sra.junctions.lieber_phase2_hippo.all.RR.gz
fi
[Mon Dec 14 21:43:24 2020]
Finished job 2.
20 of 24 steps (83%) done
[Mon Dec 14 21:46:36 2020]
Finished job 6.
21 of 24 steps (88%) done
[Mon Dec 14 21:48:27 2020]
Finished job 5.
22 of 24 steps (92%) done

[Mon Dec 14 21:48:27 2020]
rule final_recount_output:
    input: junction_counts_per_study/sra.junctions.lieber_phase2_hippo.all.MM.gz, junction_counts_per_study/sra.junctions.lieber_phase2_hippo.all.RR.gz, junction_counts_per_study/sra.junctions.lieber_phase2_hippo.all.ID.gz, junction_counts_per_study/sra.junctions.lieber_phase2_hippo.unique.MM.gz, junction_counts_per_study/sra.junctions.lieber_phase2_hippo.unique.RR.gz, junction_counts_per_study/sra.junctions.lieber_phase2_hippo.unique.ID.gz
    output: junction_counts_per_study/po/lieber_phase2_hippo/sra.junctions.lieber_phase2_hippo.all.MM.gz, junction_counts_per_study/po/lieber_phase2_hippo/sra.junctions.lieber_phase2_hippo.all.RR.gz, junction_counts_per_study/po/lieber_phase2_hippo/sra.junctions.lieber_phase2_hippo.all.ID.gz, junction_counts_per_study/po/lieber_phase2_hippo/sra.junctions.lieber_phase2_hippo.unique.MM.gz, junction_counts_per_study/po/lieber_phase2_hippo/sra.junctions.lieber_phase2_hippo.unique.RR.gz, junction_counts_per_study/po/lieber_phase2_hippo/sra.junctions.lieber_phase2_hippo.unique.ID.gz
    jobid: 3

Job counts:
    count   jobs
    1       final_recount_output
    1
[Mon Dec 14 21:48:28 2020]
Finished job 3.
23 of 24 steps (96%) done

[Mon Dec 14 21:48:28 2020]
localrule all:
    input: all.sjs.motifs.merged.tsv, junctions.bgz, junctions.bgz.tbi, junction_counts_per_study/po/lieber_phase2_hippo/sra.junctions.lieber_phase2_hippo.all.MM.gz, junction_counts_per_study/po/lieber_phase2_hippo/sra.junctions.lieber_phase2_hippo.all.RR.gz, junction_counts_per_study/po/lieber_phase2_hippo/sra.junctions.lieber_phase2_hippo.all.ID.gz, junction_counts_per_study/po/lieber_phase2_hippo/sra.junctions.lieber_phase2_hippo.unique.MM.gz, junction_counts_per_study/po/lieber_phase2_hippo/sra.junctions.lieber_phase2_hippo.unique.RR.gz, junction_counts_per_study/po/lieber_phase2_hippo/sra.junctions.lieber_phase2_hippo.unique.ID.gz, junctions.sqlite
    jobid: 0

[Mon Dec 14 21:48:28 2020]
Finished job 0.
24 of 24 steps (100%) done
Complete log: /storage2/cwilks/recount-unify/lieber_phase2_hippo/.snakemake/log/2020-12-14T210323.295933.snakemake.log
    Command being timed: "snakemake -j 20 --stats ./perstudy.jxs.stats.json --snakefile /storage2/cwilks/recount-unify/Snakefile.study_jxs -p --config input=links staging=unified_jxs annotated_sjs=/storage2/cwilks/recount-pump-refs/hg38_unify/annotated_junctions.tsv.gz ref_sizes=/storage2/cwilks/recount-pump-refs/hg38_unify/recount_pump.chr_sizes.tsv ref_fasta=/storage2/cwilks/recount-pump-refs/hg38_unify/recount_pump.fa sample_ids_file=ids.tsv compilation_id=102 build_sqlitedb=1 compilation=sra study_dir=junction_counts_per_study"
    User time (seconds): 4986.84
    System time (seconds): 247.63
    Percent of CPU this job got: 193%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 45:05.58
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 3029416
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 25046130
    Voluntary context switches: 5417101
    Involuntary context switches: 11491
    Swaps: 0
    File system inputs: 160
    File system outputs: 70335632
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0