Snakemake workflow: dna-seq-benchmark

A Snakemake workflow for benchmarking variant calling approaches with Genome in a Bottle (GIAB) data (and other custom benchmark datasets). The workflow uses a combination of bedtools, mosdepth, rtg-tools, pandas and datavzrd.

Usage

The usage of this workflow is described in the Snakemake Workflow Catalog.

If you use this workflow in a paper, don't forget to give credit to the authors by citing the URL of this (original) repository and its DOI (see above).

Output

The workflow writes both final deliverables and intermediate files under results/.

Primary result tables

results/fp-fn/callsets/<callset>.{fp|fn}.tsv: aggregated FP/FN tables per callset across coverages
results/fp-fn/benchmarks/<benchmark>.{fp|fn}.tsv: aggregated FP/FN tables per benchmark
results/precision-recall/benchmarks/<benchmark>.<snvs|indels>.<base|vaf-stratified>.tsv: aggregated precision/recall tables per benchmark (optionally stratified by variant allele frequency, VAF)
results/annotated/tsv/<benchmark>/: annotated shared FN tables
results/annotated/tsv/<benchmark>/<callset>.unique_<fp|fn>.annotated.tsv: annotated unique FP/FN tables
results/fp-fn/vcf/: VCFs generated from shared/unique FP/FN tables
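
The path patterns above can be used to collect result tables programmatically. The following is a minimal sketch, assuming the per-callset layout results/fp-fn/callsets/<callset>.{fp|fn}.tsv described above; the function name find_callset_tables is hypothetical and not part of the workflow.

```python
from pathlib import Path


def find_callset_tables(results_dir="results"):
    """Map (callset, kind) to the aggregated FP/FN table for that callset.

    Hypothetical helper: it only relies on the documented file name
    pattern <callset>.{fp|fn}.tsv and does not inspect file contents.
    """
    tables = {}
    callsets_dir = Path(results_dir) / "fp-fn" / "callsets"
    for kind in ("fp", "fn"):
        suffix = f".{kind}.tsv"
        for path in sorted(callsets_dir.glob(f"*{suffix}")):
            # Strip the suffix so callset names containing dots survive.
            callset = path.name[: -len(suffix)]
            tables[(callset, kind)] = path
    return tables


if __name__ == "__main__":
    for (callset, kind), path in find_callset_tables().items():
        print(callset, kind, path)
```

The returned paths can then be passed to any TSV reader, for example pandas.read_csv with sep="\t". The column layout of the tables is not described here, so inspect the header before relying on specific column names.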

Intermediates and automatic cleanup

Raw somatic extraction tables are written to results/intermediate/fp-fn/raw/callsets/.
Several per-coverage and per-callset aggregation inputs are marked as Snakemake temp() outputs and are removed automatically once downstream targets are finished.
If you want to keep all intermediates for debugging, run Snakemake with --notemp.
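
As an illustration of the --notemp option, here is a minimal sketch that assembles a Snakemake command line. The helper name snakemake_command and the core count are assumptions for the example; any further options your setup needs (for instance for software deployment) are not shown.

```python
import shlex


def snakemake_command(cores=4, keep_temp=False):
    """Build a Snakemake command line as a list of arguments.

    Hypothetical helper: keep_temp=True appends --notemp.
    """
    cmd = ["snakemake", "--cores", str(cores)]
    if keep_temp:
        # --notemp makes Snakemake ignore temp() declarations,
        # so intermediate files are kept after downstream jobs finish.
        cmd.append("--notemp")
    return cmd


if __name__ == "__main__":
    # Print the command; pass the list to subprocess.run() to execute it.
    print(shlex.join(snakemake_command(cores=4, keep_temp=True)))
```

Keeping all intermediates can use considerably more disk space, so this is best reserved for debugging runs.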
