Pipeline overview¶
Mapping reads and BAM file processing¶
When GPUs are available, Hastings can be configured to use NVIDIA's Parabricks for read mapping with the fq2bam tool. This tool performs read mapping with a GPU-accelerated version of BWA-MEM, followed by sorting and duplicate marking. See the alignment hydra-genetics module or parabricks hydra-genetics module documentation for more details on the software. Default hydra-genetics settings/resources are used if no configuration is specified.
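As a rough illustration of the GPU step, the sketch below assembles a basic fq2bam command line without running it. It is not how Hastings invokes the tool (that happens through the parabricks hydra-genetics module and its configuration); the helper name `fq2bam_command` and all file names are hypothetical.

```python
import shlex


def fq2bam_command(ref, fastq1, fastq2, out_bam):
    """Build a minimal Parabricks fq2bam command (built only, not executed)."""
    return [
        "pbrun", "fq2bam",
        "--ref", ref,               # reference FASTA (with BWA index alongside)
        "--in-fq", fastq1, fastq2,  # paired-end FASTQ files
        "--out-bam", out_bam,       # sorted, duplicate-marked BAM
    ]


# Hypothetical file names, for illustration only.
cmd = fq2bam_command(
    "reference.fasta", "sample_R1.fastq.gz", "sample_R2.fastq.gz", "sample.bam"
)
print(shlex.join(cmd))
```

A single fq2bam call covers mapping, sorting and duplicate marking, which is why the GPU route needs no separate sort or MarkDuplicates step.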
When only CPUs are available, Hastings can be configured to perform read mapping, sorting and duplicate marking on CPU using:
- read mapping with BWA-MEM
- read sorting with Samtools sort
- duplicate marking with Picard MarkDuplicates
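The three CPU steps above can be sketched as two shell commands: mapping piped directly into sorting, then duplicate marking. This is only an illustrative sketch, not the rules Hastings actually runs (those come from the alignment hydra-genetics module). The function name, output naming scheme and thread count are assumptions, and read-group options are left out for brevity.

```python
import shlex


def cpu_alignment_commands(ref, fastq1, fastq2, sample, threads=4):
    """Return shell command strings for map+sort and duplicate marking."""
    q = shlex.quote
    sorted_bam = f"{sample}.sorted.bam"    # hypothetical naming scheme
    dedup_bam = f"{sample}.dedup.bam"
    metrics = f"{sample}.dup_metrics.txt"

    # BWA-MEM writes SAM to stdout; samtools sort reads it from stdin ("-").
    map_and_sort = (
        f"bwa mem -t {threads} {q(ref)} {q(fastq1)} {q(fastq2)} "
        f"| samtools sort -@ {threads} -o {q(sorted_bam)} -"
    )
    # Picard marks duplicates in the coordinate-sorted BAM and writes a metrics file.
    mark_duplicates = (
        f"picard MarkDuplicates I={q(sorted_bam)} O={q(dedup_bam)} M={q(metrics)}"
    )
    return [map_and_sort, mark_duplicates]


# Hypothetical inputs; the commands are printed rather than executed.
commands = cpu_alignment_commands(
    "reference.fasta", "sample_R1.fastq.gz", "sample_R2.fastq.gz", "sample"
)
for command in commands:
    print(command)
```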
Variant Calling¶
See the snv_indels hydra-genetics module or parabricks hydra-genetics module documentation for more details on the variant calling software, the annotation hydra-genetics module for annotation, the filtering hydra-genetics module for filtering, and the cnv hydra-genetics module for CNV calling. Default hydra-genetics settings/resources are used if no configuration is specified.
SNV and INDELs¶
- Parabricks DeepVariant when run on GPU, or Google's DeepVariant when run on CPU
- GLnexus
  - Used to create a multisample VCF file that is analysed with Peddy.
  - Used to create the trio VCF files used for UPD analysis.
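To make the joint-calling step concrete, here is a minimal sketch of how per-sample gVCFs could be merged into one multisample VCF with GLnexus. It only builds the command string. The `DeepVariant` configuration preset, the conversion through bcftools and every file name are assumptions for illustration; the settings Hastings uses are defined in the snv_indels hydra-genetics module.

```python
import shlex


def glnexus_command(gvcfs, out_vcf, config="DeepVariant"):
    """Build a shell command that merges gVCFs into one compressed VCF."""
    q = shlex.quote
    inputs = " ".join(q(path) for path in gvcfs)
    # glnexus_cli writes BCF to stdout; bcftools converts it to bgzipped VCF.
    return (
        f"glnexus_cli --config {q(config)} {inputs} "
        f"| bcftools view -O z -o {q(out_vcf)} -"
    )


# Hypothetical trio: the same pattern applies to a whole-run multisample file.
trio = ["child.g.vcf.gz", "mother.g.vcf.gz", "father.g.vcf.gz"]
trio_cmd = glnexus_command(trio, "trio.vcf.gz")
print(trio_cmd)
```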
CNVs¶
- CNV callers
  - ExomeDepth (see also the hydra-genetics ExomeDepth documentation)
Regions Of Homozygosity¶
UniParental Disomy¶
QC¶
See the qc hydra-genetics module documentation for more details on the quality control software. Default hydra-genetics settings/resources are used if no configuration is specified.
Hastings produces a MultiQC report for the entire sequencing run to make QC tracking easier. The report starts with a general statistics table showing the most important QC values, followed by additional QC data and diagrams. The MultiQC HTML file is interactive: you can filter, highlight, hide or export data using the toolbox at the right edge of the report.
- The MultiQC-report contains QC data from the following programs:
Coverage for genes and gene panels¶
Results are written to an Excel spreadsheet with one tab for each gene panel.
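The one-tab-per-panel layout can be pictured as a mapping from sheet name to rows. The sketch below groups per-gene coverage values by panel and makes the panel names safe to use as worksheet names (Excel limits them to 31 characters and disallows `[ ] : * ? / \`). The panel definitions, gene values and function names are invented for illustration, and this is not the code Hastings uses to produce its spreadsheet.

```python
def sheet_name(panel):
    """Make a panel name usable as an Excel worksheet name."""
    cleaned = "".join("_" if ch in "[]:*?/\\" else ch for ch in panel)
    return cleaned[:31]  # Excel worksheet names are at most 31 characters


def group_by_panel(gene_coverage, panels):
    """Return {sheet name: [(gene, coverage), ...]} with one entry per panel."""
    sheets = {}
    for panel, genes in panels.items():
        rows = [(gene, gene_coverage[gene]) for gene in genes if gene in gene_coverage]
        sheets[sheet_name(panel)] = rows
    return sheets


# Invented example data: coverage values are placeholders, not real results.
coverage = {"KCNQ1": 98.5, "SCN5A": 87.2, "SCN1A": 99.1}
panels = {"Cardio/Arrhythmia": ["KCNQ1", "SCN5A"], "Epilepsy": ["SCN1A"]}
sheets = group_by_panel(coverage, panels)
for name, rows in sheets.items():
    print(name, rows)
```

Each key of `sheets` would then become one worksheet when written out with a spreadsheet library.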