A number of pipelines were used to generate the assemblies, their QC, and annotations available in the ATCC genome portal. Here, we provide a change log for those pipelines. If you are unsure which pipeline or version was used for your genome of interest, reach out to us through the message box at the bottom-right of your screen.
Assembly Pipelines
Bacteriology
Date: April 25, 2019
Initial bacterial hybrid assembly pipeline
Runs readsQC to quality trim both Illumina and Oxford Nanopore Technologies (ONT) reads
Runs Unicycler (v0.4.4) to assemble genome
Date: August 1, 2019
Changes:
Runs fastp to trim Illumina reads
Runs Filtlong to trim ONT reads
Downsamples Illumina reads to 150X genome depth and ONT reads to 60X
Updates Unicycler to v0.4.8
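As an illustration of the downsampling step above, the sketch below computes what fraction of reads to keep to reach a target depth. It is a minimal example, not the pipeline's actual code: the function name downsample_fraction, its parameters, and the example read totals are all hypothetical, and depth is assumed to be total sequenced bases divided by genome size.

```python
def downsample_fraction(total_bases: int, genome_size: int, target_depth: float) -> float:
    """Return the fraction of reads to keep so that depth is about target_depth.

    Assumes depth = total_bases / genome_size. Returns 1.0 when the read set
    is already at or below the target depth (nothing to downsample).
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    current_depth = total_bases / genome_size
    if current_depth <= target_depth:
        return 1.0
    return target_depth / current_depth


# Hypothetical 5 Mb genome with 1.5 Gb of Illumina data (300X) and
# 600 Mb of ONT data (120X); targets are the 150X and 60X from the entry above.
illumina_fraction = downsample_fraction(1_500_000_000, 5_000_000, 150)
ont_fraction = downsample_fraction(600_000_000, 5_000_000, 60)
print(illumina_fraction, ont_fraction)
```

The resulting fraction could then be passed to whichever subsampling tool is in use; the tool that ATCC uses for this step is not stated here.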
Date: December 11, 2019
Changes:
Downsamples ONT reads to 30X genome depth
Virology
Date: July 15, 2020
Runs fastp on Illumina reads
Uses SPAdes to assemble
Aligns trimmed reads to assembly using minimap2
Masks low depth (<10X) regions
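The masking step above can be sketched as follows. This is a minimal example under stated assumptions: per-base depths are assumed to be already computed from the read alignment, masked bases are assumed to be written as N, and the function name mask_low_depth is hypothetical.

```python
def mask_low_depth(sequence: str, depths: list[int], min_depth: int = 10,
                   mask_char: str = "N") -> str:
    """Replace every base whose depth is below min_depth with mask_char."""
    if len(sequence) != len(depths):
        raise ValueError("sequence and depths must have the same length")
    return "".join(
        base if depth >= min_depth else mask_char
        for base, depth in zip(sequence, depths)
    )


# Toy contig with a per-base depth value for each position.
masked = mask_low_depth("ACGTACGT", [3, 12, 15, 9, 10, 20, 5, 2])
print(masked)  # NCGNACNN
```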
Date: April 29, 2021
Changes:
Trims terminal masked regions from assemblies
Adds modification to Unicycler to raise exception if Racon runs out of memory
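The terminal-trimming change in this entry can be illustrated with a short sketch. It assumes masked positions are represented as N and that only leading and trailing masked runs are removed, leaving internal masked regions in place; the function name trim_terminal_mask is hypothetical.

```python
def trim_terminal_mask(sequence: str, mask_char: str = "N") -> str:
    """Remove masked characters from both ends of a sequence.

    Internal masked runs are left untouched.
    """
    return sequence.strip(mask_char)


trimmed = trim_terminal_mask("NNACGNNTNN")
print(trimmed)  # ACGNNT
```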
Mycology
Date: July 10, 2020
Initial assembly pipeline for hybrid fungal assemblies
Runs fastp on Illumina reads
Runs Filtlong on ONT reads
Runs MaSuRCA with Flye on filtered read sets
Date: March 23, 2021
Changes:
Estimates genome size from Illumina reads
Adds downsampling of filtered Illumina reads to 150X depth of estimated genome size
Adds downsampling of ONT reads to 30X depth of estimated genome size
Assembly QC Pipelines
Bacteriology
Date: October 30, 2018
Initial bacterial hybrid assembly QC pipeline
Uses CheckM to assess assembly quality, completeness, and contamination
Date: June 7, 2019
Changes:
Maps trimmed ONT reads to assembly using minimap2 to calculate ONT depth
Maps trimmed Illumina reads to assembly using BWA
Adds custom script to calculate other assembly statistics
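The statistics calculated by the custom script are not listed here. As an illustration only, the sketch below computes a few statistics commonly reported for assemblies (contig count, total length, N50, and GC fraction); the choice of statistics and the function name assembly_stats are assumptions, not a description of the actual script.

```python
def assembly_stats(contigs: list[str]) -> dict:
    """Compute contig count, total length, N50, and GC fraction.

    N50 is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least half the assembly.
    """
    lengths = sorted((len(c) for c in contigs), reverse=True)
    total = sum(lengths)

    n50 = 0
    running = 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break

    gc_count = sum(c.upper().count("G") + c.upper().count("C") for c in contigs)
    return {
        "num_contigs": len(lengths),
        "total_length": total,
        "n50": n50,
        "gc_fraction": gc_count / total if total else 0.0,
    }


# Toy assembly with four contigs of 100, 50, 30, and 20 bases.
stats = assembly_stats(["A" * 100, "G" * 50, "C" * 30, "T" * 20])
print(stats)
```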
Virology
Date: July 13, 2020
Initial virology assembly QC pipeline
Aligns contigs to reference database to identify the best reference species
Checks if all segments in each reference species are present in assembly, using GenBank segment information
Reports alignment quality
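The segment check above amounts to a set comparison between the segments expected for the reference species and those found in the assembly. The sketch below is a minimal illustration; the function name missing_segments and the segment labels are hypothetical, and how segments are matched to contigs is not shown.

```python
def missing_segments(reference_segments: set[str], assembly_segments: set[str]) -> set[str]:
    """Return reference segments that have no counterpart in the assembly."""
    return reference_segments - assembly_segments


# Hypothetical example: an eight-segment reference with two segments absent.
expected = {"1", "2", "3", "4", "5", "6", "7", "8"}
found = {"1", "2", "3", "4", "5", "7"}
absent = missing_segments(expected, found)
print(sorted(absent))  # ['6', '8']
```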
Date: August 5, 2020
Changes:
Includes sub-species sequences in reference database
Date: December 10, 2020
Changes:
Uses alignment results to identify segments, in place of GenBank segment information
Date: January 21, 2021
Changes:
Calculates assembly completeness score (assembly length / reference length)
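The completeness score is the ratio given in the entry above. A minimal sketch follows; the function name completeness_score and the example lengths are hypothetical.

```python
def completeness_score(assembly_length: int, reference_length: int) -> float:
    """Return assembly length divided by reference length."""
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    return assembly_length / reference_length


# Hypothetical 27,000 bp assembly compared with a 30,000 bp reference.
score = completeness_score(27_000, 30_000)
print(round(score, 3))  # 0.9
```

A score above 1.0 is possible under this definition when the assembly is longer than the reference.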
Mycology
Date: September 24, 2020
Initial mycology assembly QC pipeline
Maps raw reads to assembly to calculate depth
Runs BUSCO 4.1.2 with BUSCO database Fungi_ODB10 to calculate assembly completeness score
Calculates additional assembly statistics
Annotation Pipelines
Bacteriology
Date: September 7, 2018
Initial bacterial annotation pipeline
Uses Prokka with genus-specific BLAST database
Date: September 18, 2019
Changes:
No longer uses genus-specific BLAST database
Virology (Variant calling)
Date: October 18, 2020
Aligns reference genome to assembly using blastn to identify matching segments
Uses MAFFT to align matching reference and assembly segments
Custom script examines MAFFT alignment results for variant types
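To illustrate how variant types can be read from a pairwise alignment, the sketch below walks two gapped, equal-length aligned sequences column by column and labels each difference. It is not the custom script itself: the function name classify_variants, the three variant labels, and the use of "-" as the gap character are assumptions made for this example.

```python
def classify_variants(ref_aln: str, asm_aln: str) -> list[tuple[int, str, str, str]]:
    """Report differences between two aligned (gapped) sequences.

    Returns tuples of (reference_position, variant_type, ref_base, asm_base).
    Positions are 1-based on the ungapped reference; an insertion is reported
    at the position of the last reference base before it. Each alignment
    column is reported separately, so adjacent gap columns are not merged.
    """
    if len(ref_aln) != len(asm_aln):
        raise ValueError("aligned sequences must have the same length")

    variants = []
    ref_pos = 0
    for ref_base, asm_base in zip(ref_aln.upper(), asm_aln.upper()):
        if ref_base != "-":
            ref_pos += 1
        if ref_base == asm_base:
            continue  # match, or a column that is a gap in both sequences
        if ref_base == "-":
            variants.append((ref_pos, "insertion", "-", asm_base))
        elif asm_base == "-":
            variants.append((ref_pos, "deletion", ref_base, "-"))
        else:
            variants.append((ref_pos, "substitution", ref_base, asm_base))
    return variants


# Toy alignment: one substitution, one insertion, one deletion.
for variant in classify_variants("ACG-TAC", "ATGCT-C"):
    print(variant)
```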
Date: November 24, 2020
Changes:
Improves method for identifying variant types
Mycology
Date: October 9, 2020
Initial annotation pipeline for fungal assemblies
Runs BUSCO 4.1.2 with BUSCO database Fungi_ODB10 for annotations of Universal Single-Copy Orthologs