As life science research progresses, data quality becomes increasingly important. With ATCC's Enhanced Authentication Initiative, we aim to enrich the characterization of our biological collections and provide you with the whole-genome sequences of the specific, authenticated materials you need to generate credible data.
The purpose of this technical documentation is to outline the features of the ATCC Genome Portal and to describe in detail the DNA extraction, sequencing, and bioinformatic methods we use to produce high-quality, reference-grade genomes.
ATCC Genome Portal
The ATCC Genome Portal offers more than a collection of reference-grade bacterial genomes originating from authenticated ATCC materials. It is a platform where users can interactively browse genomic data and metadata that are both searchable and indexed. With the portal, users can:
Browse and download whole-genome sequences and annotations of ATCC microbial products
Search for nucleotide sequences or genes within published genomes
Search for genomes by taxonomic name, taxonomic level, isolation source, ATCC catalog number, type strain status, or biosafety level
View genome assembly statistics and quality metrics
Identify the relatedness of published genomes by total genome alignment
Purchase the corresponding authenticated ATCC source material
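As a rough illustration of the metadata search listed above, the following Python sketch filters a small in-memory list of records by some of the same fields (taxonomic name, isolation source, catalog number, type strain status, and biosafety level). It does not call the ATCC Genome Portal. The `GenomeRecord` class, its field names, the `search` function, and the placeholder records are all assumptions made for this example.

```python
from dataclasses import dataclass, fields
from typing import Iterable, List


@dataclass(frozen=True)
class GenomeRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    catalog_number: str
    taxonomic_name: str
    isolation_source: str
    type_strain: bool
    biosafety_level: int


def search(records: Iterable[GenomeRecord], **criteria) -> List[GenomeRecord]:
    """Return the records whose fields equal every supplied criterion."""
    known = {f.name for f in fields(GenomeRecord)}
    unknown = set(criteria) - known
    if unknown:
        raise ValueError(f"Unknown search field(s): {sorted(unknown)}")
    return [
        record
        for record in records
        if all(getattr(record, name) == value for name, value in criteria.items())
    ]


# Placeholder records, not real ATCC catalog entries.
EXAMPLE_RECORDS = [
    GenomeRecord("EXAMPLE-0001", "Genus alpha", "soil", True, 1),
    GenomeRecord("EXAMPLE-0002", "Genus beta", "clinical isolate", False, 2),
    GenomeRecord("EXAMPLE-0003", "Genus gamma", "soil", False, 1),
]

if __name__ == "__main__":
    # Combine two criteria: both must match for a record to be returned.
    for hit in search(EXAMPLE_RECORDS, isolation_source="soil", biosafety_level=1):
        print(hit.catalog_number, hit.taxonomic_name)
```

Criteria are combined with a logical AND, and an unrecognized field name raises an error instead of silently matching nothing.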
The ATCC Approach to Bacterial Genome Sequencing
After multiple decades of bacterial DNA sequencing, a plethora of techniques exist to sequence and assemble bacterial genomes [1, 2]. At ATCC, we are setting the scientific standard in best practices for bacterial whole-genome sequencing as part of our Enhanced Authentication Initiative.
Recent innovations in third-generation sequencing [3, 4] have made it possible to produce complete, reference-grade bacterial genomes through hybrid assembly techniques [5, 6], which combine highly accurate Illumina® short reads with the scaffolding ability of Oxford Nanopore Technologies® (ONT) ultra-long reads (for additional details, see our article on Genome Assembly).
The ATCC bacterial whole-genome sequencing workflow is an optimized methodology designed to achieve complete, circularized (when biologically appropriate) bacterial genomic elements by using the hybrid assembly technique. This methodology comprises five primary steps:
Extraction of DNA from authenticated ATCC strains
Sequencing of this DNA
Assembly of sequencing data into a genome
Annotation of the resultant genome
Estimation of relatedness between a genome and all other genomes in our collection
Each step is accompanied by rigorous quality control methods and criteria to ensure that the data proceeding to the next step are of the highest possible quality. Only data that pass all quality control criteria are published to the ATCC Genome Portal. ATCC materials also undergo extensive quality control during growth, but a description of those processes is outside the scope of this document. For more information, see our whitepaper on ATCC prokaryotic authentication.
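The gating described above, in which data advance to the next step only after passing quality control and are published only if every step passes, can be sketched in Python as follows. This is a minimal sketch and not ATCC's pipeline: the five step names come from the list above, while `run_workflow`, `StepResult`, and the boolean check functions are hypothetical stand-ins for the actual methods and criteria described in the other articles.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Step names taken from the five-step workflow in the text.
STEPS = ["extraction", "sequencing", "assembly", "annotation", "relatedness"]


@dataclass
class StepResult:
    step: str
    passed: bool


def run_workflow(
    qc_checks: Dict[str, Callable[[], bool]],
) -> Tuple[bool, List[StepResult]]:
    """Run the steps in order and stop at the first failed QC check.

    `qc_checks` maps each step name to a hypothetical function that
    returns True when that step's output meets its QC criteria.
    Returns (publishable, results for the steps that were run).
    """
    results: List[StepResult] = []
    for step in STEPS:
        passed = qc_checks[step]()
        results.append(StepResult(step, passed))
        if not passed:
            # Data that fail QC do not proceed to the next step.
            return False, results
    # Every step passed, so the genome is eligible for publication.
    return True, results


if __name__ == "__main__":
    # Placeholder checks: everything passes except assembly.
    checks = {step: (lambda: True) for step in STEPS}
    checks["assembly"] = lambda: False
    publishable, results = run_workflow(checks)
    print(publishable, [(r.step, r.passed) for r in results])
```

Stopping at the first failure means later steps never run on data that have already failed an earlier check.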
In other articles, listed below, we describe the methods and bioinformatic tools used to accomplish each step, including quality control criteria, alongside relevant scientific citations supporting our approach.
References
[1] Niedringhaus TP, et al. Landscape of next-generation sequencing technologies. Analytical Chemistry, 83(12): 4327–4341, 2011. PubMed: 21612267
[2] Loman NJ, Pallen MJ. Twenty years of bacterial genome sequencing. Nature Reviews Microbiology, 13(12): 787–794, 2015. PubMed: 26548914
[3] Deamer D, Akeson M, Branton D. Three decades of nanopore sequencing. Nature Biotechnology, 34(5): 518–524, 2016. PubMed: 27153285
[4] Jain M, et al. The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community. Genome Biology, 17(1): 239, 2016. PubMed: 27887629
[5] De Maio N, et al.; The REHAB Consortium. Comparison of long-read sequencing technologies in the hybrid assembly of complex bacterial genomes. bioRxiv, 530824, 2019.
[6] Wick RR, et al. Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads. PLOS Computational Biology, 13(6): e1005595, 2017. PubMed: 28594827