The ATCC Discrepancy Report

Learn what the ATCC Discrepancy Report can do for you! Available to Supporting Members.

Written by Denise Lynch
Updated over a week ago

The Discrepancy Report is a new analysis available to ATCC Supporting Members. In this article, we'll describe the common use cases for the Discrepancy Report, how it works, and how to run it on your own sequencing data for the ATCC bacterial products you work with.

Background

Repeated strain passage in a laboratory setting can lead to a myriad of issues due to the potential accumulation of mutations and genetic drift within the population being cultured. With each passage, there is a selection pressure that can favor certain variants over others, altering the genetic composition of the population. Additionally, the process of passage itself can introduce errors or contaminants, further skewing results and undermining the reliability of experimental data. Having the ability to monitor and assess for drift from the source organism is particularly valuable in understanding potentially unexpected sources of phenotypes in the strains you are working with.

Over the last few years, ATCC has been sequencing and assembling thousands of its microbial products to provide the highest quality assemblies of its source organisms. ATCC Supporting Members can now compare their own passaged and sequenced products back to the source genome, to identify any drift or differences their organisms may have from the source.

Discrepancy Report Overview

The ATCC Discrepancy Report is designed to assess whether your raw Illumina sequence data (fastq, optionally gzipped) for the bacterial strain you've been working with still matches the ATCC genome that you purchased. The report performs the following steps:

  • Quality trimming of your fastq files

  • Identification of SNPs in your sequence data relative to the specified ATCC assembly

  • Calculation of sequence and SNP depth across the genome

  • Estimation of ANI (Average Nucleotide Identity) between your sequenced bacterial isolate and the specified ATCC assembly

This analysis generates a PDF report that gives you a convenient view of the details you need. The report contains a summary of the variants and ANI, a coverage chart across the genome, and a table of the highest-frequency variants that were detected. You will also be able to download the complete table of variants identified.
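To make the depth and SNP steps listed above more concrete, here is a minimal Python sketch of the general idea. It is not the pipeline used by the Discrepancy Report: the function names (`pileup`, `summarize`), the thresholds (`min_depth`, `min_freq`), and the simplified input (reads already placed on the reference, with no gaps) are all assumptions made for illustration. It does not perform quality trimming or ANI estimation.

```python
from collections import Counter


def pileup(reference, aligned_reads):
    """Count the bases observed at each reference position.

    aligned_reads is an iterable of (start, sequence) pairs, where start is
    the 0-based reference position of the first base of an ungapped read.
    This simplified input format is an assumption for the sketch.
    """
    counts = [Counter() for _ in reference]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                counts[pos][base] += 1
    return counts


def summarize(reference, aligned_reads, min_depth=2, min_freq=0.8):
    """Return mean depth, covered fraction, and candidate SNPs.

    min_depth and min_freq are illustrative thresholds, not values taken
    from the Discrepancy Report.
    """
    counts = pileup(reference, aligned_reads)
    depths = [sum(c.values()) for c in counts]
    snps = []
    for pos, (ref_base, c) in enumerate(zip(reference, counts)):
        depth = depths[pos]
        if depth < min_depth:
            continue
        # Most frequent non-reference base at this position, if any.
        alt, alt_count = max(
            ((b, n) for b, n in c.items() if b != ref_base),
            key=lambda item: item[1],
            default=(None, 0),
        )
        if alt is not None and alt_count / depth >= min_freq:
            snps.append((pos, ref_base, alt, depth, alt_count / depth))
    covered = sum(1 for d in depths if d > 0)
    return {
        "mean_depth": sum(depths) / len(depths),
        "covered_fraction": covered / len(depths),
        "snps": snps,
    }


if __name__ == "__main__":
    reference = "ACGTACGT"
    reads = [(0, "ACGA"), (1, "CGAA"), (2, "GAAC"), (4, "ACGT")]
    print(summarize(reference, reads))
```

In this toy input, every read that covers position 3 carries an A where the reference has a T, so that position is reported as a candidate SNP together with its depth and allele frequency. The variant table in the report summarizes the same kind of information.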

Example Reports

Some ATCC organisms have previously been sequenced, assembled, and made publicly accessible via GenBank. GenBank assemblies of ATCC products are not official ATCC-assembled genomes, are not from verified sources, and have likely been sequenced after several passages. With this in mind, it is possible that these assemblies contain variants relative to the ATCC source genome, in which case the public assembly no longer represents the same genome that is available from ATCC.

We took a couple of public assemblies of ATCC strains from GenBank and simulated sequence reads from these assemblies. We then compared the simulated reads back to the official ATCC strain, to identify which SNPs are present in the public assembly relative to the ATCC source. You can view these example reports at the links below:

Launching Your ATCC Discrepancy Report

ATCC has partnered with One Codex to bring the Discrepancy Report to ATCC Supporting Members. You can find a button to run the Discrepancy Report analysis on the Overview page for your bacterial genome of interest.

Clicking the "Run Discrepancy Report" button will take you to One Codex, ATCC's partner for genome portal management. You will be warned that you are about to be redirected and given the opportunity to cancel.

Alternatively, you can go directly to One Codex. You will find the ATCC Discrepancy Report launch page from the left-hand navigation bar, under the "ATCC" section.

On One Codex, you will see a description of the Discrepancy Report, along with the accepted file types (Illumina sequence data, in fastq format, optionally gzipped).
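Before uploading, it can be useful to confirm locally that a file really is well-formed fastq, plain or gzipped. The Python sketch below is one way to do that. It is only an illustrative local check and is not the validation that One Codex performs; the function names (`open_fastq`, `count_fastq_records`) are hypothetical, and the sketch assumes the common four-line-per-record fastq layout.

```python
import gzip


def open_fastq(path):
    """Open a fastq file as text, detecting gzip by its two magic bytes."""
    with open(path, "rb") as handle:
        magic = handle.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_fastq_records(path):
    """Count records in a four-line-per-record fastq file.

    Raises ValueError on the first malformed record.
    """
    n_records = 0
    with open_fastq(path) as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # tolerate blank lines between or after records
            seq = handle.readline().rstrip("\r\n")
            plus = handle.readline()
            qual = handle.readline().rstrip("\r\n")
            if not header.startswith("@"):
                raise ValueError(f"record {n_records + 1}: header must start with '@'")
            if not plus.startswith("+"):
                raise ValueError(f"record {n_records + 1}: separator must start with '+'")
            if not seq or len(seq) != len(qual):
                raise ValueError(
                    f"record {n_records + 1}: sequence and quality lengths differ"
                )
            n_records += 1
    return n_records
```

Calling `count_fastq_records` with the path to your own file returns the number of reads if every record parses, or raises a `ValueError` that names the first problem record.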

  1. If you have already uploaded your sample to One Codex, you can search for your sample. Alternatively you can upload your sample directly on this page. Uploading can take a few moments, depending on your sample's file size. Once your sample is validated, you will be able to launch the report.

  2. If you have come from a genome on the ATCC Genome Portal, that genome will be pre-selected for you in Step 2. You can choose a different genome instead if needed.

  3. Click to Run the Discrepancy Report.

Once the Discrepancy Report launches, you will be shown a notification at the top of the page informing you that it is running. The Discrepancy Report can take ~20 minutes to run, depending on your file size. The notification will provide you with a link to your samples page, where your results will become available once the analysis has completed. Until then, your samples page will show a "Pending" notification per sample.

Samples on One Codex are represented as cards, such as the below image.

You can also switch to a table view option.

The "Pending" button will change to "View Results" once your analyses are ready.

If any analyses complete successfully for your sample, you will see the "View Results" button. This includes the One Codex Classification analysis that is run by default on all samples uploaded to One Codex. If you click to "View Results", your Discrepancy Report should be the primary result displayed. If it is not displayed immediately, the analysis may still be running.

To switch between analyses that have been performed for a sample, you can click the "View Results" button for the sample, and then choose from the analyses that have been run from the drop-down menu at the top-right of the results page.

Become an ATCC Supporting Member

ATCC Discrepancy Reports are available to Supporting Members. To become an ATCC Supporting Member, visit the ATCC Genome Portal. You'll find details on the various membership plans here.
