UPDATE: 25th March, 2021
We have just launched an update to this analysis!
Samples are now compared against the genomic sequences of the isolates in each Microbiome standard, as sequenced and assembled by ATCC in conjunction with One Codex. Previous versions of the pipeline used public assemblies for these species.
We've updated how we calculate the Relative Abundance score for each control. We now use a scoring system based on a scaled Aitchison Distance between your sample and the expected abundances of the microbes in this control. Previous versions of the pipeline used Pearson Correlation to calculate the Relative Abundance score.
If you need to run the previous analysis job on any of your new samples, or if you would like to run the new job on your old samples, reach out to us at support@onecodex.com, and we will be more than happy to help.
Getting Started
Welcome! You may have recently purchased one of the ATCC Microbiome Standards, powered by One Codex. We're going to walk you through using One Codex to analyze next-generation sequencing (NGS) data generated from those standards, whether it's whole genome shotgun (WGS) or 16S amplicon sequencing data. Together, the Microbiome Standards and One Codex analysis provide you with straightforward measures of the performance and accuracy of your microbiome sequencing.
First step: Create an account!
Register for an Account (if needed)
One Codex is a platform that lets you analyze genomic data from microbial samples. If you don't have an account, you can sign up for free now.
Just enter some basic contact info, pick a password, and start analyzing data!
Uploading Your Data
You can add sequencing data from your ATCC Microbiome Standards to your One Codex account by navigating to the ATCC Standards page and dragging and dropping your FASTQ file(s) directly into the One Codex website. Analysis of your ATCC Microbiome Standards is free, and any samples you analyze will not be billed.
Don't have a file handy? Feel free to use this FASTQ dataset for the rest of the tutorial: MSA-1000 example (16S).
Already uploaded your FASTQ data? If you want to analyze a file that you've already uploaded, just select that file from the menu at the top right of the page labeled "Select an existing sample."
Product Selection
Using the buttons on the left side of the ATCC Standards page, select your product type, sequencing, and specific ATCC Microbiome Standard in order to ensure that your data is analyzed appropriately. Make sure you've selected an option for each of the following:
Product Type: Whole Cell or Genomic DNA
Sequencing: Whole-Genome Shotgun (WGS) or 16S Ribosomal DNA
Microbial Mixture: MSA-1000 (10 organisms, even amounts), MSA-1001 (10 organisms, staggered amounts), etc.
Once you've uploaded a dataset (or selected an existing one) and chosen the product information above, you'll see the Continue & Add Metadata button turn blue. Click the button at the bottom of the page to start entering metadata describing your sample. If you want to record a higher level of detail, click the smaller link to enter detailed metadata.
Entering Metadata
After you select your data and product type, you will be able to record some basic information about your sample, including the type of sequencing, library prep kit, etc. The exact set of questions is customized to account for whether you are analyzing a whole cell product or genomic DNA, and whether you are performing shotgun or amplicon sequencing.
All of this metadata will be stored alongside your data and results, allowing you to go back and see how different kits and protocols for processing your samples may impact the quality or accuracy of your microbiome sequencing.
Need another option? If we're missing an answer for one of the questions and there is no "other" field, please send us a note.
Analyzing your ATCC Microbiome Standard
After you've entered all of your metadata, you should be redirected to your analysis results. The analysis results (below) show you which organisms were detected in your sample and compare those hits against the set of organisms known to be in the input mixture of microbes from your ATCC Microbiome Standard. Note: Depending on the size of your FASTQ file, you may need to wait a few minutes for these to finish processing.
We summarize these results in three ways:
True Positives: The detection of organisms in the input mixture
Relative Abundance: The quantification of organisms in the input mixture
False Positives: The detection of organisms not in the input mixture
Each of these individual scores is summarized using a scale from 0-100%. The scores are calculated as follows:
True Positives: The number of detected organisms divided by the number of input organisms
Relative Abundance: A score between 0 and 100%, based on a scaled Aitchison distance between the detected organism abundances and the known input abundances
False Positives: A perfect score of 100%, with penalties for each false positive (FP) organism - 10% for each "high abundance" FP, 5% for each "moderate abundance" FP, and 1% for each "low abundance" FP
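To make the three calculations above concrete, here is a minimal Python sketch. It is not the One Codex implementation: the function names, the tier labels ("high", "moderate", "low"), the pseudocount used to avoid taking the log of zero, and the floor of 0% on the False Positives score are all assumptions made for illustration. The Aitchison distance is computed in its standard form (Euclidean distance between centred log-ratio transforms); how that distance is scaled into a 0-100% Relative Abundance score is not described here, so the sketch stops at the raw distance.

```python
import math


def true_positive_score(detected, expected):
    """Percentage of the input organisms that were detected."""
    expected = set(expected)
    found = expected & set(detected)
    return 100.0 * len(found) / len(expected)


def false_positive_score(fp_counts):
    """Start at 100% and subtract a penalty per false positive organism.

    fp_counts maps a tier name to the number of false positives in that tier.
    The tier names are hypothetical; the penalty sizes come from the text.
    """
    penalties = {"high": 10, "moderate": 5, "low": 1}
    total = sum(penalties[tier] * count for tier, count in fp_counts.items())
    return max(0.0, 100.0 - total)  # flooring at 0% is an assumption


def aitchison_distance(observed, expected, pseudocount=1e-6):
    """Euclidean distance between centred log-ratio (CLR) transforms.

    Both inputs are abundances for the same organisms in the same order.
    The pseudocount is an assumption to handle undetected (zero) organisms.
    """
    def clr(values):
        logs = [math.log(v + pseudocount) for v in values]
        mean = sum(logs) / len(logs)
        return [x - mean for x in logs]

    a, b = clr(observed), clr(expected)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

For example, detecting 2 of 4 input organisms gives a True Positives score of 50%, and one high, two moderate, and three low abundance false positives give a False Positives score of 100 - (10 + 10 + 3) = 77%. A sample whose abundances match the expected abundances exactly has an Aitchison distance of 0.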
The Overall Score at the top of the page is simply an average of these 3 sub-scores.
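As a quick illustration of that average, the sketch below combines three sub-scores into an Overall Score. The function name and the example values are made up for this walkthrough and are not taken from a real sample.

```python
def overall_score(true_positives, relative_abundance, false_positives):
    """Plain average of the three sub-scores, each given as a percentage."""
    return (true_positives + relative_abundance + false_positives) / 3


# Hypothetical sub-scores: all organisms detected, good abundance agreement,
# and a few low-level false positives.
example = overall_score(100.0, 85.0, 94.0)  # (100 + 85 + 94) / 3 = 93.0
```

Because each sub-score carries equal weight, a poor result in any one category lowers the Overall Score by a third of the shortfall.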
Additional detail is provided for each of the individual scores, and these panels can be expanded to provide more information on which organisms were detected and their contribution to each sub-score.
Finally, the metadata you entered is available at the bottom of the page, and serves as a record of your protocol and sequencing workflow alongside this scorecard analysis of your ATCC Microbiome Standard.
Want to dive into more details? The following sections include additional technical details on the physical standards from ATCC as well as the bioinformatics hosted on One Codex.