Welcome! You may have recently purchased one of the ATCC Microbiome Standards, powered by One Codex. We're going to walk you through using One Codex to analyze next-generation sequencing (NGS) data generated from those standards, whether it's whole-genome shotgun (WGS) or 16S amplicon sequencing data. Together, the Microbiome Standards and One Codex analysis give you straightforward measures of the performance and accuracy of your microbiome sequencing.
First step: Create an account!
Register for an Account (if needed)
One Codex is a platform that lets you analyze genomic data from microbial samples. If you don't have an account, you can sign up for free now.
Just enter some basic contact info, pick a password, and start analyzing data!
Uploading Your Data
You can add sequencing data from your ATCC Microbiome Standards to your One Codex account by navigating to the ATCC Standards page and dragging and dropping your FASTQ file(s) directly into the One Codex website. Analysis of your ATCC Microbiome Standards is free, and you will not be billed for these samples.
Don't have a file handy? Feel free to use this FASTQ dataset for the rest of the tutorial: MSA-1000 example (16S).
Already uploaded your FASTQ data? If you want to analyze a file that you've already uploaded, just select that file from the menu at the top right of the page labeled "Select an existing sample."
Using the buttons on the left side of the ATCC Standards page, select your product type, sequencing, and specific ATCC Microbiome Standard in order to ensure that your data is analyzed appropriately. Make sure you've selected an option for each of the following:
- Product Type: Whole Cell or Genomic DNA
- Sequencing: Whole-Genome Shotgun (WGS) or 16S Ribosomal DNA
- Microbial Mixture: MSA-1000 (10 organisms, even amounts), MSA-1001 (10 organisms, staggered amounts), etc.
Once you've uploaded or selected a dataset and chosen the product information above, you'll see the Continue & Add Metadata button turn blue. Click the button at the bottom of the page to start entering metadata describing your sample. If you want to record a higher level of detail, click the smaller link to enter detailed metadata.
After you select your data and product type, you will be able to record some basic information about your sample, including the type of sequencing, library prep kit, etc. The exact set of questions is customized to account for whether you are analyzing a whole cell product or genomic DNA, and whether you are performing shotgun or amplicon sequencing.
All of this metadata will be stored alongside your data and results, allowing you to go back and see how different kits and protocols for processing your samples may impact the quality or accuracy of your microbiome sequencing.
Need another option? If an answer you need is missing from one of the questions and there is no "other" field, please send us a note.
Analyzing your ATCC Microbiome Standard
After you've entered all of your metadata, you should be redirected to your analysis results. The analysis results show you which organisms were detected in your sample and compare those hits against the set of organisms known to be in the input mixture of microbes from your ATCC Microbiome Standard. Note: Depending on the size of your FASTQ file, you may need to wait a few minutes for the results to finish processing.
We summarize these results in three ways:
- True Positives: The detection of organisms in the input mixture
- Relative Abundance: The quantification of organisms in the input mixture
- False Positives: The detection of organisms not in the input mixture
Each of these individual scores is summarized using a scale from 0-100%. The scores are calculated as follows:
- True Positives: The number of detected organisms divided by the number of input organisms
- Relative Abundance: The Pearson correlation coefficient of the detected organism abundances compared to the known input abundances
- False Positives: Starts at a perfect score of 100% and is reduced by a penalty for each false positive (FP) organism: 10% for each "high abundance" FP, 5% for each "moderate abundance" FP, and 1% for each "low abundance" FP
The Overall Score at the top of the page is simply the average of these three sub-scores.
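To make the arithmetic above concrete, here is a minimal Python sketch of the three sub-scores and their average. It is an illustration only, not the One Codex implementation: the function names, the abundance cutoffs used to label a false positive as "high" or "moderate", the clamping of negative correlations and over-penalized scores to 0%, and the handling of a constant abundance vector are all assumptions made for this example.

```python
from math import sqrt

# Hypothetical cutoffs (fractions of the sample); the tutorial does not
# define what counts as "high", "moderate", or "low" abundance.
HIGH_ABUNDANCE = 0.01
MODERATE_ABUNDANCE = 0.001


def true_positive_score(detected, expected):
    """Percentage of input organisms that were detected at all."""
    found = sum(1 for name in expected if detected.get(name, 0.0) > 0.0)
    return 100.0 * found / len(expected)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # The coefficient is undefined when either side is constant
        # (for example, an even mixture). Returning 0.0 is a placeholder
        # for this sketch, not the platform's behavior.
        return 0.0
    return cov / sqrt(var_x * var_y)


def relative_abundance_score(detected, expected):
    """Correlation of detected vs. known abundances, as a percentage."""
    names = sorted(expected)
    r = pearson([detected.get(n, 0.0) for n in names],
                [expected[n] for n in names])
    # Assumption: negative correlations are reported as 0%.
    return 100.0 * max(0.0, min(1.0, r))


def false_positive_score(detected, expected):
    """Start at 100% and subtract a penalty per unexpected organism."""
    penalty = 0.0
    for name, abundance in detected.items():
        if name in expected or abundance <= 0.0:
            continue
        if abundance >= HIGH_ABUNDANCE:
            penalty += 10.0
        elif abundance >= MODERATE_ABUNDANCE:
            penalty += 5.0
        else:
            penalty += 1.0
    # Assumption: the score cannot drop below 0%.
    return max(0.0, 100.0 - penalty)


def overall_score(tp, ra, fp):
    """Plain average of the three sub-scores."""
    return (tp + ra + fp) / 3.0


if __name__ == "__main__":
    # Made-up abundances for a three-organism staggered mixture.
    expected = {"A": 0.6, "B": 0.3, "C": 0.1}
    detected = {"A": 0.55, "B": 0.33, "C": 0.1, "X": 0.02}
    tp = true_positive_score(detected, expected)
    ra = relative_abundance_score(detected, expected)
    fp = false_positive_score(detected, expected)
    print(tp, ra, fp, overall_score(tp, ra, fp))
```

Both arguments are dictionaries mapping organism names to fractional abundances. Note that a correlation-based score only carries information when the known abundances differ from one another, which is why the example uses a staggered mixture.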
Additional detail is provided for each of the individual scores, and these panels can be expanded to provide more information on which organisms were detected and their contribution to each sub-score.
Finally, the metadata you entered is available at the bottom of the page, and serves as a record of your protocol and sequencing workflow alongside this scorecard analysis of your ATCC Microbiome Standard.
Want to dive into more details? The following sections include additional technical details on the physical standards from ATCC as well as the bioinformatics hosted on One Codex.