Importing Data

Import data from BaseSpace or the SRA, or upload from the command line

Written by Denise Lynch
Updated over a week ago

Importing data from BaseSpace

You can import data directly from your BaseSpace account from the Upload page by selecting the option "Import from: BaseSpace". Once you log in with your BaseSpace credentials, you can select a group of files or projects and import them as a batch.

Importing data from the SRA

Some users may wish to compare their own data with samples that are publicly available on the Sequence Read Archive (SRA), or perhaps re-analyze SRA samples using the One Codex pipeline. We've made it easier to import samples from the SRA into your account directly. You'll find this option on the Upload page.

Once you select "Import from: SRA", you can enter one or more accessions (comma-separated) to import at a time. The accession types that are currently accepted are shown below the text box.
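If you keep accessions in a spreadsheet or script, it can help to tidy the list before pasting it into the text box. The sketch below splits a comma-separated string, removes duplicates, and labels each entry. The prefix patterns (SRR/ERR/DRR for runs, PRJNA for projects) and the function name `parse_accessions` are assumptions made for illustration; the Upload page remains the authority on which accessions are accepted.

```python
import re

# Assumed accession shapes; the Upload page lists what is actually accepted.
ACCESSION_PATTERNS = {
    "run": re.compile(r"^[SED]RR\d+$"),
    "project": re.compile(r"^PRJNA\d+$"),
}


def parse_accessions(text):
    """Split a comma-separated string into unique (accession, kind) pairs."""
    seen = set()
    parsed = []
    for raw in text.split(","):
        acc = raw.strip().upper()
        if not acc or acc in seen:
            continue  # skip blanks and duplicates
        seen.add(acc)
        kind = next(
            (name for name, pattern in ACCESSION_PATTERNS.items() if pattern.match(acc)),
            "unknown",
        )
        parsed.append((acc, kind))
    return parsed


if __name__ == "__main__":
    for acc, kind in parse_accessions("SRR1234567, prjna123456, SRR1234567"):
        print(acc, kind)
```

Entries labeled "unknown" are worth checking by hand before you submit the import.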

You are not limited to individual samples or SRA runs: you can also import an entire SRA Project. Once you enter a PRJNA accession, we determine the number of SRA "Runs" that will be imported as samples in the platform, so that you can confirm the project before incurring charges.

We use the metadata attached to these projects and samples both to create a corresponding project in your account and to annotate each sample, so that you can more easily compare your samples.

Importing a project from which some samples were already imported will not duplicate those samples. It will only add the missing samples to the project.
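The effect of re-importing a partially imported project can be pictured as a simple difference between two lists. This is only an illustration of the behavior described above, not the platform's implementation; the function name `runs_to_import` and the run accessions are hypothetical.

```python
def runs_to_import(project_runs, already_imported):
    """Return the project's runs that are not yet in the account, in order."""
    existing = set(already_imported)
    return [run for run in project_runs if run not in existing]


if __name__ == "__main__":
    project = ["SRR0000001", "SRR0000002", "SRR0000003"]
    imported = ["SRR0000002"]
    print(runs_to_import(project, imported))
```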

Programmatic Uploads

To complement the web interface, we also provide a command line interface (CLI) and Python client library for uploading files to One Codex. These tools enable programmatic, including fully automated, uploads to One Codex.

The CLI also supports several additional features:

  • Uploading files larger than 5GB

  • Automatically combining and interleaving paired-end files into a single sample

  • Saving and reloading API keys / login credentials

Installing the command-line tool

The command line interface is written in Python and accompanies our client library. It should be easily installable on most machines with the following command:

pip install onecodex   # Note, Windows users may need to do `py -m pip install onecodex`

Uploading files

If you haven't previously logged in, the following command will prompt you for your username and password and then save a ~/.onecodex file with your API key.

onecodex login
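In an automated script, it can be useful to check whether credentials have already been saved before attempting an upload. The sketch below only tests for the ~/.onecodex file mentioned above; it does not read or validate its contents, and the helper name `has_saved_credentials` is hypothetical.

```python
from pathlib import Path


def has_saved_credentials(home=None):
    """Return True if a .onecodex file exists in the given home directory."""
    home_dir = Path(home) if home is not None else Path.home()
    return (home_dir / ".onecodex").is_file()


if __name__ == "__main__":
    if has_saved_credentials():
        print("Found saved One Codex credentials.")
    else:
        print("No ~/.onecodex file found; run `onecodex login` first.")
```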

The following command uploads one or more FASTQ files to your account, prompting you to interleave any paired-end data if applicable (which we recommend).

onecodex upload Sample1_R1_L001.fastq.gz Sample1_R2_L001.fastq.gz ...
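For batch uploads, a small script can assemble the upload command for you. The sketch below sorts filenames so that R1/R2 mates sit next to each other and then builds the argument list for `onecodex upload`. It assumes Illumina-style names containing `_R1` or `_R2`; the helper names `group_reads` and `build_upload_command` are hypothetical, and the CLI may pair files differently on its own.

```python
import re
from collections import defaultdict

# Assumed naming convention: <sample>_R1_... and <sample>_R2_...
READ_TAG = re.compile(r"_R([12])(?=[_.])")


def group_reads(filenames):
    """Order FASTQ filenames so that R1/R2 mates are adjacent."""
    groups = defaultdict(dict)
    for name in filenames:
        match = READ_TAG.search(name)
        if match:
            # Drop the read tag so both mates share the same key.
            key = name[:match.start()] + name[match.end():]
            groups[key][match.group(1)] = name
        else:
            groups[name]["single"] = name
    ordered = []
    for key in sorted(groups):
        ordered.extend(groups[key][tag] for tag in sorted(groups[key]))
    return ordered


def build_upload_command(filenames):
    """Return the argument list for a single `onecodex upload` call."""
    return ["onecodex", "upload", *group_reads(filenames)]


if __name__ == "__main__":
    files = ["Sample1_R2_L001.fastq.gz", "Sample1_R1_L001.fastq.gz"]
    print(" ".join(build_upload_command(files)))
```

To run the upload from Python, the resulting list can be passed to `subprocess.run(..., check=True)`, provided the `onecodex` command is installed and you have logged in.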

More documentation for the CLI can be found here.
