Many of our users have developed their own private pipelines for specific analyses that they wish to perform. But they often don't have the means or desire to manage the infrastructure required to run these pipelines. This is where One Codex comes in. We now have the tools to allow you to build, test, and run your workflows on the One Codex platform, without having to worry about spinning up servers or managing the queue of samples you wish to analyze. Our team of engineers has years of experience with server provisioning and queue management, so that you don't have to!

In this article, we'll walk you step by step through getting your workflow set up to run on One Codex.

Where to Create Your Workflows

To begin, click on the "Run Workflows" button on the left-hand navigation bar. At the top-right of the Workflows page, click "Create a New Workflow".

This will bring you to a new page to start building your draft workflow.

Creating a Workflow

On the workflow creation screen, you will see a number of fields to help set up your new workflow. The first section includes:

Workflow Type: choose between Shell Script or Nextflow pipeline (descriptions of each below)
Workflow Name: Choose a name for your workflow. This will be seen on other parts of the app, such as the page to run workflows, and the results menu for each sample.
Description (optional)
A checkbox to include a One Codex Bearer Token in your environment variables, to allow you to access the One Codex API from within your workflow. Learn more about Bearer Tokens.
Image: For shell scripts, this is the base image on which you will run your workflow
Nextflow version: For Nextflow pipelines, this allows you to specify the version of Nextflow you wish to use.
Git Repository: If your workflow is contained within a GitHub repository, you can provide an HTTPS URL to that repository. (Note that this is required for Nextflow workflows.) The job will clone the repository into the working directory of your workflow's instance, making it available to use in your script. (See this article for GitHub Integration.)
Repository Tag (optional): If working with a Git repository, you can select a specific tag to clone. Otherwise, we will clone the repository's default branch.
Compute Parameters: For Shell Scripts, we also provide sliders to choose your required CPUs, Memory, and Storage. (Note that for Nextflow workflows, you can specify the requirements for child tasks within your Nextflow code).

The second section of the workflow creation screen includes:

Script: Write your script into the box provided. We provide workflow type-specific recommendations within the Script box. See the details in the Workflow Types section below.
Dependencies (optional): Allows you to specify any other workflows you/your team have developed, which should be run on samples prior to running the current workflow draft. The outputs of Dependencies will be made available for use within this workflow. Learn more about Dependencies here.
Assets (optional, only available for users within organizations): An optional method of importing files required for your pipelines. See our developer documentation for more details on assets.
Arguments (optional): Can be used throughout your script, with their values specified at the time of workflow kickoff. See below for more details on Arguments.

Workflow Types

Workflows can be one of two types: Shell Scripts and Nextflow pipelines.

Shell Scripts

For shell scripted workflows, we provide a base Ubuntu Docker image. If you already have your own image, you can provide its URI instead.

Shell scripted workflows require you to specify basic server parameters, including the number of CPUs, memory, and storage requirements for your job.

We've pre-populated the "Script" box with a few details and comments to get you started.

Nextflow Pipelines

For Nextflow pipelines, we provide a selection of recent Nextflow images to work from. (If you require an image that is not listed in the dropdown menu, please reach out to the team!)

As resource requirements (CPU, memory, storage) can be specified within Nextflow pipelines and for specific sub-tasks, we don't provide an interface to set those requirements for Nextflow pipelines.

The "Script" box is pre-populated with some commands that you may need to get started. Note that paired-end files uploaded to One Codex are interleaved into a single file upon upload. The pre-populated script includes a command to deinterleave that file back into the R1 and R2 files and then write their names into input.csv, which is required for Nextflow pipelines.

Arguments

Arguments are variables that you can use throughout your script, and which you may want to be able to change between runs of the workflow, without having to change the script. Argument values can be set at the time of launch of the workflow on samples.

For both Shell scripts and Nextflow pipelines, we automatically pre-set the argument $OCX_SAMPLE_FILENAME with the filename from your sample. This is used as your input to your workflow.
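As a rough illustration, a shell script workflow might read the file named by $OCX_SAMPLE_FILENAME along the following lines. This is a minimal sketch, not One Codex's pre-populated script: it assumes the sample is a FASTQ file (optionally gzip-compressed) in the working directory, and the count_reads helper and the read_count.txt output name are hypothetical.

```shell
#!/bin/sh
# Sketch only: count_reads is a hypothetical helper, not a One Codex command.
set -eu

# Print the number of reads in a FASTQ file (4 lines per record).
# Input is decompressed on the fly when the name ends in .gz.
count_reads() {
    fastq=$1
    case "$fastq" in
        *.gz) lines=$(gzip -dc "$fastq" | wc -l) ;;
        *)    lines=$(wc -l < "$fastq") ;;
    esac
    echo $(($lines / 4))
}

# In a workflow, the input would come from the pre-set argument, e.g.:
#   count_reads "$OCX_SAMPLE_FILENAME" > /out/read_count.txt
```

Quoting "$OCX_SAMPLE_FILENAME" keeps the script working if the filename contains spaces.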

Paired-end files are interleaved into one file upon upload to One Codex. You may need to deinterleave your sample's file if your workflow requires the R1 and R2 files to be passed separately.
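To make the idea of deinterleaving concrete, here is one possible way to split an interleaved FASTQ file with awk. This is a sketch under stated assumptions, not the command One Codex provides in the pre-populated script: it assumes an uncompressed FASTQ with exactly four lines per record and records alternating R1, R2, R1, R2. The deinterleave_fastq function name and the output file names are hypothetical.

```shell
#!/bin/sh
# Sketch only: deinterleave_fastq is a hypothetical helper.
set -eu

# Split an interleaved FASTQ file into separate R1 and R2 files.
# Usage: deinterleave_fastq interleaved.fastq R1.fastq R2.fastq
deinterleave_fastq() {
    interleaved=$1
    r1=$2
    r2=$3
    awk -v r1="$r1" -v r2="$r2" '
        # Each record spans 4 lines. Even-numbered records (counting from 0)
        # go to R1 and odd-numbered records go to R2.
        { if (int((NR - 1) / 4) % 2 == 0) print > r1; else print > r2 }
    ' "$interleaved"
}

# In a workflow this might be called as (decompress first if the file is gzipped):
#   deinterleave_fastq "$OCX_SAMPLE_FILENAME" sample_R1.fastq sample_R2.fastq
```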

For shell scripts, we allow you to specify the various arguments you might require, including their value type, while building your workflow.

For Nextflow pipelines, once you set your GitHub repository at the top of the workflow creation screen, the arguments section is populated with the arguments from the nextflow_schema.json file in that repository.

An example of some of the arguments from a Nextflow pipeline can be seen below.

In the above screenshot, you will notice that there are two sections of arguments, corresponding to the different options under the "definitions" key of your nextflow_schema.json file.

For Shell scripts, you can choose to include any number of arguments. Creating an argument is simple. Click on the + Add field button under the Arguments section, which will provide you with a dialog box to set your argument details, such as the below.

For each argument, you will notice that we prepend $ARGS_ to the argument name, which distinguishes arguments that were set on the workflow creation page. For shell script arguments, once you name your argument, the $ARGS_ label above the name updates to include that name.

To the left of each $ARGS_ name, you will see the clipboard icon, allowing you to copy the full argument name for pasting into your script.

For shell scripts, arguments can be one of five types: boolean, number, integer, regex, or string. They are set as strings by default, but you can choose the argument type in the dropdown menu.
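The sketch below shows how argument values might be consumed inside a shell script. It is illustrative only: ARGS_MIN_LENGTH and ARGS_TRIM_ADAPTERS are hypothetical argument names, the --min-length and --trim-adapters flags belong to an imaginary downstream tool, and we assume here that a boolean argument reaches the script as the literal string "true".

```shell
#!/bin/sh
# Sketch only: argument names and tool flags below are hypothetical.
set -eu

# Fail with a clear message if a value is not a non-negative integer.
require_integer() {
    name=$1
    value=$2
    case "$value" in
        ''|*[!0-9]*)
            echo "error: $name must be a non-negative integer, got '$value'" >&2
            return 1
            ;;
    esac
}

# Build an option string for a downstream tool from two argument values.
# Assumes boolean arguments arrive as the literal string "true".
build_options() {
    min_length=$1
    trim_adapters=$2
    require_integer "ARGS_MIN_LENGTH" "$min_length" || return 1
    opts="--min-length $min_length"
    if [ "$trim_adapters" = "true" ]; then
        opts="$opts --trim-adapters"
    fi
    echo "$opts"
}

# In a workflow this might be called as:
#   build_options "$ARGS_MIN_LENGTH" "$ARGS_TRIM_ADAPTERS"
```

Validating values at the top of the script means a mistyped argument fails quickly, before any long-running step starts.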

You can set these arguments as "required" as needed. Any arguments that are marked as required must be passed at the time of launching an analysis using your workflow. If you have set default values for your arguments in your nextflow_schema.json, the default value will be passed automatically when launching through the browser, unless you override it.

Outputs

For any files that you wish to keep and make available once the analysis is complete, place them into the /out directory. This directory will be listed on the details page of any workflow run.

Note that we will display individual files for workflows with fewer than 1,000 output files. If a workflow has more than 1,000 files in the /out directory, the entire directory will be tarballed and individual files will not be displayed.
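If your workflow produces many small files, it may help to check the count at the end of your script. The following is a minimal sketch with hypothetical helper names. It assumes that files in subdirectories of /out count toward the total, which this article does not state.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.
set -eu

# Count regular files under a directory, including subdirectories.
count_output_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Print the file count and warn on stderr when it exceeds 1,000.
# Defaults to /out when no directory is given.
check_output_count() {
    out_dir=${1:-/out}
    n=$(count_output_files "$out_dir")
    if [ "$n" -gt 1000 ]; then
        echo "warning: $n files in $out_dir; expect a single tarball instead of individual files" >&2
    fi
    echo "$n"
}

# In a workflow this might be the last line of the script:
#   check_output_count /out
```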

Publishing your workflow

When a workflow has just been created, it is in a draft state. For version control purposes, you will want to publish your workflow so that it cannot be changed accidentally. Learn more about the various workflow states and their permissions here!

Copying Workflows

After a workflow has been published, you can't update the script, the repository, or the image version. This ensures that your workflows remain version-controlled. If you need to change a published workflow, you will need to create a new workflow. To make that easier, each workflow card (both published and draft) has an option to Clone the workflow. This creates a new draft workflow with the same script, image, repository, and arguments, allowing you to test your changes without impacting the published workflow!

What's Next?

Once your workflow is created, even in a draft state, you can run it on any of your samples. Learn about launching your workflows!
