Custom Workflows and Assets

Setting up your own custom workflows, including Nextflow, for your organization in One Codex

Written by Denise Lynch
Updated over a week ago

Introduction

One Codex is known for its comprehensive microbial databases and accompanying classification jobs. While we provide additional analysis jobs for our users, we know that you may already have your own private workflows, or may need to perform analyses that are not currently available through One Codex. We also know that the bioinformaticians writing pipelines are not necessarily the same people provisioning servers to run those pipelines. One of the less obvious areas of expertise on the One Codex team is the know-how to provision servers and automate their startup with the parameters a specific workflow requires. We've put this knowledge to use so that our customers can set up their own pipelines without needing to worry about servers.

Custom Workflows

Custom Workflows is a new One Codex feature that combines our engineering expertise in managing and automating server provisioning with your private analysis jobs. You'll find this new feature under the "Run Analyses" menu item on the left-hand navigation bar.

Creating Workflows

To create a new Workflow, click the "Create a New Job" button on the Run Analyses page. This will take you to the workflow creation screen.

On the workflow creation screen, you will see a number of fields to help you set up your new job:

  • Job Type: choose between Shell Script or Nextflow pipeline (descriptions of each below)

  • Job Name

  • Job Description (optional)

  • Image: the base image on which you will run your workflow

  • Git Repository (optional): If your workflow is contained within a GitHub repository, you can provide an HTTPS URL to that repository. The job will then clone the repository into the working directory of your workflow's instance, making it available to use in your script. (See the GitHub Integration section below.)

  • Repository Tag (optional): If working with a Git repository, you can select a specific tag to clone. If no tag is selected, we will clone the repository's default branch.

  • Assets (optional): An optional method of importing files required for your pipelines.

  • For Shell Scripts, we also provide sliders to choose your required CPUs, Memory, and Storage.

Job Types

Workflows can be one of two types: Shell Scripts and Nextflow pipelines.

Shell Scripts

For shell scripted workflows, we provide a selection of images to choose from as the base image for your workflow. These come with some pre-installed tools to help you get started.

  • Default (Ubuntu): Base Ubuntu image with no additional tools installed. (Note that all other images are also based on Ubuntu or Debian)

  • Minimap2: Minimap2 aligner pre-installed. Useful if your script is intended for alignment.

  • De Novo Assembler (Shovill): Shovill assembler is pre-installed. Useful for genome assembly.

  • Jupyter Notebook: Includes the One Codex notebook tooling, along with the One Codex library. Useful for exploring One Codex samples and results in a more in-depth way. Can also be used to generate PDF summary reports.

  • SARS-CoV-2 Report: Includes NextClade, and some other tools from our SARS-CoV-2 public repository.

Shell scripted workflows also require you to specify basic server parameters, including the number of CPUs, memory, and storage requirements for your job.
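When choosing CPU, memory, and storage values, it can help to log what the instance actually exposes at the start of a run. The following is a minimal sketch, assuming Python 3 is available in your chosen image and that the image is Linux-based (so /proc/meminfo exists). The function name describe_resources is hypothetical and not part of One Codex. Depending on how the job is isolated, these calls may report host-level values rather than your job's limits, so treat the output as a rough check only.

```python
import os
import shutil


def describe_resources(workdir="."):
    """Return a rough summary of the CPUs, memory, and free disk visible to this process."""
    cpus = os.cpu_count()  # may be None if the count cannot be determined
    disk = shutil.disk_usage(workdir)

    memory_gib = None
    try:
        # /proc/meminfo reports MemTotal in kB on Linux.
        with open("/proc/meminfo") as handle:
            for line in handle:
                if line.startswith("MemTotal:"):
                    memory_gib = int(line.split()[1]) / 1024**2
                    break
    except OSError:
        # Not on Linux, or /proc is unavailable: leave memory as None.
        pass

    return {
        "cpus": cpus,
        "memory_gib": memory_gib,
        "disk_free_gib": disk.free / 1024**3,
    }
```

Calling print(describe_resources()) near the top of a job would write these values to the job's logs.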

Nextflow Pipelines

For Nextflow pipelines, we provide a selection of recent Nextflow images to work from. (If you require an image that is not listed in the dropdown menu, please reach out to the team!)

As resource requirements (CPU, memory, storage) can be specified within Nextflow pipelines and for specific sub-tasks, we don't provide an interface to set those requirements for Nextflow pipelines.

Workflow States

You can view your available workflows by choosing "Your Jobs" on the "Run Analyses" page.

Workflows can be in one of two states: Draft or Published. By default, workflows are created in the "Draft" state. Draft workflows can be edited and re-run as many times as you like, but only on samples that you, as the workflow creator, have uploaded (not other users' samples).

When you're ready to go live, you can "Publish" the workflow. Once a workflow is published, only its name and description can be edited, but anyone in your organization can run it. You will be able to run published workflows on any samples that you have access to, not just samples that you own, so long as the sample owner has the required permissions to access the workflow. You will also have the option to automatically run your published workflow on any new samples that are uploaded to the organization. A published workflow can be run only once on a given sample.

If you want to revise a published workflow, you can copy the workflow to a new draft and modify from there.

Monitoring Running Workflows

You can view the progress of any of your running analyses on the "View Analyses" page. This page also gives you an at-a-glance view of all of your custom Workflow runs. Clicking into an individual analysis run will allow you to view live logs while the job is running (bottom panel).

Viewing Results

By default for Nextflow runs, we create a tarball of any files that remain in the working directory at the end of workflow execution, and make this tarball available from the "Other Analyses" section of the results dropdown for any sample that your custom workflows have been run on.

For shell scripts, all files in the top-level working directory will be available from the "Other Analyses" section of the sample's results options. Note that shell script results are not tarballed, and sub-directories and nested files will not be included in the outputs. Any files that you wish to make available at the end of a run will need to be in the top-level working directory.
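Because only top-level files are collected for shell script workflows, a job that writes into sub-directories needs a final step that moves or copies those files up. The following is a minimal sketch of one way to do that, assuming Python 3 is available in your chosen image. The function name promote_outputs and the "results" directory name are hypothetical. Nested paths are flattened into the file name so that files with the same name in different folders do not overwrite each other.

```python
import shutil
from pathlib import Path


def promote_outputs(workdir, nested_dir):
    """Copy every file under workdir/nested_dir into the top level of workdir.

    results/qc/stats.txt becomes results_qc_stats.txt, so same-named files
    in different sub-directories stay distinct.
    """
    workdir = Path(workdir)
    source_root = workdir / nested_dir
    if not source_root.is_dir():
        return []

    copied = []
    for path in sorted(source_root.rglob("*")):
        if not path.is_file():
            continue
        # Build a flat file name from the path relative to the working directory.
        flat_name = "_".join(path.relative_to(workdir).parts)
        shutil.copy2(path, workdir / flat_name)
        copied.append(flat_name)
    return copied
```

For example, calling promote_outputs(Path.cwd(), "results") as the last step of a job would place a copy of each nested result file in the top-level working directory.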

You can also access this section directly by visiting https://app.onecodex.com/analysis/<analysis uuid>/files
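If you are linking to results from your own tooling or reports, the URL pattern above can be filled in programmatically. This is a small sketch, assuming Python 3. The helper name analysis_files_url is hypothetical, and the example identifier in the usage note is a placeholder rather than a real analysis.

```python
from urllib.parse import quote

BASE_URL = "https://app.onecodex.com"


def analysis_files_url(analysis_uuid):
    """Build the results files URL for an analysis, following the pattern above."""
    analysis_uuid = analysis_uuid.strip()
    if not analysis_uuid:
        raise ValueError("analysis_uuid must not be empty")
    # Percent-encode the identifier so it is safe to place in a URL path.
    return f"{BASE_URL}/analysis/{quote(analysis_uuid, safe='')}/files"
```

For example, analysis_files_url("abc123") produces https://app.onecodex.com/analysis/abc123/files.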

GitHub Integration

To let your Custom Workflows access your GitHub repositories, we have developed a GitHub Integration. This allows you to connect your One Codex account to your GitHub account. You can also connect your One Codex organization to your organization's GitHub account. For more details, see our GitHub Integration article.

Custom Assets

A complementary feature for Custom Workflows is Assets. You may have some files or documents that you want to make available for your Workflows or for other members of your organization, which are not on GitHub. Our Assets feature allows you to upload those files for use across your organization and within your Custom Workflows. Visit the Assets page to upload Assets to your organization, and view available assets. You'll find the Assets page by clicking "Run Analyses" on the left-hand navigation bar, and choosing "Assets" at the top-right of the page.

When creating a new Custom Workflow, you'll see an option to add one or more Assets to your job. The selected assets will be stored in the /share directory of your workflow instance, where your workflow can access them.
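A workflow that depends on an asset can check for it up front and fail with a clear message if it is missing. The sketch below assumes Python 3 is available in your chosen image and that each asset appears in /share under its uploaded file name (an assumption, so check the contents of /share in a draft run). The helper name find_asset and the file name in the usage note are hypothetical.

```python
from pathlib import Path

SHARE_DIR = Path("/share")  # directory where attached assets are stored


def find_asset(name, share_dir=SHARE_DIR):
    """Return the path to an asset, or raise an error listing what is available."""
    share_dir = Path(share_dir)
    path = share_dir / name
    if path.is_file():
        return path

    # List what is actually present to make the error easier to act on.
    available = []
    if share_dir.is_dir():
        available = sorted(p.name for p in share_dir.iterdir())
    raise FileNotFoundError(
        f"Asset {name!r} not found in {share_dir}; available: {available}"
    )
```

For example, find_asset("reference.fasta") would return the path /share/reference.fasta if an asset with that file name was attached to the job.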
