Exporting Workflow Run RO-Crates from Galaxy

Author(s): Paul De Geest
Editor(s): Marie Josse
Overview
Questions:
  • What is a Workflow Run Crate?

  • How can I export a Galaxy Workflow Run Crate?

Objectives:
  • Understanding, viewing and creating Galaxy Workflow Run Crates

Requirements:
Time estimation: 30 minutes
Supporting Materials:
Published: May 11, 2023
Last modification: Aug 26, 2024
License: Tutorial Content is licensed under Apache-2.0. The GTN Framework is licensed under MIT
PURL: https://gxy.io/GTN:T00340
Revision: 8

Workflows are a powerful Galaxy feature that lets you scale up your analysis by running an end-to-end analysis with a single click. To preserve the provenance of a workflow invocation (one run or execution of the workflow), the invocation can be exported from Galaxy as an RO-Crate that follows the Workflow Run Crate profile.

Agenda

In this tutorial, we will cover:

  1. Enable RO-Crate on your local instance
  2. Import an example workflow
  3. Run the workflow
  4. Export the Workflow Run Crate

Additionally, the exported Workflow Run Crate allows for sharing workflow run provenance with those unfamiliar with Galaxy and its standard export format.

This tutorial will show you how to generate a Galaxy-based Workflow Run RO-Crate after running a workflow.

Hands-on: Choose Your Own Tutorial

This is a "Choose Your Own Tutorial" section, where you can select between multiple paths. Click one of the buttons below to select how you want to follow the tutorial.

Are you running Galaxy locally?

Enable RO-Crate on your local instance

Hands-on: Update your galaxy configuration
  • Go to your Galaxy folder on your computer
  • In your root Galaxy folder, navigate to the config folder, where the galaxy.yml file should be located, and open it.
  • (If you only find a galaxy.yml.sample file, copy it and name the copy galaxy.yml.)
  • Make sure the option enable_celery_tasks is set to true:
    galaxy:
        enable_celery_tasks: true
    

    That’s it! Now you can launch your local instance as usual.
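The configuration steps above can also be checked from a script. The following is a minimal sketch, not part of Galaxy itself: the function names `ensure_galaxy_config` and `celery_tasks_enabled` are hypothetical, and the check is a simple line match rather than a full YAML parse, so it only confirms that an uncommented `enable_celery_tasks: true` line exists somewhere in the file.

```python
import re
import shutil
from pathlib import Path


def ensure_galaxy_config(config_dir: Path) -> Path:
    """Create galaxy.yml from galaxy.yml.sample if galaxy.yml does not exist yet."""
    config = config_dir / "galaxy.yml"
    sample = config_dir / "galaxy.yml.sample"
    if not config.exists():
        shutil.copy(sample, config)
    return config


def celery_tasks_enabled(config: Path) -> bool:
    """Return True if an uncommented 'enable_celery_tasks: true' line is present.

    This is a plain text check (an assumption made for brevity), not a YAML parse.
    """
    pattern = re.compile(r"^\s*enable_celery_tasks:\s*true\s*(#.*)?$", re.IGNORECASE)
    return any(pattern.match(line) for line in config.read_text().splitlines())
```

For example, calling `celery_tasks_enabled(ensure_galaxy_config(Path("config")))` from the root Galaxy folder would report whether the option is already switched on.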

Import an example workflow

For this tutorial, we will use the workflow from the Galaxy 101 for everyone tutorial. If you have not done this tutorial yet, the only thing you need to know is that this is a workflow that takes as input a table of data about different species of iris plants, this table is subsequently sorted and filtered, and some plots are made. The specifics of the workflow are not important for this tutorial, only that it outputs a number of different kinds of outputs (images, tables, etc).

We will start by importing this workflow into your Galaxy account:

Hands-on: Import the workflow
  1. Import the workflow into Galaxy

    Hands-on: Importing and launching a GTN workflow
    Launch Galaxy 101 for Everyone (View on GitHub, Download workflow) workflow.
    • Click on Workflow on the top menu bar of Galaxy. You will see a list of all your workflows.
    • Click on galaxy-upload Import at the top-right of the screen
    • Paste the following URL into the box labelled “Archived Workflow URL”: https://training.galaxyproject.org/training-material/topics/galaxy-interface/tutorials/workflow-reports/workflows/galaxy-101-everyone.ga
    • Click the Import workflow button

    Below is a short video demonstrating how to import a workflow from GitHub using this procedure:

    Video: Importing a workflow from URL
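If you would like to see which steps the imported workflow contains without opening the editor, the downloaded .ga file can be inspected directly, since it is a JSON document. The sketch below assumes the file has a top-level `steps` mapping whose entries carry `type` and `name` fields; the helper name `list_workflow_steps` is hypothetical.

```python
import json
from pathlib import Path


def list_workflow_steps(ga_path):
    """Return (step index, step type, step name) for each step in a .ga file."""
    workflow = json.loads(Path(ga_path).read_text())
    steps = workflow.get("steps", {})
    # Step keys are strings such as "0", "1", ...; sort them numerically
    return [
        (int(key), step.get("type"), step.get("name"))
        for key, step in sorted(steps.items(), key=lambda item: int(item[0]))
    ]
```

Calling `list_workflow_steps("galaxy-101-everyone.ga")` on a local copy of the workflow file would return one tuple per step, in execution-graph order of their indices.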

Run the workflow

Galaxy offers several export options for any workflow invocation. The default export gives us a serialization of the invocation data model, while the RO-Crate export gives a Workflow Run Crate, which includes the default export as well.

Let’s run the workflow and export the RO-Crate.

Hands-on: Run the workflow
  1. Import the file iris.csv via link

    https://zenodo.org/record/1319069/files/iris.csv
    
    • Copy the link location
    • Click galaxy-upload Upload Data at the top of the tool panel

    • Select galaxy-wf-edit Paste/Fetch Data
    • Paste the link(s) into the text field

    • Press Start

    • Close the window

  2. Run GTN Training: Galaxy 101 For Everyone workflow using the following parameters:
    • “Send results to a new history”: No
    • “1: Iris Dataset”: the iris.csv file we just uploaded
    • Click on Workflow on the top menu bar of Galaxy. You will see a list of all your workflows.
    • Click on the workflow-run (Run workflow) button next to your workflow
    • Configure the workflow as needed
    • Click the Run Workflow button at the top-right of the screen
    • You may have to refresh your history to see the queued jobs

  3. View the workflow outputs once the workflow has completed
    • The workflow produces several text and tabular outputs, and two plot (image) outputs

Export the Workflow Run Crate

After the workflow has completed, we can export the RO-Crate. The crate does not appear in your history, but can be accessed from the galaxy-history-options -> Show Invocations menu on the top right of your history, or from galaxy-panelview Workflow Invocations in the left panel.

Hands-on: Export the Workflow Run Crate
  1. In the top right of your history, go to galaxy-history-options -> Show Invocations

    screenshot of the history options to get the Show invocations button.

  2. Our latest workflow run should be listed at the top.
    • Click on it to expand it:

    screenshot of the workflow invocations menu, with our latest invocation at the top.

  3. Click on the Export tab in the expanded view of the workflow invocation. You should see a page that contains three download options:
    • Research Object Crate (RO-Crate)
    • BioCompute Object
    • File

  4. Click on the Generate galaxy-download option of the RO-Crate box (1st box)

    screenshot of the beginning of the workflow run export options.

Great work! You have created a Workflow Run Crate. This makes it easy to track the provenance of the executed workflow.
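To get a first look at what the downloaded crate contains, you can read its metadata file. The sketch below assumes the export is a zip archive with `ro-crate-metadata.json` at its root, as the RO-Crate specification describes for the crate root; the function name `summarize_crate` and the archive file name in the usage note are hypothetical.

```python
import json
import zipfile
from collections import Counter


def summarize_crate(zip_path):
    """Count entity types listed in the ro-crate-metadata.json of a zipped RO-Crate."""
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the metadata file sits at the root of the archive
        with archive.open("ro-crate-metadata.json") as handle:
            metadata = json.load(handle)
    counts = Counter()
    for entity in metadata.get("@graph", []):
        types = entity.get("@type", [])
        # "@type" may be a single string or a list of strings
        if isinstance(types, str):
            types = [types]
        counts.update(types)
    return counts
```

For instance, `summarize_crate("workflow-run.rocrate.zip")` returns a counter of entity types, which is a quick way to see how many files and actions the crate describes.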