A short introduction to Galaxy

Overview
Questions:
  • How to get started in Galaxy

Objectives:
  • Learn how to upload a file

  • Learn how to use a tool

  • Learn how to view results

  • Learn how to view histories

  • Learn how to extract and run a workflow

  • Learn how to share a history

Time estimation: 40 minutes
Level: Introductory
Published: Aug 27, 2018
Last modification: May 3, 2024
License: Tutorial Content is licensed under Creative Commons Attribution 4.0 International License. The GTN Framework is licensed under MIT
PURL: https://gxy.io/GTN:T00190
Rating: 4.8 (57 recent ratings, 500 all time)
Revision: 49

Overview

  • This is a short introduction to the Galaxy user interface - the web page that you interact with.
  • We will cover key tasks in Galaxy: uploading files, using tools, viewing histories, and running workflows.
Agenda
  1. Overview
    1. What does Galaxy look like?
  2. Key Galaxy actions
    1. Name your current history
    2. Upload a file
    3. Use a tool
    4. View results
    5. Run another tool
    6. Re-run that tool with changed settings
    7. Share your history
    8. Convert your analysis history into a workflow
    9. Create a new history
    10. Look at all your histories
    11. Run workflow in the new history
  3. Conclusion

What does Galaxy look like?

Hands-on: Log in to Galaxy
  1. Open your favorite browser (Chrome, Safari, or Firefox; not Internet Explorer!)
  2. Browse to your Galaxy instance
  3. Log in or register

Screenshot of Galaxy Australia with the register or login button highlighted.

Comment: Different Galaxy servers

This is an image of Galaxy Australia, located at usegalaxy.org.au.

The particular Galaxy server that you are using may look slightly different and have a different web address.

You can find other Galaxy servers at the top of this tutorial, under “Available on these Galaxies”.

The Galaxy homepage is divided into three panels:

  • Tools on the left
  • Viewing panel in the middle
  • History of analysis and files on the right

Screenshot of the Galaxy interface with aforementioned structure.

The first time you use Galaxy, there will be no files in your history panel.

Key Galaxy actions

Name your current history

Your “History” is in the panel at the right.

Hands-on: Name history
  1. Go to the History panel (on the right)
  2. Click on the pencil icon (Edit) next to the history name (which by default is “Unnamed history”)

    Screenshot of the galaxy interface with the history name being edited, it currently reads "Unnamed history", the default value. An input box is below it.

    Comment

    In some earlier versions of Galaxy, you need to click on the history name itself to rename it, as shown here:

    Screenshot of the galaxy interface with the history name being edited, it currently reads "Unnamed history", the default value.

  3. Type in a new name, for example, “My Analysis”
  4. Click on Save
Comment: Renaming not an option?

If renaming does not work, it is possible you aren’t logged in, so try logging in to Galaxy first. Anonymous users are only permitted to have one history, and they cannot rename it.

Upload a file

Your “Tools” are in the panel at the left.

Hands-on: Upload a file from URL
  1. At the top of the Tools panel (on the left), click Upload

    upload data button shown in the galaxy interface.

    This brings up a box:

    the complicated galaxy upload dialog, the 'regular' tab is active with a large textarea to paste subsequent URL.

  2. Click Paste/Fetch data
  3. Paste in the address of a file:

    https://zenodo.org/record/582600/files/mutant_R1.fastq
    
  4. Click Start
  5. Click Close

Your uploaded file is now in your current history. When the file has uploaded to Galaxy, it will turn green.

Comment

After this you will see your first history item (called a “dataset”) in Galaxy’s right panel. It will go through the gray (preparing/queued) and yellow (running) states to become green (success).

Sometimes during courses, data upload gets a little slow. You can also import data through a history link.

  1. Import history from: example input history

    1. Open the link to the shared history
    2. Click on the Import history button on the top right
    3. Enter a title for the new history
    4. Click on Import

  2. Rename the history to a name of your choice.

What is this file?

Hands-on: View the dataset content
  1. Click on the eye icon next to the dataset name to look at the file content

    galaxy history view showing a single dataset mutant_r1.fastq. Display link is being hovered.

The contents of the file will be displayed in the central Galaxy panel.

This file contains DNA sequencing reads from a bacterium, in FASTQ format:

preview of a fastq file showing the 4 line structure described in fig caption. 3 reads are shown.

Figure 1: A FASTQ file has four lines per record: the record identifier (`@mutant-no_snps.gff-24960/`), the sequence (`AATG…`), the plus character (`+`), and the quality scores for the sequence (`5??A…`).
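The four-line record structure can be made concrete with a short parser. This is a minimal sketch, not part of Galaxy: the names `FastqRecord` and `read_fastq` and the two example reads are made up for illustration, and the code assumes every record uses exactly four lines (no wrapped sequences).

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    identifier: str  # header line without the leading "@"
    sequence: str    # the bases, e.g. "AATG..."
    quality: str     # one quality character per base


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream that uses exactly four lines per record."""
    while True:
        header = handle.readline()
        if header == "":      # end of file
            return
        header = header.strip()
        if header == "":      # tolerate blank lines between records
            continue
        sequence = handle.readline().strip()
        separator = handle.readline().strip()
        quality = handle.readline().strip()
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


# Two made-up reads, used only to exercise the parser.
EXAMPLE = "@read_1\nAATGC\n+\n5??AB\n@read_2\nGGCTA\n+\nIIIII\n"

records = list(read_fastq(io.StringIO(EXAMPLE)))
for record in records:
    print(record.identifier, len(record.sequence))
```

Reading in fixed groups of four lines, rather than looking for lines that start with `@`, matters because `@` is also a valid quality character and can appear at the start of a quality line.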

Use a tool

Let’s look at the quality of the reads in this file.

Hands-on: Use a tool
  1. Type FastQC in the tools panel search box (top)
  2. Click on the FastQC (Galaxy version 0.73+galaxy0) tool

    The tool will be displayed in the central Galaxy panel.

  3. Select the following parameters:
    • “Raw read data from your current history”: the FASTQ dataset that we uploaded
    • No change in the other parameters
  4. Click Execute

This tool will run and two new output datasets will appear at the top of your history panel.

Tools are frequently updated to new versions. Your Galaxy may have multiple versions of the same tool available. By default, you will be shown the latest version of the tool. This may NOT be the same version used in the tutorial you are following. Furthermore, if you use a newer tool in one step and an older tool in the next step, this may fail! To ensure you use the same tool versions as a given tutorial, use the Tutorial mode feature.

  • Open your Galaxy server
  • Click on the curriculum icon on the top menu; this opens the GTN inside Galaxy.
  • Navigate to your tutorial
  • Tool names in tutorials will be blue buttons that open the correct tool for you
  • Note: this does not work for all tutorials (yet)

    gif showing how GTN-in-Galaxy works

  • You can click anywhere in the greyed-out area outside of the tutorial box to return to the Galaxy analytical interface
Warning: Not all browsers work!
  • We’ve had some issues with Tutorial mode on Safari for Mac users.
  • Try a different browser if you aren’t seeing the button.

View results

We will now look at the output dataset called FastQC on data 1: Webpage.

Comment
  • Note that Galaxy has given this dataset a name according to both the tool name (“FastQC”) and the input (“data 1”) that it used.
  • The name “data 1” means the dataset number 1 in Galaxy’s current history (our FASTQ file).
Hands-on: View results
  • Once it’s green, click on the eye icon next to the “Webpage” output dataset.

    The information is displayed in the central panel.

    Graph from fastqc's report. fastqc's images themselves are inaccessible, but this graph shows overall mostly green (good) sequences scores across the length of the read.

This tool has summarised information about all of the reads in our FASTQ file.
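One kind of summary shown in the report is how quality varies along the length of the reads. The sketch below computes a mean quality score per read position. It is an illustration only, not FastQC's implementation: the function name is hypothetical, the quality strings are made up, and the Phred+33 offset (score = ASCII code minus 33) is an assumption about how the file encodes qualities.

```python
from typing import List


def mean_quality_per_position(qualities: List[str], offset: int = 33) -> List[float]:
    """Average the decoded quality score at each read position.

    `offset=33` assumes Phred+33 encoding; reads may differ in length.
    """
    if not qualities:
        return []
    longest = max(len(q) for q in qualities)
    totals = [0] * longest
    counts = [0] * longest
    for quality in qualities:
        for position, char in enumerate(quality):
            totals[position] += ord(char) - offset
            counts[position] += 1
    # Every position up to `longest` is covered by at least one read,
    # so no count is zero here.
    return [total / count for total, count in zip(totals, counts)]


# Made-up quality strings: lower scores at both ends, higher in the middle.
toy_qualities = ["5II5", "?II?"]
print(mean_quality_per_position(toy_qualities))
```

A profile that is high in the middle and lower at the ends, as in this toy input, is the pattern the questions below ask you to look for in the real report.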

Question
  1. What was the length of the reads in the input FASTQ file?
  2. Do these reads have higher quality scores in the centre or at the ends?
Solution
  1. 150 bp
  2. In the centre

Run another tool

Let’s run a tool to filter out lower-quality reads from our FASTQ file.

Hands-on: Run another tool
  1. Type Filter by quality in the tools panel search box (top)
  2. Click on the tool Filter by quality (Galaxy version 1.0.2+galaxy0)
  3. Set the following parameters:
    • “Input FASTQ file”: our initial FASTQ dataset
    • “Quality cut-off value”: 35
    • “Percent of bases in sequence that must have quality equal to / higher than cut-off value”: 80
  4. Click Execute
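The two parameters above combine into one rule: a read is kept when at least the given percentage of its bases have a quality at or above the cut-off. The following is a minimal sketch of that rule as read from the parameter labels, not the tool's actual code. The function names are hypothetical, the quality strings are made up, and Phred+33 encoding is assumed.

```python
from typing import List


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a quality string into integer scores (assumes Phred+33)."""
    return [ord(char) - offset for char in quality]


def passes_filter(quality: str, cutoff: int = 35, min_percent: float = 80.0) -> bool:
    """Return True when at least `min_percent` of bases score >= `cutoff`."""
    scores = phred_scores(quality)
    if not scores:
        return False  # an empty read cannot satisfy the threshold
    good = sum(1 for score in scores if score >= cutoff)
    return 100.0 * good / len(scores) >= min_percent


# Made-up quality strings: "I" decodes to 40 and "?" decodes to 30.
print(passes_filter("IIIIIIII??"))  # 8 of 10 bases reach the cut-off
print(passes_filter("IIIIIII???"))  # only 7 of 10 bases reach the cut-off
```

With the tutorial's settings (cut-off 35, 80 percent), the first toy read sits exactly on the threshold and is kept, while the second falls below it and is discarded.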

After the tool has run, its output dataset will appear at the top of your History panel.

  • This dataset will be called “Filter by quality on data 1”.
  • Remember that Galaxy has named this file according to the tool it used (“Filter by quality”) and the input dataset (“data 1”).
  • The actual numbers in front of the datasets in the history are not important.

What are the results from this filtering tool?

We could click on the eye icon to view the contents of this output file, but it will not be very informative - we will just see a list of reads.

Hands-on: Get metadata about a file
  1. Click on the output dataset name in the History panel.

    This expands the information about the file.

    Diagram of how to locate the information. As above, clicking on the name expands the dataset, and an info section is shown with the filter settings. 1786 (14%) discarded.

Question

How many reads were discarded?

Solution

1786 low-quality reads (14%) were discarded.

Re-run that tool with changed settings

We can now try to filter our input reads to an even higher standard, and see how this changes the resulting output (an exploratory analysis). We will change the filter settings and re-run the tool.

Hands-on: Re-run the tool
  1. Click on the refresh icon (Run this job again) on the output dataset of the Filter by quality tool

    A dataset is expanded showing the Run Job Again button highlighted.

    This brings up the tool interface in the central panel with the parameters set to the values used previously to generate this dataset.

  2. Change the settings to something even stricter

    For example, you might decide you want 80 percent of bases to have a quality of 36 or higher, instead of 35.

  3. Click Execute
  4. View the results: Click on the output dataset name to expand the information

    Comment

    Click on the dataset name, not the eye icon.

Question

How many reads were discarded under these new filtering conditions?

Solution

If you selected 80% of bases with 36 as quality cut-off, then 11517 reads (92%) should have been discarded, which indicates that we have gone too far with the filtering in this case.

You can re-run a tool many times with different settings. Each time you re-run the tool, its new output datasets will appear at the top of your current history.
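Re-running with different settings amounts to a small parameter sweep. The sketch below counts how many reads would be discarded at several cut-offs, holding the required percentage at 80. It is a toy illustration: the function names and quality strings are made up, Phred+33 encoding is assumed, and the counts it prints say nothing about the tutorial dataset.

```python
from typing import Dict, Iterable, List


def percent_at_or_above(quality: str, cutoff: int, offset: int = 33) -> float:
    """Percentage of bases whose decoded score is >= cutoff (assumes Phred+33)."""
    if not quality:
        return 0.0
    good = sum(1 for char in quality if ord(char) - offset >= cutoff)
    return 100.0 * good / len(quality)


def count_discarded(qualities: List[str], cutoff: int, min_percent: float = 80.0) -> int:
    """Number of reads that fail the filter at this cut-off."""
    return sum(1 for q in qualities if percent_at_or_above(q, cutoff) < min_percent)


def sweep_cutoffs(qualities: List[str], cutoffs: Iterable[int]) -> Dict[int, int]:
    """Map each cut-off to the number of reads it would discard."""
    return {cutoff: count_discarded(qualities, cutoff) for cutoff in cutoffs}


# Made-up quality strings: "I" = 40, "D" = 35, "5" = 20.
toy_qualities = ["IIIIIIIIII", "DDDDDDDDDD", "DDDDDDDD55", "5555555555"]
for cutoff, discarded in sweep_cutoffs(toy_qualities, [35, 36]).items():
    print(f"cut-off {cutoff}: {discarded} of {len(toy_qualities)} reads discarded")
```

In this toy input, raising the cut-off by a single point removes every read whose bases sit exactly at 35, which shows how a small change in settings can produce a large jump in discarded reads.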

Share your history

Finally, let’s imagine that you had a problem in your analysis and you want to ask for help. The easiest way to ask for help is to share your history. Try to create a link for your history and share it with… yourself!

Sharing your history allows others to import and access the datasets, parameters, and steps of your history.

Access the history sharing menu via the History Options dropdown by clicking “Share or Publish”.

  1. Share via link
    • Open the History Options menu at the top of your history panel and select “Share or Publish”
      • Toggle Make History accessible
      • A Share Link will appear that you can give to others
    • Anybody who has this link can view and copy your history
  2. Publish your history
    • Toggle Make History publicly available in Published Histories
    • Anybody on this Galaxy server will see your history listed under the Shared Data menu
  3. Share only with another user.
    • Click the Share with a user button at the bottom
    • Enter an email address for the user you want to share with
    • Your history will be shared only with this user.
  4. Finding histories others have shared with me
    • Click on User menu on the top bar
    • Select Histories shared with me
    • Here you will see all the histories others have shared with you directly

Note: If you want to make changes to your history without affecting the shared version, make a copy by opening the History Options menu in your history and clicking Copy this History.

Convert your analysis history into a workflow

When you look carefully at your history, you can see that it contains all the steps of our analysis, from the beginning (at the bottom) to the end (on top). The history in Galaxy records details of every tool you run and preserves all parameter settings applied at each step. But when you need to analyze new data, it would be tedious to do each step one-by-one again. Wouldn’t it be nice to just convert this history into a workflow that we will be able to execute again and again?

Galaxy makes this very easy with the Extract workflow option. This means any time you want to build a workflow, you can just perform the steps once manually, and then convert it to a workflow, so that next time it will be a lot less work to do the same analysis.
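The idea of performing steps once and then replaying them can be sketched in a few lines. This is a conceptual illustration only and does not reflect how Galaxy stores histories or workflows: the `History` and `Step` classes and the two toy tools are hypothetical, and the sketch chains steps linearly, whereas a real workflow can feed one input to several tools.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    tool: Callable[..., Any]  # the function that was run
    params: Dict[str, Any]    # the settings it was run with


@dataclass
class History:
    steps: List[Step] = field(default_factory=list)

    def run(self, tool: Callable[..., Any], data: Any, **params: Any) -> Any:
        """Run a tool now and remember which tool and settings were used."""
        self.steps.append(Step(tool, params))
        return tool(data, **params)

    def extract_workflow(self) -> Callable[[Any], Any]:
        """Return a function that replays the recorded steps on new data."""
        recorded = list(self.steps)

        def workflow(data: Any) -> Any:
            for step in recorded:
                data = step.tool(data, **step.params)
            return data

        return workflow


# Two toy "tools" standing in for real analysis steps.
def drop_short(reads: List[str], min_length: int) -> List[str]:
    return [read for read in reads if len(read) >= min_length]


def uppercase(reads: List[str]) -> List[str]:
    return [read.upper() for read in reads]


history = History()
filtered = history.run(drop_short, ["acgt", "ac", "acgtacgt"], min_length=4)
result = history.run(uppercase, filtered)

workflow = history.extract_workflow()
print(workflow(["acgtac", "ac"]))  # same steps, same settings, new input
```

Because the settings are stored with each step, the replay needs only the new input, which is the same convenience the extracted workflow provides.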

Hands-on: Extract workflow
  1. Clean up your history: remove any failed (red) jobs from your history by clicking on the delete button.

    This will make the creation of the workflow easier.

  2. Click on History options at the top of your history panel and select Extract workflow.

    'Extract Workflow' entry in the history options menu.

    The central panel will show the content of the history in reverse order (oldest on top), and you will be able to choose which steps to include in the workflow.

    Selection of steps for Extract Workflow from history. All three of fastqc, filter by quality, and the second filter by quality are selected.

  3. Replace the workflow name with something more descriptive, for example: QC and filtering.

  4. Rename the workflow input in the box at the top of the second column to: FASTQ reads

  5. If there are any steps that shouldn’t be included in the workflow, you can uncheck them in the first column of boxes. In this case, uncheck the second Filter by quality tool at the bottom, where we used a quality cut-off that was too high.

  6. Click on the Create Workflow button near the top.

    You will get a message that the workflow was created.

In a minute we will see how to find the extracted workflow and how to use it.

Create a new history

Let’s create a new history.

Hands-on: New history
  1. Create a new history

    Click the new history icon at the top of the history panel:

    UI for creating new history

  2. Rename your history, e.g. “Next Analysis”

    1. Click on the pencil icon (Edit) next to the history name (which by default is “Unnamed history”)
    2. Type the new name
    3. Click on Save

    If you do not have the pencil icon (Edit) next to the history name:

    1. Click on Unnamed history (or the current name of the history) (Click to rename history) at the top of your history panel
    2. Type the new name
    3. Press Enter

This new history does not have any datasets in it yet.

Look at all your histories

Where is your first history, called “My Analysis”?

Hands-on: View histories
  1. Click on History options and then click on Show Histories side-by-side

    History options menu dropdown showing you have 163 histories, and a show histories side-by-side button.

    A new page will appear with all your histories displayed.

  2. Copy a dataset into your new history
    1. Click on the FASTQ dataset in “My Analysis” history
    2. Drag it into the “Next Analysis” history
    Gif of copying datasets between histories in the side-by-side history view. For now this feature is not keyboard accessible, it is a known issue.

    Figure 2: Copy a dataset between histories by dragging it

    This makes a copy of the dataset in the new history (without actually using additional disk space).

  3. Click on the Home icon (or Analyze Data on older versions of Galaxy) in the top panel to go back to your analysis window

Your main Galaxy window will now show “Next Analysis” as the current history, and it will have one dataset in it.

At any time, you can go back into the “View all histories” page and “Switch to” a different history.

Run workflow in the new history

Now that we have built our workflow, let’s use it to re-create our small analysis in a single step. The same workflow could also be used on some new FASTQ data to quickly repeat the same analysis on different inputs.

Hands-on: Run workflow
  1. Click on Workflow in the top menu bar of Galaxy.

    Here you have a list of all your workflows. Your newly created workflow should be listed at the top:

    Workflow list page showing a single workflow named QC and filtering.

    If you click on a workflow name, you can see all available actions for the workflow, e.g. edit, copy, rename, delete.

  2. Click on the Run workflow button next to your workflow.

    The central panel will change to allow you to configure and launch the workflow.

    Run workflow form with a single input: FASTQ reads. mutant_r1.fastq is selected as the input dataset for that parameter.

  3. Check that the “FASTQ reads” input is set to the FASTQ dataset we have copied to the new history.

    On this page we could change any parameter of the tools that make up the workflow, just as we would when running them one by one.

  4. Click the Run Workflow button at the top-right of the screen.

    You should see a message that the workflow was successfully invoked. Then jobs will start to run and datasets appear in your “Next Analysis” history, replicating the steps of your previous history.

Conclusion

Well done! You have completed the short introduction to Galaxy, where you named a history, uploaded a file, used a tool, viewed results, and ran a workflow. Additional tutorials are available for a more in-depth introduction to Galaxy’s features.