Use Jupyter notebooks in Galaxy

Author(s): Delphine Lariviere
Editor(s): Teresa Müller
Reviewers: Helena Rasche, Saskia Hiltemann, Nicola Soranzo, Delphine Lariviere, Niall Beard, Bérénice Batut
Overview
Creative Commons License: CC-BY
Questions:
  • How to open a Jupyter Notebook in Galaxy?

  • How to update dependencies in a Jupyter Interactive Environment?

  • How to save and share results in the Galaxy History?

Objectives:
  • Learn about the Jupyter Interactive Environment

  • Load data into a Jupyter Interactive Environment

  • Install library dependencies

  • Save a notebook to the Galaxy history

Requirements:
Time estimation: 1 hour
Supporting Materials:
Published: Jul 2, 2018
Last modification: Apr 25, 2025
License: Tutorial Content is licensed under Creative Commons Attribution 4.0 International License. The GTN Framework is licensed under MIT
PURL: https://gxy.io/GTN:T00148
Rating: 2.0 (0 recent ratings, 4 all time)
Revision: 8

In this tutorial, we are going to explore the basics of using JupyterLab in Galaxy. We will use the Gapminder data as a test set to get the hang of Jupyter notebooks. The python-novice-gapminder-data.zip file is publicly available. This tutorial can also be used as an initial setup for the Software Carpentries training Plotting and Programming in Python.

Agenda

In this tutorial, we will cover:

  1. What is Jupyter?
  2. Use Jupyter notebook in Galaxy
    1. Import data
    2. Open the JupyterLab environment
    3. Start your first notebook
    4. Install and import libraries using Conda
    5. Import data
    6. Graph Display in Jupyter
    7. Export Data
    8. Save the Notebook in your history
  3. Conclusion

What is Jupyter?

Jupyter is an interactive environment that combines explanatory text, executable code, and the output of that code in a single document. Its integration in Galaxy lets you run additional analyses for which no Galaxy tool exists.

These notebooks allow you to replace any in-house script you might need to complete your analysis. You don’t need to move your data out of Galaxy. You can describe each step of your analysis in the markdown cells for an easy understanding of the processes, and save it in your history for sharing and reproducibility. In addition, thanks to Jupyter magic commands, you can use several different languages in a single notebook.

Jupyter notebook.

You can find the complete manual for Jupyter commands on Read the Docs.

Use Jupyter notebook in Galaxy

Import data

To manipulate data, we first upload the files from the python-novice-gapminder-data.zip archive into your Galaxy history. To add the files, you can either upload them from your computer or import them from Zenodo.

Hands On: Data upload using Zenodo
  1. Create a new history for this Jupyter notebook exercise

    To create a new history simply click the new-history icon at the top of the history panel:

    UI for creating new history

  2. Import the following tabular files from Zenodo:

    https://zenodo.org/record/15263830/files/gapminder_all.csv
    https://zenodo.org/record/15263830/files/gapminder_gdp_africa.csv
    https://zenodo.org/record/15263830/files/gapminder_gdp_americas.csv
    https://zenodo.org/record/15263830/files/gapminder_gdp_asia.csv
    https://zenodo.org/record/15263830/files/gapminder_gdp_europe.csv
    https://zenodo.org/record/15263830/files/gapminder_gdp_oceania.csv
    
    • Copy the link location
    • Click galaxy-upload Upload Data at the top of the tool panel

    • Select galaxy-wf-edit Paste/Fetch Data
    • Paste the link(s) into the text field

    • Press Start

    • Close the window

  3. Make sure the files are imported as CSV by expanding the box of each imported file in your history and checking the format. If a file has a different format, change its datatype:

  • Click on the galaxy-pencil pencil icon for the dataset to edit its attributes
  • In the central panel, click galaxy-chart-select-data Datatypes tab on the top
  • In the galaxy-chart-select-data Assign Datatype, select csv from “New type” dropdown
    • Tip: you can start typing the datatype into the field to filter the dropdown menu
  • Click the Save button

Open the JupyterLab environment

Opening up your JupyterLab:

Hands On: Launch JupyterLab

Currently JupyterLab in Galaxy is available on Live.useGalaxy.eu, usegalaxy.org and usegalaxy.eu.

Hands On: Run JupyterLab
  1. Open the Interactive Jupyter Notebook tool. Note that on some Galaxy instances it is called Interactive JupyTool and notebook.
  2. Click Run Tool
  3. The tool will start and keep running until you stop it

    This may take a moment, but once the Executed notebook dataset in your history turns orange, you are up and running!

  4. On the left menu bar you should see the Interactive Tools Icon now. Click on it to open the Active Interactive Tools and locate the JupyterLab instance you started.
  5. Click on your JupyterLab instance (JupyTool interactive tool)

If JupyterLab is not available on the Galaxy instance:

  1. Start Try JupyterLab

You should now be looking at a page with the JupyterLab interface:

Jupyterlab default session.

As shown in the figure above, the JupyterLab interface consists of 3 main areas:

  • The menu bar at the top
  • The left side bar with, in particular, the File Browser
  • The main work area in the central panel

Start your first notebook

Now that we are ready to start exploring JupyterLab, let’s open a Python notebook. A pre-opened notebook called ipython_galaxy_notebook.ipynb is available in the file browser on the left side. For this training, however, we will open a new Jupyter notebook, select a kernel and give it a name.

Hands On: Start a notebook
  1. Open a new Jupyter notebook

    There is more than one way to open a Jupyter notebook. One option is:

    • Click on File in the top menu bar
    • Select New -> Notebook
    • Choose the kernel Python [conda env:python-kernel-3.12] from the dropdown menu and click Select
  2. Change the kernel

    If the kernel Python [conda env:python-kernel-3.12] was chosen in the previous step, it should appear in the upper right corner of the notebook file. If not, this is the location where the kernel can be switched.

    • Click on the field that displays the current kernel e.g. Python [conda env:base]
    • Now select the kernel Python [conda env:python-kernel-3.12] from the drop down menu
  3. Name the Jupyter notebook

    There are several ways to name or rename your Jupyter notebook. One way is to:

    • Click on File and select Save File As…
    • Enter your file name e.g. first_galaxy_notebook.ipynb. Note that your file needs to end with .ipynb
    • Click on Save

Install and import libraries using Conda

Some dependencies or programming libraries may not be available in the kernel your Jupyter Notebook is using. If that’s the case, you can install or update these libraries using Conda.
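
Before installing anything, it can help to check which libraries the current kernel already provides. The following is a minimal sketch, not part of the original tutorial: the helper name missing_packages is hypothetical, and it checks import names, which are not always identical to Conda package names.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level module names that cannot be imported in this kernel."""
    return [name for name in names if find_spec(name) is None]


# Libraries used later in this tutorial; adjust the list to your needs.
wanted = ["pandas", "seaborn", "matplotlib"]
to_install = missing_packages(wanted)
print("Libraries to install with Conda:", to_install or "none")
```

Only the libraries reported as missing need to be installed in the next step.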

Hands On: Install from a Conda recipe
  1. Click on a cell of your notebook to edit it (verify that it is defined as a Code cell)
  2. Enter the following lines :
    !conda install -y pandas
    !conda install -y seaborn
    
    • The ! indicates that you are typing a bash command. Alternatively, you can add the line %%bash at the beginning of your cell. In that case the whole cell will be run as bash commands.
    • The -y option runs the installation without asking for confirmation (notebooks do not handle the confirmation prompt well)
  3. Press Shift+Return to run the cell, or click on the run cell button.

Now you will be able to import these Python libraries and use them in your code.

Hands On: Import Python libraries
  1. Click on a cell of your notebook to edit it (verify that it is defined as a Code cell)

  2. Enter the following lines :
    import pandas as pd
    import seaborn as sns
    from IPython.display import display
    import matplotlib.pyplot as plt
    
  3. Press Shift+Return to run the cell, or click on the run cell button.

If you wish to follow the Software Carpentries training Plotting and Programming in Python after finishing this Jupyter Notebook introduction, you should install the following Python libraries.

Hands On: Install Python libraries for a Python introduction
  1. Copy the following install command and paste it into an empty cell of your notebook
    !conda install -y matplotlib
    
    • The math, glob and pathlib modules used in that training are part of the Python standard library, so they can be imported directly and do not need to be installed with Conda
  2. Press Shift+Return to run the cell, or click on the run cell button.

Import data

If you want to include datasets from your history in your Jupyter notebook, you can import them using the get(12) command, where 12 is the number of your dataset in the history (if you are working on a collection, unhide the datasets to see their numbers). You can use the gapminder_gdp_europe.csv file. Store the returned file location in a variable first, so that you can refer to the file later through that variable.

Hands On: Import a file location from your history

Save the file location to a variable named file_import using the get() function.

file_import = get("[file_number]")
  • The files are referenced in Jupyter by their number in the history.
  • The variable file_import now stores the location where your file can be imported from.

If you wish to follow the Python training later, you should import all of the gapminder datasets now. The resulting import variables can be used in place of the file paths given in that tutorial.

Hands On: Import the training data locations from your history
  1. Find the file numbers of all of the gapminder datasets (gapminder_all.csv, gapminder_gdp_africa.csv, gapminder_gdp_americas.csv, gapminder_gdp_asia.csv, gapminder_gdp_europe.csv, gapminder_gdp_oceania.csv) in your Galaxy history
  2. Load the dataset locations into your IPython notebook:
    gapminder_all_import = get("[file_number]")
    gapminder_gdp_africa_import = get("[file_number]")
    gapminder_gdp_americas_import = get("[file_number]")
    gapminder_gdp_asia_import = get("[file_number]")
    gapminder_gdp_europe_import = get("[file_number]")
    gapminder_gdp_oceania_import = get("[file_number]")
    
    • The files are referenced in Jupyter by their number in the history.
    • Each of these variables now stores the location where the corresponding file can be imported from
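
If you prefer to keep all locations in one place, the six assignments can also be written as a dictionary. This is only a sketch under assumptions: the history numbers below are hypothetical and must be replaced with the numbers shown in your own history, and the fallback get() is a stand-in that lets the example run outside Galaxy (inside Galaxy's JupyterLab the real get() is used).

```python
# Inside Galaxy's JupyterLab, get() is already defined. This hypothetical
# stand-in is only used elsewhere and returns a placeholder string, not a
# real file location.
if "get" not in globals():
    def get(history_number):
        return f"placeholder_location_for_dataset_{history_number}"

# Hypothetical history numbers: replace them with the numbers in your history.
history_numbers = {
    "gapminder_all": 1,
    "gapminder_gdp_africa": 2,
    "gapminder_gdp_americas": 3,
    "gapminder_gdp_asia": 4,
    "gapminder_gdp_europe": 5,
    "gapminder_gdp_oceania": 6,
}

# Map each dataset name to the location returned by get().
import_paths = {name: get(number) for name, number in history_numbers.items()}

for name, location in import_paths.items():
    print(name, "->", location)
```

A single location can then be looked up by name, for example import_paths["gapminder_gdp_europe"].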

Graph Display in Jupyter

In this tutorial we are going to plot a distribution graph of our data. For this, we will first need to load one of our tabular data files. You can use the gapminder_all.csv file.

Hands On: Load a file from your history
  1. Open the dataset as a pandas DataFrame with the pd.read_csv() function.
    dataframe = pd.read_csv(file_import)
    
    • The files are referenced in Jupyter by their number in the history.
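
After loading a file, it is worth taking a quick look at the DataFrame before plotting. The sketch below is illustrative only: it reads a tiny in-memory table with made-up values that merely mimic the column layout of the Gapminder files, so that it runs without a Galaxy history. In your notebook you would inspect the dataframe loaded above instead.

```python
import io

import pandas as pd

# Made-up rows for illustration only; these are not real Gapminder values.
example_csv = io.StringIO(
    "country,gdpPercap_1992,gdpPercap_1997\n"
    "Country A,1000.0,1100.0\n"
    "Country B,2000.0,2500.0\n"
)
dataframe = pd.read_csv(example_csv)

# Number of rows and columns, the column names, and the first rows.
print(dataframe.shape)
print(list(dataframe.columns))
print(dataframe.head())

# Confirm that the column we want to plot is present before using it.
column = "gdpPercap_1997"
if column not in dataframe.columns:
    raise KeyError(f"Column {column!r} not found in the loaded file")
```
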
Hands On: Draw a distribution plot
  1. Create your figure with the command
    fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(15, 10))
    
    • nrows=1, ncols=1 means you will have one plot in your figure (one row and one column)
    • figsize parameter determines the size of the figure
  2. Draw the distribution plot of the gdpPercap_1997 column of our dataset with the command
    sns.histplot(data=dataframe, x='gdpPercap_1997', kde=True, ax=ax)
    

    Distribution plot in Jupyter.

Export Data

If you want to save a file you generated in your notebook, use the put("file_name") command. That is what we are going to do with our distribution plot.

Hands On: Save a Jupyter-generated image into a Galaxy History
  1. Create an image file with the figure you just drew with the command
    fig.savefig('distplot.png')
    
  2. Export your image into your history with the command
    put('distplot.png')
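
Before exporting, you may want to confirm that the file was actually written. The following is a minimal sketch using only the Python standard library. The helper name ready_for_export is hypothetical and is not part of Galaxy or Jupyter.

```python
from pathlib import Path


def ready_for_export(file_name):
    """Return True if file_name exists as a regular file and is not empty."""
    path = Path(file_name)
    return path.is_file() and path.stat().st_size > 0


if ready_for_export("distplot.png"):
    print("distplot.png is ready to be exported with put('distplot.png')")
else:
    print("distplot.png was not found or is empty; rerun fig.savefig first")
```

The same check works for any other file you generate in the notebook before passing its name to put().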
    

Save the Notebook in your history

Once you are done with your analysis, or at any time during the editing process, you can save the notebook into your history using the put("first_galaxy_notebook.ipynb") command. If you create additional notebooks with different names, make sure you save them all before you quit JupyterLab.

This will create a new notebook .ipynb file in your history every time you run the command.

Hands On: Closing JupyterLab
  1. In the Galaxy interface, click on the Interactive Tools button on the left side.

  2. Tick the galaxy-selector box of your Jupyter Interactive Tool, and click Stop.

If you want to reopen a Jupyter Notebook saved in your history, you can use the tool Interactive JupyterLab Notebook, select “Load a previous Notebook”, and select the notebook from your history.

Conclusion

You have just performed your first analysis in the Jupyter notebook environment integrated in Galaxy. You generated a distribution plot and saved it in your history along with the notebook that produced it. If you wish to follow the Software Carpentries training Plotting and Programming in Python now, you can open a Jupyter notebook, install all needed dependencies, and import the file locations of the Gapminder datasets using the get() function (e.g. gapminder_all_file = get(12)). You can then use these file locations throughout that tutorial whenever you need to specify a file path. For example, if the tutorial asks you to load a dataset with data_oceania = pd.read_csv('data/gapminder_gdp_oceania.csv'), you can replace the path 'data/gapminder_gdp_oceania.csv' with the corresponding import variable: data_oceania = pd.read_csv(gapminder_gdp_oceania_import). You can start directly from the The-jupyterlab-interface section of that tutorial.