How do I manage my Galaxy storage?
You can now bring your own Storage to Galaxy for computation, storage, and archiving of your results. To add more storage options to your account, follow these steps:
- Click on your Username in the top right part of the website and then click on Preferences.
- From the middle panel, click on Manage Your Galaxy Storage (previously called Storage location).
- Click on the + Create button at the top of the page. Here, you get multiple options to connect various storage options to your account.
For every storage option, fill in the following fields:
- In the Name section, give a name to your storage. This name is used to choose the storage on Galaxy when you select a Storage using User preferences > Preferred Galaxy Storage.
- Optionally, you can provide a Description for this Storage. This is a note for yourself.
Hands-on: Choose Your Own Tutorial
This is a "Choose Your Own Tutorial" (CYOT) section (also known as "Choose Your Own Analysis" (CYOA)), where you can select between multiple paths. Follow the path below that matches the Storage you would like to add to your Galaxy account: Onedata, Amazon S3, Azure Blob Storage, Google Cloud Storage, or any S3-compatible storage.
If you have an account in Onedata, you can use such an object store as a Storage for your Galaxy datasets; they will be stored in the Onedata space of your choice. The minimal supported Onezone version is 21.02.4. More information on Onedata can be found on Onedata’s website.
There are extensive tutorials for setting up and using Onedata on the Galaxy Training Network (GTN). At the moment, the following Onedata tutorials are available on GTN:
- Getting started with Onedata distributed storage
- Onedata user-owned storage
- Setting up a dev Onedata instance
- Configuring the Onedata connectors (remotes, Object Store, BYOS, BYOD)
In short, you can connect your Galaxy account to an Onedata Storage as follows:
- In the Onezone domain field, fill in the address of your Onezone domain. It could be something like “datahub.egi.eu”.
- If you want to disable validation of SSL certificates, you can use the Disable tls certificate validation? option. However, we strongly recommend not using this option unless you know what you are doing.
- Provide the name of the Onedata space where Galaxy data will be stored using Space Name. If there is more than one space with the same name, you can explicitly specify which one to select by using the format <space_name>@<space_id> (for example demo@7285220ecc636075ae5759aec7ad65d3cha8f9).
- If you want to provide a path to store Galaxy data, you can use the Galaxy root directory field. If this field is empty, the data will be stored in the space’s root directory.
- Provide an Access Token to Galaxy for the Onedata space. The token must be suitable for REST API access in a Oneprovider service and must allow both read and write data access.
- Click on Create.
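To make the <space_name>@<space_id> format concrete, here is a minimal Python sketch that splits a Space Name value into its two parts. The names `SpaceSpec` and `parse_space_spec` are hypothetical and are not part of Galaxy or Onedata. Treating the last "@" as the separator is an assumption, not documented behavior.

```python
from typing import NamedTuple, Optional


class SpaceSpec(NamedTuple):
    """Hypothetical container for a parsed Space Name value."""
    name: str
    space_id: Optional[str]


def parse_space_spec(value: str) -> SpaceSpec:
    """Split 'name' or 'name@id' into its parts.

    Assumption: the last '@' separates the space name from the space ID.
    """
    value = value.strip()
    if not value:
        raise ValueError("Space Name must not be empty")
    name, sep, space_id = value.rpartition("@")
    if not sep:
        # No '@' present: the whole value is the space name.
        return SpaceSpec(name=value, space_id=None)
    if not name or not space_id:
        raise ValueError("Expected the form <space_name>@<space_id>")
    return SpaceSpec(name=name, space_id=space_id)


if __name__ == "__main__":
    print(parse_space_spec("demo@7285220ecc636075ae5759aec7ad65d3cha8f9"))
    print(parse_space_spec("demo"))
```

A check like this can help you spot a stray "@" or a missing ID before you paste the value into the Space Name field.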
Amazon’s Simple Storage Service (S3) is Amazon’s primary cloud storage service. More information on S3 can be found in Amazon’s documentation. Before using this feature, you have to create a bucket in your AWS web console.
- You have to provide an Access Key ID to be able to use AWS Storage on Galaxy. A security credential for interacting with AWS services can be created from your AWS web console. Creating an “Access Key” creates a pair of keys used to identify and authenticate access to your AWS account. The first part of the pair is the “Access Key ID” and should be entered here. The second part is the secret part, called the “Secret Access Key”. Place that in the secure part of this form below.
- Provide the AWS S3 bucket to store your datasets in using the Bucket field.
- Enter the second part of the key pair you created above, the “Secret Access Key”, in the Secret Access Key section. Read more on access keys in the AWS documentation.
- Click on Create.
To set up access to your Azure Blob Storage within Galaxy, follow these steps:
- Provide the name of your Azure Blob Storage container in the Container Name field. More information about container names can be found in the Microsoft documentation.
- Fill in the Storage Account Name based on your account. More information is available on the Microsoft website.
- Provide the account access key for your Azure Blob Storage account using the Account Key field. See the Microsoft documentation on managing storage account access keys.
- Click on Create.
For the setup, you will need to generate HMAC Keys; these can be linked to your user or a service account. Additionally, you will need to define a default Google Cloud project to allow Galaxy to access your Google Cloud Storage via the interfaces described in this FAQ.
- To connect Galaxy to your Google Cloud Storage, you have to generate HMAC Keys. After generating the keys, use the resulting information to fill in the Access ID field.
- Use the Bucket field to specify the name of the bucket you have created to store your Galaxy data. Documentation on how to create buckets can be found in the Google Cloud Storage documentation.
- You will receive a Secret Key after you generate the HMAC Keys. The Secret Key should be 40 characters long and look something like the example used in the Google documentation: bGoa+V7g/yqDXvKRqq+JTFn4uQZbPiQJo4pf9RzJ.
- Click on Create.
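Since the Secret Key is expected to be 40 characters long, a quick local check can catch a truncated or badly pasted value before you submit the form. The Python sketch below is only an illustration. The function name `looks_like_hmac_secret` is hypothetical, and the check covers nothing beyond the length and the absence of whitespace.

```python
def looks_like_hmac_secret(secret: str) -> bool:
    """Return True if the value is exactly 40 characters with no whitespace.

    This only checks the shape of the value; it cannot tell whether
    the key is actually valid for your Google Cloud project.
    """
    return len(secret) == 40 and not any(ch.isspace() for ch in secret)


if __name__ == "__main__":
    # Example value taken from the text above.
    example = "bGoa+V7g/yqDXvKRqq+JTFn4uQZbPiQJo4pf9RzJ"
    print(looks_like_hmac_secret(example))
    # A trailing newline from copy and paste makes the check fail.
    print(looks_like_hmac_secret(example + "\n"))
```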
The APIs used to connect to Amazon’s S3 (Simple Storage Service) have become something of an unofficial standard for cloud storage across a variety of vendors and services. Many vendors offer storage APIs compatible with S3. Here, you can configure such a service as a Galaxy storage as long as you are able to find the connection details and have the relevant credentials.
- Provide the Access Key ID. This is the part of your access tokens or access keys that identifies the user accessing the data. The Amazon documentation calls this an “access key ID”; the CloudFlare documentation describes it as “aws_access_key_id”. Internally to Galaxy, we often just call this the “access_key”.
- Provide the Bucket name. This is the bucket to store your datasets in. How to set up buckets will vary from service to service, but all S3-compatible storage services should have the concept of a bucket to namespace a grouping of your data.
- Using the S3-Compatible API Endpoint field, provide the endpoint URL for your storage service. Some services also call it the “endpoint URL”, and the format varies between providers. For example, a CloudFlare endpoint URL is something like john.r2.cloudflarestorage.com and a MinIO endpoint URL is similar to https://play.min.io:9000.
- The Secret Access Key complements your Access Key ID to connect to the S3-compatible storage. The Amazon documentation calls this a “secret access key” and the CloudFlare documentation describes it as “aws_secret_access_key”. Internally to Galaxy, we often just call this the “secret_key”.
- Click on Create.
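The two endpoint examples above differ in form: one is a bare host name and the other includes a scheme and a port. The Python sketch below shows one way to bring both into the same shape before you enter the value. The function `normalize_endpoint` is hypothetical. Prepending https:// to a bare host name is an assumption, because this FAQ does not say whether the form requires a scheme.

```python
from urllib.parse import urlsplit


def normalize_endpoint(endpoint: str, default_scheme: str = "https") -> str:
    """Return the endpoint with an explicit scheme and no trailing slash.

    Assumption: a bare host name should be reached over HTTPS.
    """
    endpoint = endpoint.strip()
    if not endpoint:
        raise ValueError("Endpoint must not be empty")
    if "://" not in endpoint:
        endpoint = f"{default_scheme}://{endpoint}"
    # Reject values that still have no host part, such as "https://".
    if not urlsplit(endpoint).hostname:
        raise ValueError("Endpoint has no host name")
    return endpoint.rstrip("/")


if __name__ == "__main__":
    print(normalize_endpoint("john.r2.cloudflarestorage.com"))
    print(normalize_endpoint("https://play.min.io:9000"))
```

If your provider documents a specific endpoint format, follow that format and skip this normalization.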
You can pick the connected Storage for your analysis as follows:
- Click on your username. Click on Preferences.
- Click on Preferred Galaxy Storage. Here, you can pick the Storage of your choice. The default option is Galaxy Storage.
Instead of using a default storage location for your account, it is also possible to select it at different levels: per History, per Tool, and per Workflow.
To set a Storage for a specific History, click on the Galaxy History Storage choice (galaxy-history-storage-choice) icon on the right panel. Then, select the added external storage as the preferred storage location for the History. If you execute a Workflow in this history, all results of the workflow will be stored in the external storage that you selected. To verify this, click on the Dataset details icon (details) of a job on the right panel; you can see that the user’s external storage is used as the “Dataset Storage”.
Of course, instead of a workflow, you can run just one tool using your connected Storage. To do this, set the Galaxy History Storage choice (galaxy-history-storage-choice) as described above. Then, you can run one (or more) tools in this history and the results will be available on your Storage.