

Galaxy from an administrator's point of view



PURL: gxy.io/GTN:S00015



Picture showing the text "125+ platforms for using Galaxy" and screenshots of several of those Galaxy servers.


Where can Galaxy run?


Choosing where to run

Resource Directory

Supported UseGalaxy.* Local Cloud
Moderate Data
Big Data 🤷‍♀️
Moderate Computations
Long/Expensive Computations 🤷
You want to share your Galaxy objects with others
All needed Tools are pre-installed 🤷‍♂️
Human Data / Proprietary Data
No network transfer of data

Reasons to Install Your Own Galaxy


Software Requirements

Required:

  • Galaxy is written in Python and depends on Python 3.8 or newer

Minimal production requirements:


Hardware Requirements

This depends:

  • What do you intend to run?
  • Where do you intend to run it?

If possible, run the Galaxy server separately from Galaxy jobs

Storage will usually be the biggest concern

  • Depends on your available infrastructure
  • If you are storage limited, adopt a policy of deleting old or unused histories
  • If you are compute limited, set queue limits

Server Hardware Requirements

Based on concurrent user count and assuming separate compute for jobs:

Users     Resource estimate
1 - 5     1 core, 1 GB, 10 TB
5 - 20    2 cores, 2 GB, 40 TB
20 - 40   8 cores, 8 GB, 200 TB
40+       multiple hosts, 16 GB/host, 500 TB, dedicated DB host

Storage is harder to estimate because, like compute, it depends on the analyses being run and on local policy
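
The sizing table above can be read as a simple lookup. The sketch below is one way to express it; the names `ServerEstimate` and `estimate_server` are made up for illustration. The table's ranges share their endpoints (5, 20, 40), so treating each upper bound as inclusive is an assumption made here, not something the slide states.

```python
from typing import NamedTuple


class ServerEstimate(NamedTuple):
    cpu: str
    memory: str
    storage_tb: int
    note: str = ""


# Rows copied from the table as (upper bound on concurrent users, estimate).
# Assumption: each upper bound is inclusive, so 5 users falls in "1 - 5".
_TIERS = [
    (5, ServerEstimate("1 core", "1 GB", 10)),
    (20, ServerEstimate("2 cores", "2 GB", 40)),
    (40, ServerEstimate("8 cores", "8 GB", 200)),
]
_LARGEST = ServerEstimate("multiple hosts", "16 GB/host", 500, "dedicated DB host")


def estimate_server(concurrent_users: int) -> ServerEstimate:
    """Return the table row matching a number of concurrent users."""
    if concurrent_users < 1:
        raise ValueError("concurrent_users must be at least 1")
    for upper_bound, estimate in _TIERS:
        if concurrent_users <= upper_bound:
            return estimate
    return _LARGEST


if __name__ == "__main__":
    for users in (3, 20, 100):
        print(users, estimate_server(users))
```

These figures cover the Galaxy server only; compute for jobs is assumed to live elsewhere, as stated above the table.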


Galaxy Storage Philosophy

  • Foster transparency and reproducibility
  • Data is always created, never overwritten
  • Copying a history or library dataset associates the copy with the original file on disk; no actual copy of the file is made
  • By default, data is never really deleted unless explicitly instructed
  • Even deleted data can be undeleted unless forcibly purged
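
The points above can be pictured with a toy model. This is a sketch for teaching only, not Galaxy's data model or API: the class names, the `files/dataset_1.dat` path, and the rule that a file is purged only once every association pointing at it has been purged are all assumptions of this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileOnDisk:
    path: str
    purged: bool = False  # True once the bytes have been removed for good


@dataclass
class DatasetAssociation:
    name: str
    file: FileOnDisk
    deleted: bool = False  # only a flag; the file on disk is untouched


@dataclass
class ToyStore:
    associations: List[DatasetAssociation] = field(default_factory=list)

    def create(self, name: str, path: str) -> DatasetAssociation:
        assoc = DatasetAssociation(name, FileOnDisk(path))
        self.associations.append(assoc)
        return assoc

    def copy(self, assoc: DatasetAssociation, new_name: str) -> DatasetAssociation:
        # A "copy" is a new association that points at the same file.
        duplicate = DatasetAssociation(new_name, assoc.file)
        self.associations.append(duplicate)
        return duplicate

    def delete(self, assoc: DatasetAssociation) -> None:
        assoc.deleted = True

    def undelete(self, assoc: DatasetAssociation) -> None:
        if assoc.file.purged:
            raise ValueError("file was purged and cannot be restored")
        assoc.deleted = False

    def purge(self, assoc: DatasetAssociation) -> None:
        # Assumption of this toy: the file goes away only when no
        # undeleted association still refers to it.
        assoc.deleted = True
        sharing = [a for a in self.associations if a.file is assoc.file]
        if all(a.deleted for a in sharing):
            assoc.file.purged = True
```

In this model, copying a dataset costs no extra storage, and deleting it frees none until a purge actually removes the file.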

Storage Requirements

An "average" 2018 NGS analysis (by Anton Nekrutenko): 66 GB

10 users, 10 histories: > 6 TB
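
The arithmetic behind that figure, as a minimal sketch. It assumes 10 histories per user, one "average" analysis per history, and 1 TB = 1000 GB; the function name is made up for illustration.

```python
GB_PER_ANALYSIS = 66  # the "average" 2018 NGS analysis quoted above


def projected_storage_tb(users: int, histories_per_user: int,
                         gb_per_history: float = GB_PER_ANALYSIS) -> float:
    """Projected storage in TB, assuming one analysis per history."""
    return users * histories_per_user * gb_per_history / 1000


if __name__ == "__main__":
    print(projected_storage_tb(10, 10))  # 6.6
```

Changing the user count or histories per user shows how quickly the total grows without quotas or cleanup.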

Solutions:

  • Quotas
  • Set job limits in the job conf
  • Clean up deleted data (with a cron job)
  • Forced removal based on age
  • Users can configure their workflows to delete intermediate tool outputs
  • Data libraries for common data
  • Public servers: require email verification (and watch for duplicates)
  • Plug in more/heterogeneous storage using Object Store configuration
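
Two of the solutions above, cleaning up deleted data and forced removal based on age, both come down to selecting datasets older than some cutoff. The sketch below shows only that selection logic. It does not use Galaxy's database or cleanup scripts; `DatasetRecord` and `select_for_purge` are hypothetical names invented for this example.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, NamedTuple


class DatasetRecord(NamedTuple):
    # Hypothetical record; a real deployment would read this from Galaxy.
    dataset_id: int
    size_gb: float
    deleted: bool
    last_updated: datetime


def select_for_purge(records: Iterable[DatasetRecord], now: datetime,
                     max_age_days: int,
                     include_undeleted: bool = False) -> List[DatasetRecord]:
    """Pick records not updated within max_age_days.

    By default only user-deleted records qualify (routine cleanup).
    With include_undeleted=True, age alone decides (forced removal).
    """
    cutoff = now - timedelta(days=max_age_days)
    return [
        record for record in records
        if record.last_updated < cutoff
        and (record.deleted or include_undeleted)
    ]
```

A scheduled job (for example from cron) could run the default mode regularly and reserve the forced mode for a published retention policy.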

Compute Requirements

This depends:

  • What tools will your users be using?
    • What are their requirements?
  • In general, the most commonly used tools use a single core
    • But can use lots of memory!
  • Some compute-intensive tools use multiple cores

usegalaxy.org allocates from 8 GB/core to 16 GB/core
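
As a rough illustration of what that ratio implies per job, the sketch below multiplies it by a core count. The function name is made up, and applying the usegalaxy.org ratio to your own cluster is an assumption; your tools may need more or less.

```python
from typing import Tuple


def job_memory_range_gb(cores: int, low_gb_per_core: int = 8,
                        high_gb_per_core: int = 16) -> Tuple[int, int]:
    """Memory range (GB) for a job, using the 8 to 16 GB/core ratio above."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return cores * low_gb_per_core, cores * high_gb_per_core


if __name__ == "__main__":
    print(job_memory_range_gb(1))  # (8, 16)
    print(job_memory_range_gb(4))  # (32, 64)
```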

Connecting Galaxy to clusters/HPC is covered in the advanced section.


Making plans

Before deploying your first Galaxy server:

  • Figure out where Galaxy will be stored
    • Make sure it will be accessible to any eventual compute
  • Figure out where data will be stored
    • Make sure it will be accessible to any eventual compute

Galaxy deployment options

  • As developer: git clone https://github.com/galaxyproject/galaxy.git
  • Ready-to-use locally: Docker
  • Production server: configuration management (e.g. Ansible)

In future:

  • Alternative to git clone: Galaxy wheel in PyPI

Deployment Best Practices

  • Use configuration management

    • Ansible, for which the Galaxy Project maintains roles (tutorial)
    • Other systems are possible (Chef, Puppet, SaltStack, CFEngine) but do not have project-maintained roles.
  • Whether or not you use configuration management, record every change you make in a version control system (e.g. git):

    • Large, complex deployments grow organically
    • If you don't know what you did, you can't do it again

System Administration Best Practices

  • Take security seriously
  • Update Galaxy when security updates are released
  • Follow OS security best practices
  • Separate privileges: give code, jobs, and data different owners
  • Write protect Galaxy and data if you can
  • Read-only cluster mounts

Back up everything (except that which is managed by configuration management)


Thank You!

This material is the result of a collaborative work. Thanks to the Galaxy Training Network and all the contributors!


Tutorial Content is licensed under Creative Commons Attribution 4.0 International License.
