Galaxy Monitoring with gxadmin
Overview
Questions:
- What is gxadmin?
- What can it do?
- How to write a query?
Objectives:
- Learn gxadmin basics
- See some queries and learn how they help debug production issues
Time estimation: 30 minutes
Published: Jan 28, 2019
Last modification: Feb 1, 2024
License: Tutorial Content is licensed under Creative Commons Attribution 4.0 International License. The GTN Framework is licensed under MIT
PURL: https://gxy.io/GTN:T00009
Revision: 27
We will only briefly cover the features available in gxadmin. There are many queries that may or may not be useful for your Galaxy instance, so read the documentation before using them.
It started life as a small shell script that Helena wrote because she couldn’t remember what Gravity was called or where it could be found. Some of the functions needed for things like swapping zerglings are still included in gxadmin but are highly specific to UseGalaxy.eu and not generally useful.
Since then it has become the home for “all of the SQL queries we [galaxy admins] run regularly.” Helena Rasche and Nate Coraor often shared SQL queries with each other in private chats, but this wasn’t helpful to the admin community at large, so they decided to put them all in gxadmin and make it as easy to install as possible. We are continually trying to make this tool more generic and generally useful. If you notice something that’s missing or broken, or have a new query you want to run, just let us know.
Comment: Galaxy Admin Training Path
The yearly Galaxy Admin Training follows a specific ordering of tutorials. Use this timeline to help keep track of where you are in Galaxy Admin Training.
Step 1: ansible-galaxy
Step 2: backup-cleanup
Step 3: customization
Step 4: tus
Step 5: cvmfs
Step 6: apptainer
Step 7: tool-management
Step 8: reference-genomes
Step 9: data-library
Step 10: dev/bioblend-api
Step 11: connect-to-compute-cluster
Step 12: job-destinations
Step 13: pulsar
Step 14: celery
Step 15: gxadmin
Step 16: reports
Step 17: monitoring
Step 18: tiaas
Step 19: sentry
Step 20: ftp
Step 21: beacon
Installing gxadmin
It’s simple to install gxadmin. Here’s how you do it, if you haven’t done it already.
Hands On: Installing gxadmin with Ansible
Edit your requirements.yml and add the following:
--- a/requirements.yml
+++ b/requirements.yml
@@ -28,3 +28,5 @@
   version: 6.1.0
 - name: usegalaxy_eu.rabbitmqserver
   version: 1.4.1
+- src: galaxyproject.gxadmin
+  version: 0.0.12
If you haven’t worked with diffs before, this can be something quite new or different.
Suppose we have two versions of a grocery list in two files. We’ll call them ‘old’ and ‘new’.
Code In: Old
$ cat old
🍎
🍐
🍊
🍋
🍒
🥑
Code Out: New
$ cat new
🍎
🍐
🍊
🍋
🍍
🥑
We can see that they have some different entries. We’ve removed 🍒 because they’re awful, and replaced them with a 🍍.
Diff lets us compare these files:
$ diff old new
5c5
< 🍒
---
> 🍍
Here we see that 🍒 is only in ‘old’, and 🍍 is only in ‘new’. Otherwise the files are identical.
There are a couple of different diff formats; one is the ‘unified diff’:
$ diff -U2 old new
--- old 2022-02-16 14:06:19.697132568 +0100
+++ new 2022-02-16 14:06:36.340962616 +0100
@@ -3,4 +3,4 @@
🍊
🍋
-🍒
+🍍
🥑
This is basically what you see in the training materials, which gives you a lot of context about the changes:
- --- old is the ‘old’ file in our view
- +++ new is the ‘new’ file
- @@ these lines tell us where the change occurs and how many lines are added or removed.
- Lines starting with a - are removed from our ‘new’ file
- Lines with a + have been added.
So when you go to apply these diffs to your files in the training:
- Ignore the header
- Remove lines starting with - from your file
- Add lines starting with + to your file
The other lines (🍊/🍋 and 🥑) above just provide “context”: they help you know where a change belongs in a file, but should not be edited when you’re making the above change. Given the above diff, you would find the line with a 🍒 and replace it with a 🍍.
Added & Removed Lines
Removals are very easy to spot; we just have removed lines:
--- old 2022-02-16 14:06:19.697132568 +0100
+++ new 2022-02-16 14:10:14.370722802 +0100
@@ -4,3 +4,2 @@
🍋
🍒
-🥑
Additions are likewise very easy: just add a new line between the other lines in your file.
--- old 2022-02-16 14:06:19.697132568 +0100
+++ new 2022-02-16 14:11:11.422135393 +0100
@@ -1,3 +1,4 @@
🍎
+🍍
🍐
🍊
Completely new files
Completely new files look a bit different: there the “old” file is /dev/null, the empty file on a Linux machine.
$ diff -U2 /dev/null old
--- /dev/null 2022-02-15 11:47:16.100000270 +0100
+++ old 2022-02-16 14:06:19.697132568 +0100
@@ -0,0 +1,6 @@
+🍎
+🍐
+🍊
+🍋
+🍒
+🥑
And removed files are similar, except with the new file being /dev/null:
--- old 2022-02-16 14:06:19.697132568 +0100
+++ /dev/null 2022-02-15 11:47:16.100000270 +0100
@@ -1,6 +0,0 @@
-🍎
-🍐
-🍊
-🍋
-🍒
-🥑
Install the role with:
Code In: Bash
ansible-galaxy install -p roles -r requirements.yml
Add the role to your playbook:
--- a/galaxy.yml
+++ b/galaxy.yml
@@ -34,3 +34,4 @@
     - galaxyproject.nginx
     - galaxyproject.tusd
     - galaxyproject.cvmfs
+    - galaxyproject.gxadmin
Run the playbook:
Code In: Bash
ansible-playbook galaxy.yml
With that, gxadmin should be installed! Now, test it out:
Hands On: Test out gxadmin
Run gxadmin as the galaxy user and list recently registered users:
Code In: Bash
sudo -u galaxy gxadmin query latest-users
Code In: Output
 id |          create_time          | disk_usage | username |       email       | groups | active
----+-------------------------------+------------+----------+-------------------+--------+--------
  1 | 2021-06-09 12:25:59.299651+00 | 218 kB     | admin    | admin@example.org |        | f
(1 rows)
Configuration
If psql runs without any additional arguments and lets you access your Galaxy database, then you do not need to do any more configuration for gxadmin.
Otherwise, you may need to set some of the PostgreSQL environment variables.
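As a minimal sketch, the standard PostgreSQL (libpq) environment variables can be exported in the shell before calling gxadmin. The host, port, user, and database values below are placeholders chosen for illustration and are assumptions, not values required by this tutorial.

```bash
#!/bin/bash
# Standard PostgreSQL (libpq) environment variables, which psql reads.
# All values are hypothetical placeholders: substitute your own.
export PGHOST=dbhost        # database server hostname
export PGPORT=5432          # PostgreSQL's default port
export PGUSER=galaxy        # role that can read the Galaxy database
export PGDATABASE=galaxy    # name of the Galaxy database

# With these set, a bare `psql` should connect without extra arguments,
# which is the condition described above for gxadmin to work.
```

If a password is needed, a `~/.pgpass` file is usually a better place for it than an exported variable.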
Overview
gxadmin has several categories of commands, each with a different focus. This is not a technically meaningful separation; it is just done to make the interface easier for end users.
| Category | Keyword | Purpose | 
|---|---|---|
| Configuration | config | Commands relating to galaxy’s configuration files like XML validation. | 
| Filters | filter | Transforming streams of text. | 
| Galaxy Admin | galaxy | Miscellaneous galaxy related commands like a cleanup wrapper. | 
| uWSGI | uwsgi | If you’re using systemd for Galaxy and a handler/zergling setup, then this lets you manage your handlers and zerglings. | 
| DB Queries | {csv,tsv,i,}query | Queries against the database which return tabular output. | 
| Report | report | Queries which return more complex and structured markdown reports. | 
| Mutations | mutate | These are like queries, except they mutate the database. All other queries are read-only. | 
| Meta | meta | More miscellaneous commands, and a built-in updating function. | 
Admin Favourite Queries
  
Simon Gladman’s favourite: gxadmin query old-histories. He contributed this function to find old histories. Their instance has a 90 day limit on histories, so anything older than that might be automatically removed. This helps their group identify any histories that can be purged in order to save space. Running this on UseGalaxy.eu, we have some truly ancient histories, and maybe could benefit from a similar policy.
Code In
gxadmin query old-histories
Code Out
id update-time user-id name published deleted purged hid-counter
361 2013-02-24 16:27:29.197572 xxx xxxx Unnamed history f f f 6
362 2013-02-24 15:31:05.804747 xxx xxxx Unnamed history f f f 1
347 2013-02-22 15:59:12.044169 xxx xxxx Unnamed history f f f 19
324 2013-02-22 15:57:54.500637 xxx xxxx Exercise 5 f f f 64
315 2013-02-22 15:50:51.398894 xxx xxxx day5 practical f f f 90
314 2013-02-22 15:45:47.75967 xxx xxxx 5. Tag Galaxy-Kurs f f f 78
  
Nate Coraor’s favourite: gxadmin query job-inputs. He contributed this function, which helps him debug jobs which are not running and should be.
Code In
gxadmin query job-inputs 5 # Or another job ID
| hda-id | hda-state | hda-deleted | hda-purged | d-id | d-state | d-deleted | d-purged | object-store-id |
|---|---|---|---|---|---|---|---|---|
| 8638197 | | f | f | 8246854 | running | f | f | files9 |
| 8638195 | | f | f | 8246852 | running | f | f | files9 |
| 8638195 | | f | f | 8246852 | running | f | f | files9 |
  
Björn Grüning’s favourite: gxadmin query latest-users lets us see who has recently joined our server. We sometimes notice that people are running a training on our infrastructure without having registered for Training Infrastructure as a Service (TIaaS), which helps us coordinate infrastructure for them so they don’t have bad experiences.
Code In
gxadmin query latest-users
| id | create_time | disk_usage | username | groups | active |
|---|---|---|---|---|---|
| 3937 | 2019-01-27 14:11:12.636399 | 291 MB | xxxx | xxxx | t |
| 3936 | 2019-01-27 10:41:07.76126 | 1416 MB | xxxx | xxxx | t |
| 3935 | 2019-01-27 10:13:01.499094 | 2072 kB | xxxx | xxxx | t |
| 3934 | 2019-01-27 10:06:40.973938 | 0 bytes | xxxx | xxxx | f |
| 3933 | 2019-01-27 10:01:22.562782 | | xxxx | xxxx | f |
  
Helena Rasche’s favourite: gxadmin report job-info. This command gives more information than you probably need on the execution of a specific job, formatted as markdown for easy sharing with fellow administrators.
Code In
gxadmin report job-info 1
Code Out
# Galaxy Job 5132146

Property | Value
------------- | -----
Tool | toolshed.g2.bx.psu.edu/repos/bgruening/canu/canu/1.7
State | running
Handler | handler_main_2
Created | 2019-04-20 11:04:40.854975+02 (3 days 05:49:30.451719 ago)
Job Runner/ID | condor / 568537
Owner | e08d6c893f5

## Destination Parameters

Key | Value
--- | ---
description | `canu`
priority | `-128`
request_cpus | `20`
request_memory | `64G`
requirements | `GalaxyGroup == "compute"`
tmp_dir | `True`

## Dependencies

Name | Version | Dependency Type | Cacheable | Exact | Environment Path | Model Class
--- | --- | --- | --- | --- | --- | ---
canu | 1.7 | conda | false | true | /usr/local/tools/_conda/envs/__canu@1.7 | MergedCondaDependency

## Tool Parameters

Name | Settings
--------- | ------------------------------------
minOverlapLength | 500
chromInfo | /opt/galaxy/tool-data/shared/ucsc/chrom/?.len
stage | all
contigFilter | {lowCovDepth: 5, lowCovSpan: 0.5, minLength: 0, minReads: 2, singleReadSpan: 1.0}
s | null
mode | -nanopore-raw
dbkey | ?
genomeSize | 300000
corOutCoverage | 40
rawErrorRate |
minReadLength | 1000
correctedErrorRate |

## Inputs

Job ID | Name | Extension | hda-id | hda-state | hda-deleted | hda-purged | ds-id | ds-state | ds-deleted | ds-purged | Size
---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ----
4975404 | Osur_record.fastq | fastqsanger | 9517188 | | t | f | 9015329 | ok | f | f | 3272 MB
4975404 | Osur_record.fastq | fastqsanger | 9517188 | | t | f | 9015329 | ok | f | f | 3272 MB

## Outputs

Name | Extension | hda-id | hda-state | hda-deleted | hda-purged | ds-id | ds-state | ds-deleted | ds-purged | Size
---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ----
Canu assembler on data 41 (trimmed reads) | fasta.gz | 9520369 | | f | f | 9018510 | running | f | f |
Canu assembler on data 41 (corrected reads) | fasta.gz | 9520368 | | f | f | 9018509 | running | f | f |
Canu assembler on data 41 (unitigs) | fasta | 9520367 | | f | f | 9018508 | running | f | f |
Canu assembler on data 41 (unassembled) | fasta | 9520366 | | f | f | 9018507 | running | f | f |
Canu assembler on data 41 (contigs) | fasta | 9520365 | | f | f | 9018506 | running | f | f |
  
Catherine Bromhead contributed the ‘jobs’ query: gxadmin query jobs lets you list jobs that have been run on your Galaxy. It’s a lot more flexible than queue-overview and we suggest using it instead in most cases. For example, to find circos jobs that were recently run:
Code In
gxadmin query jobs --limit 2 --tool circos
| job_id | create_time | update_time | user_id | state | tool_id | handler | destination | external_id |
|---|---|---|---|---|---|---|---|---|
| 58483488 | 2023-04-04 18:42:40 | 2023-04-04 18:43:30 | | error | toolshed.g2.bx.psu.edu/repos/iuc/circos/circos/0.69.8+galaxy8 | handler_sn06_5 | 1cores_10.0G | 42071736 |
| 58483208 | 2023-04-04 18:36:24 | 2023-04-04 18:40:43 | | error | toolshed.g2.bx.psu.edu/repos/iuc/circos/circos/0.69.8+galaxy9 | handler_sn06_3 | 1cores_10.0G | 42071486 |
or to see recent jobs from a specific user (e.g. to help answer their email queries when they just send you a screenshot rather than a proper bug report):
Code In
gxadmin query jobs --limit 2 --user helena-rasche --terminal
| job_id | create_time | update_time | user_id | state | tool_id | handler | destination | external_id |
|---|---|---|---|---|---|---|---|---|
| 58277473 | 2023-03-31 09:53:36 | 2023-03-31 09:53:36 | 580 | ok | TAG_FROM_FILE | default | | |
| 58277410 | 2023-03-31 09:47:16 | 2023-03-31 09:51:26 | 580 | ok | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/1.1.4 | handler_sn06_0 | 1cores_4.0G | 41859564 |
gxadmin for Monitoring
gxadmin already supported query, csvquery, and tsvquery for requesting data from the Galaxy database in tables, CSV, or TSV formats, but we recently implemented influx queries which output data in a format that Telegraf can consume.
So running gxadmin query queue-overview normally shows something like:
Code In
gxadmin query queue-overview
| tool_id | tool_version | destination_id | handler | state | job_runner_name | count | 
|---|---|---|---|---|---|---|
| toolshed.g2.bx.psu.edu/repos/iuc/unicycler/unicycler/0.4.6.0 | 0.4.6.0 | 12cores_180G_special | handler_main_4 | running | condor | 1 | 
| toolshed.g2.bx.psu.edu/repos/iuc/unicycler/unicycler/0.4.6.0 | 0.4.6.0 | 12cores_180G_special | handler_main_5 | running | condor | 1 | 
| toolshed.g2.bx.psu.edu/repos/devteam/freebayes/freebayes/1.1.0.46-0 | 1.1.0.46-0 | 12cores_12G | handler_main_3 | running | condor | 2 | 
| toolshed.g2.bx.psu.edu/repos/iuc/qiime_extract_barcodes/qiime_extract_barcodes/1.9.1.0 | 1.9.1.0 | 4G_memory | handler_main_1 | running | condor | 1 | 
| toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/2.1.0+galaxy3 | 2.1.0+galaxy3 | 8cores_20G | handler_main_11 | running | condor | 1 | 
| toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.72 | 0.72 | 20G_memory | handler_main_11 | running | condor | 4 | 
| ebi_sra_main | 1.0.1 | 4G_memory | handler_main_3 | running | condor | 2 | 
| ebi_sra_main | 1.0.1 | 4G_memory | handler_main_4 | running | condor | 3 | 
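Tabular output like this is easy to post-process with standard shell tools. The sketch below sums job counts per destination. It uses a few sample rows in place of real output, and it assumes that `gxadmin tsvquery queue-overview` emits tab-separated columns in the same order as the table above; check that against your own instance before relying on it. The helper names `sample_rows` and `jobs_per_destination` are hypothetical and not part of gxadmin.

```bash
#!/bin/bash
# Sample rows standing in for `gxadmin tsvquery queue-overview` output.
# Assumption: tab-separated columns in the order
# tool_id, tool_version, destination_id, handler, state, job_runner_name, count.
sample_rows() {
    # printf reuses the format string for every group of seven arguments.
    printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
        unicycler 0.4.6.0 12cores_180G_special handler_main_4 running condor 1 \
        unicycler 0.4.6.0 12cores_180G_special handler_main_5 running condor 1 \
        ebi_sra_main 1.0.1 4G_memory handler_main_3 running condor 2 \
        ebi_sra_main 1.0.1 4G_memory handler_main_4 running condor 3
}

# Sum the count column (7) per destination_id (column 3), sorted by name.
jobs_per_destination() {
    awk -F'\t' '{ total[$3] += $7 }
        END { for (d in total) printf "%s\t%d\n", d, total[d] }' | sort
}

sample_rows | jobs_per_destination
```

On a real server you would replace `sample_rows` with the gxadmin invocation itself.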
The gxadmin iquery queue-overview is run by our Telegraf monitor on a regular basis, allowing us to consume the data:
Code In
gxadmin iquery queue-overview
Code Out
queue-overview,tool_id=toolshed.g2.bx.psu.edu/repos/iuc/unicycler/unicycler/0.4.6.0,tool_version=0.4.6.0,state=running,handler=handler_main_4,destination_id=12cores_180G_special,job_runner_name=condor count=1
queue-overview,tool_id=toolshed.g2.bx.psu.edu/repos/iuc/unicycler/unicycler/0.4.6.0,tool_version=0.4.6.0,state=running,handler=handler_main_5,destination_id=12cores_180G_special,job_runner_name=condor count=1
queue-overview,tool_id=toolshed.g2.bx.psu.edu/repos/devteam/freebayes/freebayes/1.1.0.46-0,tool_version=1.1.0.46-0,state=running,handler=handler_main_3,destination_id=12cores_12G,job_runner_name=condor count=1
queue-overview,tool_id=toolshed.g2.bx.psu.edu/repos/devteam/vcffilter/vcffilter2/1.0.0_rc1+galaxy1,tool_version=1.0.0_rc1+galaxy1,state=queued,handler=handler_main_11,destination_id=4G_memory,job_runner_name=condor count=1
queue-overview,tool_id=toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/2.1.0+galaxy3,tool_version=2.1.0+galaxy3,state=running,handler=handler_main_11,destination_id=8cores_20G,job_runner_name=condor count=1
queue-overview,tool_id=toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.72,tool_version=0.72,state=running,handler=handler_main_11,destination_id=20G_memory,job_runner_name=condor count=4
queue-overview,tool_id=ebi_sra_main,tool_version=1.0.1,state=running,handler=handler_main_3,destination_id=4G_memory,job_runner_name=condor count=2
queue-overview,tool_id=ebi_sra_main,tool_version=1.0.1,state=running,handler=handler_main_4,destination_id=4G_memory,job_runner_name=condor count=3
And produce some nice graphs from it.
You can use a Telegraf configuration like:
[[inputs.exec]]
    commands = ["/usr/bin/galaxy-queue-size"]
    timeout = "10s"
    data_format = "influx"
    interval = "1m"
This often requires a wrapper script, because you’ll need to pass environment variables to the gxadmin invocation, e.g.:
#!/bin/bash
export PGUSER=galaxy
export PGHOST=dbhost
gxadmin iquery queue-overview --short-tool-id
gxadmin iquery workflow-invocation-status
This data is not currently exposed, so just try the queries. But it’s easy to add influx support when missing! Here is an example; we set the variables in a function:
fields="count=1"
tags="tool_id=0"
This means: column 0 is a tag named tool_id, and column 1 is a field (real value) named count. Here is an example that has multiple fields that are stored.
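Separately from gxadmin's own implementation, the single-field mapping described above (column 0 as the tag tool_id, column 1 as the field count) can be sketched with a small shell function. The function name `rows_to_influx` and the sample rows are hypothetical, and the sketch assumes tab-separated input whose values contain no spaces or commas, which would otherwise need escaping in InfluxDB line protocol.

```bash
#!/bin/bash
# Hypothetical helper, not part of gxadmin: turn tab-separated rows into
# InfluxDB line protocol of the form "measurement,tag=value field=value".
# awk fields are 1-based, so "column 0" is $1 and "column 1" is $2.
rows_to_influx() {
    local measurement="$1"
    awk -F'\t' -v m="$measurement" \
        '{ printf "%s,tool_id=%s count=%s\n", m, $1, $2 }'
}

# Two illustrative rows: a tool id followed by a job count.
printf 'ebi_sra_main\t5\nfastqc\t4\n' | rows_to_influx queue-overview
```

Each output line carries the measurement name, then the tag, then the field, which matches the shape of the iquery output shown earlier.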
Summary
There are a lot of queries, all tailored to specific use cases. Some of these may be interesting for you, some may not. They are all documented with example inputs and outputs in the gxadmin readme, and help is likewise available from the command line.
Hands On: Time to git commit
It’s time to commit your work! Check the status with
git status
Add your changed files with
git add ... # any files you see that are changed
And then commit it!
git commit -m 'Finished Galaxy Monitoring with gxadmin'
Comment: Got lost along the way?
If you missed any steps, you can compare against the reference files, or see what changed since the previous tutorial.
If you’re using git to track your progress, remember to add your changes and commit with a good commit message!
