Getting your hands-on earth data
| Author(s) |  | 
| Reviewers |  | 
OverviewQuestions:
Objectives:
What is the Earth System?
What type of data is available?
Requirements:
Learn about the Earth System terminology
Learn about the different source of earth data
Learn about earth observations, reanalysis, and visualisation
Time estimation: 1 hourSupporting Materials:Published: Oct 4, 2024Last modification: May 10, 2025License: Tutorial Content is licensed under Creative Commons Attribution 4.0 International License. The GTN Framework is licensed under MITpurl PURL: https://gxy.io/GTN:T00457version Revision: 2
This tutotrial aims at familiarzing you with Earth Science and discovering the earth data available on Galaxy. The target audience is not a scientist but anyone interested in learning about Earth system.
AgendaIn this tutorial, we will cover:
🌍🌏🌎
The Earth System is a complex and dynamic system that encompasses the interactions between the atmosphere, hydrosphere, geoshpere, and biosphere. Understanding and analyzing data from the Earth System Model is essential, for example to predict and mitigate the impacts of climate change. Currently, Galaxy Earth System uses 4 usecases to have an overview of an earth System: ocean, land, atmosphere, and biodiversity. Of course, these components cross over each other thus, we have data that can fit two or more compartiments.
Next we will see for each compartiment which kind of data (and how to get them) are availabe on Galaxy Earth System
🌊 Ocean 🌊
The ocean is a key component of the Earth’s climate system. Thus, it needs continuous real-time monitoring to help scientists better understand its dynamics and predict its evolution.
The Argo program
All around the world, oceanographers have managed to join their efforts and set up a Global Ocean Observing System among which Argo is a key component. Argo is an international program that collects information from inside the ocean using a fleet of robotic instruments that drift with the ocean currents and move up and down between the surface and a mid-water level.
The data available here are from the Argo gliders network. They contain the folowing variables:
- latitude, longitude, and time
- physcical parameters: water temperature, pressure, and salinity
- biogeochemical parameters: chlorophyll, oxygen, and nitrate
To access those data you can use the ( Argo data access ( Galaxy version 0.1.15+galaxy0))tool.
For more information follow the tutorial related to it called Analyse Argo data or use the workflow Full Analyse Argo data https://earth-system.usegalaxy.eu/u/marie.josse/w/full-analyse-argo-data
Copernicus Data Space Ecosystem
The Copernicus Data Space Ecosystem provides access to Earth observation data. The goal of the Sentinel-3 mission is to accurately and reliably measure sea surface topography, sea and land surface temperature, and ocean and land surface color. This data will support ocean forecasting systems, environmental monitoring, and climate monitoring.
To access those data you can use the Copernicus Data Space Ecosystem tool.
For more information follow the tutorial related to it called Sentinel 5P data visualisation.
Once you get into the Jupyterlab go in the left pannel in samples > openeo > Sentinel_3.ipynb . Yhis way you have an example on how to get Sentinel 3 data.
EMODnet Chemistry
EMODnet Chemistry provides easy access to marine chemical data, standardised, harmonised and validated data collections and data products for visualisation. These are relevant for the European marine environmental policies to assess the status and trends of ecosystems. Numerous substances are considered, most of which are invisible to the naked eye and are monitored using specialised sensors or laboratory analyses. This evidence-based information supports action against environmental changes that pose a threat to marine ecosystems and human health.
A focus on ODV (Ocean Data View) collections
This webODV service facilitates to explore, subset, visualize, and extract data sets in multiple formats from the harmonized, standardized, validated data collections that EMODnet Chemistry is regularly producing and publishing for all European sea basins for eutrophication and contaminants. You can also find these collections on the Sextant catalogue
You can analize thes kind of data through the Ocean Data View (ODV) tool.
For more information follow the related tutorials and workflows:
- Ocean’s variables study and the corresponding workflow  https://usegalaxy.eu/u/marie.josse/w/oceans-var-v2
- Ocean Data View (ODV)
Copernicus Marine Data Store
The Copernicus Marine Data Store offers a ocean product catalogue, to download or visualize data accross nearly 15variables, including hindcast, current and forecast data.
A tool is currently being develop to enable easy access to this data. More information to come.
🏜️ Land 🏞️
This component comprehend all from the solid part of earth land, rock, volcano activities to the deragation of the land surface. This compartiment amis at assessing land and soil degradation, such as erosion, loss of organic matter and soil biodiversity, compaction, salinization, landslides, contamination, sealing, desertification and climate change.
Copernicus Data Space Ecosystem
The Copernicus Data Space Ecosystem provides access to Earth observation data.
The Copernicus Sentinel-2 mission consists of two polar-orbiting satellites that are positioned in the same sun-synchronous orbit, with a phase difference of 180°.
It aims to monitor changes in land surface conditions. The satellites have a wide swath width (290 km) and a high revisit time. This capability will support monitoring of changes on the Earth’s surface.
The Sentinel-2 twin satellites provide image data that largely contribute to existing and ongoing multispectral observations.
These satellites support a variety of services and applications offered by Copernicus, including:
- Land monitoring
- Agriculture
- Emergency management
- Risk mapping
- Security
- Forestry
- Climate change
- Disaster control
- Marine
- Humanitarian relief operations
The Scene Classification algorithm uses the reflective properties of scene features to establish the presence or absence of clouds in a scene. It is based on a series of threshold tests that use as input: TOA reflectance of several Sentinel-2 spectral bands, band ratios and indexes like Normalised Difference Vegetation Index (NDVI) and Normalised Difference Snow and Ice Index (NDSI). For each of these threshold tests, a level of confidence is associated. It produces at the end of the processing chain a probabilistic cloud mask quality indicator and a snow mask quality indicator. The most recent version of the SCL algorithm includes also morphological operations, usage of auxiliary data like Digital Elevation Model and Land Cover information and exploit the parallax characteristics of Sentinel-2 MSI instrument to improve its overall classification accuracy.
The NDVI ratio can be determined from the contribution of visible wavelength and near-infra-red wavelengths. Strong and well-nourished vegetation will absorb most of the visible wavelengths that it receives and will reflect back a large proportion of the near-infra-red light, whereas poor condition vegetation or thin areas, will reflect more visible wavelength light and less near-infra-red light.
To access those data you can use the Copernicus Data Space Ecosystem tool.
For more information follow the tutorial related to it called From NDVI data with OpenEO to time series visualisation with Holoviews.
QGIS (Geographical Information System)
Based on a QGIS official tutorial, you can learn how to access, filter and import GIS data through WFS web service using QGIS Galaxy interactive tool
In the Geographical Information System landscape, there is existing standards to help users deal with remote data. The most common web services are Web Map Services (WMS) and Web Feature Services (WFS). If WMS allows users only to access and display maps stored remotely, WFS is giving access to the features of data so you can modify it and create your own data and maps.
To learn more on how to do that folow QGIS Web Feature Services
☁️ Atmosphere 🌫️
This component represents the layer of gas around the Earth.
Copernicus Data Space Ecosystem
The Copernicus Data Space Ecosystem provides access to Earth observation data. The Copernicus Sentinel-5 Precursor mission is the first Copernicus mission that focuses on monitoring our atmosphere. It is the result of a close partnership between ESA, the European Commission, the Netherlands Space Office, industry, data users and scientists.
To access those data you can use the Copernicus Data Space Ecosystem tool.
Volcano 🌋
You can get atmospheric data on volcanic activity by taking a subset of Sentinel 5P L2 data for instance, from the 1st of April to the 30th of may 2021 of the Antilles. Especially of the La Soufriere Saint Vincent (Antilles) where a volcaninc erruption occured 9th of April. This dataset is focused on the dioxide sulfur (SO2) and Aerosol index (AI) spread out. Indeed, the knowledge of volcanic activity at a high temporal resolution is crucial for robustly determining large-scale impacts of volcanoes on atmosphere (air quality, air traffic) and climate. As such, this platform will be also of interest for scientists involved in the field of volcanic impacts at large, including institutions in charge of the monitoring of air quality and aviation safety.
For more information follow the tutorial related to it called Sentinel 5P data visualisation.
Climate Data Store
Dive into this wealth of information about the Earth’s past, present and future climate with the ( Climate Data Store ( Galaxy version 0.2.0)). With this tool you can have acces to atmospheric data such as ERA5 data. ERA5 provides a snapshot of the atmosphere, land surface and ocean waves for each hour from 1959 onwards (for 1950 to 1958 a preliminary version is available). It includes an uncertainty estimate which highlights the considerable evolution of the observing system, on which reanalysis products rely.
🐙 Biodiversity 🐿️
This englobes the part of the land, ocean, and athmosphere in which organisms are able to live.
Marine biodiversity 🐳
European marine microbial biodiversity is a poorly understood and under-exploited component of continental biodiversity. This component aims to provide a set of tools for the analysis of spatial- and time-comparable marine microbial metagenomics data sets for the exploration of biodiversity and its correlations with environmental quality.
OBIS Ocean Biodiversity Information System
OBIS is a global open-access data and information clearing-house on marine biodiversity for science, conservation and sustainable development. In order to get occurences of marine biodiversity from OBIS you can use the ( OBIS occurences ( Galaxy version 0.0.2)) tool.
For more information follow the tutorial related to it called OBIS marine indicators or use the workflow Marine omics visualisation https://usegalaxy.eu/u/marie.josse/w/marine-omics-visualisation
Land biodiversity 🦤
GBIF (Global Biodiversity Information Facility, www.gbif.org) is for sure THE most remarkable biodiversity data aggregator worldwide giving access to more than 1 billion records across all taxonomic groups. The data provided via these sources are highly valuable for research. However, some issues exist concerning data heterogeneity, as they are obtained from various collection methods and sources (OBIS is part of GBIF).
To access those data you can use the ( Get species occurrences data ( Galaxy version 0.9.0)) tool.
For more information follow the tutorial related to it called Cleaning GBIF data for the use in Ecology.
Conclusion
Here you have an overview of what’s is currently available in terms of data accessibility in Galaxy Earth System. This an on going work more is to come !
Extra information
Coming up soon even more tutorials on and other Earth-System related trainings. Keep an galaxy-eye open if you are interested!

 
                 
			
		 
			
		