Introduction to Machine learning



PURL: gxy.io/GTN:S00136

Questions

  • What is machine learning?

  • Why is it useful?

  • What are its different approaches?

Objectives

  • Provide the basics of machine learning and its variants.

  • Learn how to perform classification using training and test data.

  • Learn how to use Galaxy's machine learning tools.

Contents

  • What is machine learning?
  • Types of machine learning
  • Techniques for
    • Hyperparameter optimisation
    • Learning and evaluation of models
  • Various applications of machine learning

Machine learning

  • Learns patterns from data
  • Draws on several fields
    • Linear algebra, statistics and probability
    • Programming
    • Data analysis
    • Visualization
  • Applicable to data from many domains: protein and DNA sequences, weather data, stock and house prices, images ...

Machine learning

Variants of ML

ML variants

Classification

  • Supervised learning
  • Learn/predict classes or targets
  • Find decision boundary
  • Linear and non-linear boundaries
  • Algorithms are classifiers
  • Examples
    • Tumor or no tumor
    • Rain or no rain
    • ...

Classification
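
As a rough illustration of "find decision boundary", the sketch below trains a perceptron, one simple linear classifier, on a tiny two-class dataset. The data points, function names and settings are invented for this example and are not taken from the tutorial or from Galaxy's tools.

```python
# Toy perceptron: learns a linear decision boundary w.x + b = 0.
# All data and names here are hypothetical, chosen only for illustration.

def predict(weights, bias, sample):
    """Return class 1 if the sample lies on the positive side of the boundary, else 0."""
    activation = sum(w * x for w, x in zip(weights, sample)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, labels, epochs=20, learning_rate=1.0):
    """Adjust weights and bias whenever a training sample is misclassified."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for sample, label in zip(samples, labels):
            error = label - predict(weights, bias, sample)  # -1, 0 or +1
            weights = [w + learning_rate * error * x for w, x in zip(weights, sample)]
            bias += learning_rate * error
    return weights, bias


# Two features per sample; class 0 clusters near the origin, class 1 further away.
samples = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (5.0, 5.0), (6.0, 5.5), (5.5, 6.5)]
labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, s) for s in samples]
print("learned boundary:", weights, bias)
print("predictions:", predictions)
```

A perceptron can only draw a straight boundary. Non-linear boundaries need a different classifier, for example a nearest-neighbour or tree-based one.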

Classification dataset

  • Breast tumor dataset - Features and target

Classification dataset

Regression

  • Supervised learning
  • Targets are real numbers
  • Find fitting curve
  • Linear or non-linear curves
  • Algorithms are regressors
  • Examples:
    • Temperature forecast
    • Stock/house prices
    • ...

Regression
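
To make "find fitting curve" concrete, here is a minimal sketch of the simplest case: fitting a straight line to one feature with ordinary least squares. The numbers and the function name `fit_line` are made up for illustration. A real dataset such as the body fat data would have several features.

```python
# Ordinary least squares for a single feature: y is approximated by slope * x + intercept.
# The data below is invented purely for illustration.

def fit_line(xs, ys):
    """Return the slope and intercept that minimise the squared prediction error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # feature values
ys = [2.1, 3.9, 6.2, 7.8, 10.1]  # real-valued targets

slope, intercept = fit_line(xs, ys)
predicted = slope * 6.0 + intercept  # prediction for an unseen feature value
print(slope, intercept, predicted)
```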

Regression dataset

  • Body fat dataset - features and target

Ensemble model

Hyperparameter optimisation

  • Grid search
  • Random search

Hyperparameter optimisation
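
The difference between the two strategies can be sketched in a few lines. Grid search tries every combination of a fixed set of values, while random search draws a fixed number of combinations at random. In this sketch `validation_score` is a hypothetical stand-in for "train a model with these hyperparameters and score it". The parameter names and ranges are assumptions, not values from the tutorial.

```python
import itertools
import random


def validation_score(params):
    # Hypothetical stand-in for training a model and scoring it on validation data.
    # It peaks (score 0) at learning_rate=0.1 and max_depth=5.
    return -((params["learning_rate"] - 0.1) ** 2 + (params["max_depth"] - 5) ** 2)


def grid_search(grid, score_fn):
    """Evaluate every combination of the listed values and keep the best one."""
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def random_search(samplers, score_fn, n_iter, seed=0):
    """Evaluate n_iter randomly drawn combinations and keep the best one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: sample(rng) for name, sample in samplers.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


grid = {"learning_rate": [0.01, 0.1, 1.0], "max_depth": [3, 5, 7]}
grid_best, grid_best_score = grid_search(grid, validation_score)

samplers = {
    "learning_rate": lambda rng: rng.uniform(0.01, 1.0),
    "max_depth": lambda rng: rng.randint(3, 7),
}
random_best, random_best_score = random_search(samplers, validation_score, n_iter=10)
print(grid_best, random_best)
```

Grid search costs one evaluation per combination (9 here), whereas random search costs exactly `n_iter` evaluations regardless of how many hyperparameters are tuned.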

Learning and evaluation

  • K-fold cross-validation
  • Dataset is split into K equal parts
  • Each part is called a fold
  • Learn on training set (K-1 folds)
  • Evaluate on validation set (the remaining fold)

Learning and evaluation
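
The splitting step of K-fold cross-validation can be sketched as follows. The helper name `k_fold_splits` is hypothetical. It only produces index lists and assumes the samples are already in random order. Training and scoring a model on each split is left out.

```python
def k_fold_splits(n_samples, k):
    """Return k (training_indices, validation_indices) pairs.

    Each fold is used exactly once as the validation set, and the
    remaining k-1 folds form the training set.
    """
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    splits = []
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        splits.append((training, validation))
        start += size
    return splits


splits = k_fold_splits(n_samples=10, k=5)
for training, validation in splits:
    print("train on", training, "validate on", validation)
```

In practice the K validation scores are averaged, which gives a more stable estimate than a single split.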

Learning and evaluation

  • Training and test sets
  • Learn on training set
  • Evaluate on test set

Learning and evaluation
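
A minimal sketch of a training/test split, with accuracy as one possible evaluation measure. The function names, the 25% test fraction and the toy data are assumptions made for this example.

```python
import random


def train_test_split(samples, targets, test_fraction=0.25, seed=0):
    """Shuffle the data and hold out a fraction of it as the test set."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(samples) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [samples[i] for i in train_idx]
    y_train = [targets[i] for i in train_idx]
    x_test = [samples[i] for i in test_idx]
    y_test = [targets[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


def accuracy(true_labels, predicted_labels):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Invented data: eight samples with two features each and a binary target.
samples = [[float(i), float(i) * 2] for i in range(8)]
targets = [0, 0, 0, 0, 1, 1, 1, 1]

x_train, x_test, y_train, y_test = train_test_split(samples, targets)
print(len(x_train), "training samples,", len(x_test), "test samples")
```

The model is fitted only on `x_train` and `y_train`. The test set is used once, at the end, to estimate how well the model generalises to unseen data.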

Applications of machine learning

  • Bioinformatics
    • Protein structure prediction
    • Drug response prediction
    • Biological age prediction
    • Biomedical image analysis
    • ...
  • Computer vision/image recognition
  • Natural language processing
  • Speech recognition
  • ...

For additional references, please see the tutorial's References section.

Screenshot of the gtn stats page with 21 topics, 170 tutorials, 159 contributors, 16 scientific topics, and a growing community

  • If you would like to learn more about Galaxy, there are a large number of tutorials available.
  • These tutorials cover a wide range of scientific domains.

Getting Help

Galaxy Help

  • If you get stuck, there are ways to get help.
  • You can ask your questions on the help forum.
  • Or you can chat with the community on Gitter.

Join an event

Event schedule

  • There are frequent Galaxy events all around the world.
  • You can find upcoming events on the Galaxy Event Horizon.

Key points

  • Machine learning algorithms learn features from data.

  • Machine learning is used for multiple tasks such as classification, regression and clustering.

  • Multiple learning tasks can be performed using Galaxy's machine learning tools.

  • For the classification and regression tasks, data is divided into training and test sets.

  • Each sample/record in the training data has a category/class/label.

  • A machine learning algorithm learns features from the training data and makes predictions on the test data.

Thank You!

This material is the result of a collaborative work. Thanks to the Galaxy Training Network and all the contributors!

Author(s): Anup Kumar
Reviewers: Björn Grüning, Teresa Müller, Martin Čech, Armin Dadras
Galaxy Training Network

Tutorial content is licensed under the Creative Commons Attribution 4.0 International License.
