Genome annotation with Prokka



PURL: gxy.io/GTN:S00065

Requirements

Before diving into this slide deck, we recommend that you have a look at:


Questions

  • How to annotate a bacterial genome?

  • How to visualize annotated genomic features?


Objectives

  • Load genome into Galaxy

  • Annotate genome with Prokka

  • View annotations in JBrowse


Overview

  • What is genome annotation?

  • Tools for genome annotation

  • The tool "Prokka"

  • In these slides, we will learn what genome annotation is, and which tools can be used for genome annotation.
  • We will describe in detail a tool called Prokka.

What is Annotation?

  • Classifying and describing parts of the genome sequence

  • Annotations are biological or other features on a genome, e.g.

    • a ribosome binding site: a biological feature

    • a sequence of TTTTTT: may/may not be biological but could be interesting

  • We can name features by type and location, e.g. gene, pseudogene, repeat

  • We can hypothesise functions, e.g. antioxidant activity

  • Annotating a genome means positioning features along the sequence of a genome.
  • Those features can be anything one can find in a genome sequence: genes, but also binding sites for example.
  • When a feature, like a gene for example, is positioned, you can add information about its function.
  • This operation is named "functional annotation".

First: assemble the genome

annotation pic

  • Before annotating a genome, you need to assemble it.
  • A high-quality assembly makes it easier to produce a good-quality annotation.

Then: annotate

annotation pic

  • Once you have a good genome sequence, you can annotate it.
  • In this example, there is a gene coding for a delta toxin.
  • There is a ribosome binding site in red, and the coding sequence of this gene is in green.

Annotation

annotation pic

  • For each feature annotated on a genome, you can get its position, its type, and some information about its function or how it is expressed.
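
As an illustration of how position, type, and functional information can be stored together, here is a minimal sketch that parses one line of GFF3, one of the output formats Prokka produces. The `Feature` class, the `parse_gff3_line` function, and the example record are hypothetical and made up for this sketch; they are not real Prokka output.

```python
from dataclasses import dataclass
from urllib.parse import unquote


@dataclass
class Feature:
    seqid: str        # name of the contig or chromosome
    type: str         # feature type, e.g. "CDS" or "tRNA"
    start: int        # 1-based, inclusive (GFF3 convention)
    end: int          # 1-based, inclusive
    strand: str       # "+", "-" or "."
    attributes: dict  # key=value pairs from column 9


def parse_gff3_line(line: str) -> Feature:
    """Parse one tab-separated GFF3 feature line into a Feature."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 columns, got {len(cols)}")
    seqid, _source, ftype, start, end, _score, strand, _phase, attrs = cols
    attributes = {}
    for pair in attrs.split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attributes[key] = unquote(value)  # GFF3 values may be URL-escaped
    return Feature(seqid, ftype, int(start), int(end), strand, attributes)


# Made-up example record, for illustration only.
example = "contig_1\texample\tCDS\t120\t200\t.\t+\t0\tID=gene_0001;product=delta toxin"
feature = parse_gff3_line(example)
print(feature.type, feature.start, feature.end, feature.attributes["product"])
```

Real GFF3 files also contain comment and directive lines starting with `#`, which this sketch does not handle.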

How do we annotate? Many different ways:

  • sequence: does it match known sequences in databases?

  • sequence structure: e.g. does it look like an exon (start and stop codons)?

  • use other data: e.g. do lab experiments to investigate biological function

  • You can annotate features by looking at similarities with known sequences from international databases.
  • Some tools annotate features on a genome by seeking motifs corresponding to known structures (for example gene or exon starts and stops).
  • Some lab experiments can help annotate specific regions of a genome, even though they are often much more expensive than automatic annotation.
  • The lab experiments can provide certainty about function, where automatic annotation is more of a guess.
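
To make the "sequence structure" approach concrete, here is a deliberately naive sketch that scans the forward strand for open reading frames, defined here as an ATG followed by the first in-frame stop codon. The function name `find_orfs` and its `min_codons` parameter are assumptions made for this example. Real gene finders such as Prodigal are far more sophisticated (both strands, alternative start codons, statistical models), so this only illustrates the idea.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 3):
    """Return (start, end) pairs, 0-based half-open, for forward-strand ORFs.

    An ORF here runs from an ATG to the first in-frame stop codon,
    stop codon included. min_codons counts the start and stop codons too.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] == "ATG":
                # Walk codon by codon until a stop codon or the sequence end.
                j = i + 3
                while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
                    j += 3
                if j + 3 <= len(seq):  # an in-frame stop codon was found
                    if (j + 3 - i) // 3 >= min_codons:
                        orfs.append((i, j + 3))
                    i = j + 3  # resume scanning after the stop codon
                    continue
            i += 3
    return orfs


toy_sequence = "CCATGAAATTTTAGGG"  # made-up sequence, for illustration only
print(find_orfs(toy_sequence))
```

On the toy sequence, the only reported ORF is `ATGAAATTTTAG`, at positions 2 to 14.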

Prokka

annotation pic

  • Prokka is a pipeline that runs several other tools to annotate prokaryotic genomes.
  • The input is the assembly of the genome in Fasta format.
  • Prokka runs Aragorn to annotate transfer RNAs.
  • Ribosomal RNAs are annotated with RNAmmer.
  • Infernal uses the Rfam database to annotate non-coding RNAs.
  • Finally, Prodigal predicts the protein-coding genes.
  • Each coding sequence is then compared to the SwissProt sequence database using BLAST, and to the TIGR and Pfam motif databases using HMMER3.
  • SignalP is also run to detect signal peptides in each predicted coding sequence.
  • The final result of the whole Prokka pipeline is a set of GFF3, GBK and ASN1 files.
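
These slides accompany a Galaxy tutorial, where Prokka is run through the Galaxy interface. For readers curious about what a command-line run might look like, here is a small sketch that only builds and prints a Prokka command without executing it. The file name `contigs.fasta`, the output directory `prokka_out`, the prefix `sample1`, and the helper `build_prokka_command` are all hypothetical; the options shown (`--outdir`, `--prefix`, `--kingdom`, `--cpus`) are standard Prokka options, but check `prokka --help` for your installed version.

```python
import shlex


def build_prokka_command(assembly_fasta, outdir, prefix, kingdom="Bacteria", cpus=1):
    """Build the argument list for a basic Prokka run (nothing is executed)."""
    return [
        "prokka",
        "--outdir", outdir,    # directory where the result files are written
        "--prefix", prefix,    # base name shared by all output files
        "--kingdom", kingdom,  # annotation mode, e.g. Bacteria or Archaea
        "--cpus", str(cpus),   # number of CPUs to use
        assembly_fasta,        # input: the assembled genome in FASTA format
    ]


# Hypothetical file and directory names, for illustration only.
cmd = build_prokka_command("contigs.fasta", "prokka_out", "sample1")
print(shlex.join(cmd))
```

If Prokka is installed, the list could be passed to `subprocess.run(cmd, check=True)` to launch the pipeline.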

More information

Galaxy Training Network Slides: Introduction to Genome Annotation

  • More information is available in the "Introduction to Genome Annotation" slides.

Key points

  • Prokka is a useful tool to annotate a bacterial genome.

  • JBrowse can be used to inspect the annotation of a genome.


Thank You!

This material is the result of a collaborative work. Thanks to the Galaxy Training Network and all the contributors!

Galaxy Training Network

Tutorial Content is licensed under Creative Commons Attribution 4.0 International License.
