name: inverse
layout: true
class: center, middle, inverse
---
# Genome annotation with Prokka
Anna Syme
Torsten Seemann
Simon Gladman
PURL: gxy.io/GTN:S00065
Tip: press `P` to view the presenter notes | Use arrow keys to move between slides
???

Presenter notes contain extra information which might be useful if you intend to use these slides for teaching.

Press `P` again to switch presenter notes off.

Press `C` to create a new window where the same presentation will be displayed. This window is linked to the main window. Changing slides on one will cause the slide to change on the other.

Useful when presenting.

---

## Requirements

Before diving into this slide deck, we recommend that you have a look at:

- [Introduction to Galaxy Analyses](/training-material/topics/introduction)

---

### <i class="far fa-question-circle" aria-hidden="true"></i><span class="visually-hidden">question</span> Questions

- How to annotate a bacterial genome?
- How to visualize annotated genomic features?

---

### <i class="fas fa-bullseye" aria-hidden="true"></i><span class="visually-hidden">objectives</span> Objectives

- Load genome into Galaxy
- Annotate genome with Prokka
- View annotations in JBrowse

---

<!-- To show speaker notes during a presentation: press c to clone the slides (opens a new window) on one window, press p to show speaker notes display the other window -->

### Overview

- What is genome annotation?
- Tools for genome annotation
- The tool "Prokka"

???

- In these slides, we will learn what genome annotation is and which tools can be used for it.
- We will describe in detail a tool called Prokka.

---

### What is Annotation?

- Classifying and describing parts of the genome sequence
- Annotations are biological or other features on a genome, *e.g.*
  - a ribosome binding site: a biological feature
  - a sequence of TTTTTT: may/may not be biological but could be interesting
- We can name features by type and location, *e.g.* gene, pseudogene, repeat
- We can hypothesise functions, *e.g.* antioxidant activity

???

- Annotating a genome means positioning features along the sequence of a genome.
- Those features can be anything one can find in a genome sequence: genes, but also binding sites, for example.
- When a feature, such as a gene, is positioned, you can add information about its function.
- This operation is called "functional annotation".

---

### First: assemble the genome

![annotation pic](../../images/contigs.png)

???

- Before annotating a genome, you need to assemble it.
- A high-quality assembly makes it easier to produce a good-quality annotation.

---

### Then: annotate

![annotation pic](../../images/seq_annotation.jpg)

???

- Once you have a good genome sequence, you can annotate it.
- In this example, there is a gene coding for a delta toxin.
- The ribosome binding site is shown in red, and the coding sequence of the gene is shown in green.

---

### Annotation

![annotation pic](../../images/annotation_details.png)

???

- For each feature annotated on a genome, you can get its position, its type, and some information about its function or how it is expressed.

---

### How do we annotate?

Many different ways:

- sequence: does it match known sequences in databases?
- sequence structure: *e.g.* does it look like an exon (start and stop codons)?
- use other data: *e.g.* do lab experiments to investigate biological function

???

- You can annotate features by looking at similarities with known sequences from international databases.
- Some tools annotate features on a genome by seeking motifs corresponding to a known structure (for example, the start or stop of a gene or exon).
- Some lab experiments can help annotate specific regions of a genome, even though they are often much more expensive than automatic annotation.
- Lab experiments can provide certainty about function, whereas automatic annotation is more of a guess.

---

### Prokka

![annotation pic](../../images/prokka_pipeline.png)

???

- Prokka is a pipeline that runs several other tools to annotate prokaryotic genomes.
- The input is the assembly of the genome in FASTA format.
- Prokka runs Aragorn to annotate transfer RNAs.
- Ribosomal RNAs are annotated with RNAmmer.
- Infernal uses the Rfam database to annotate non-coding RNAs.
- Finally, Prodigal annotates coding genes.
- Each coding sequence is then compared to the SwissProt sequence database using BLAST, and to the TIGR and Pfam motif databases using HMMER3.
- SignalP is also run to detect signal peptides in each predicted coding sequence.
- The final result of the whole Prokka pipeline is a set of GFF3, GBK and ASN1 files.

---

### More information

Galaxy Training Network Slides: [Introduction to Genome Annotation](../introduction/slides.html)

???

- More information is available in the "Introduction to Genome Annotation" slides.

---

### <i class="fas fa-key" aria-hidden="true"></i><span class="visually-hidden">keypoints</span> Key points

- Prokka is a useful tool to annotate a bacterial genome.
- JBrowse can be used to inspect the annotation of a genome.

---

## Thank You!

This material is the result of collaborative work. Thanks to the [Galaxy Training Network](https://training.galaxyproject.org) and all the contributors!
Authors: Anna Syme, Torsten Seemann, Simon Gladman

Tutorial Content is licensed under Creative Commons Attribution 4.0 International License.
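
---

### Example: reading a GFF3 annotation file

The Prokka slide notes that the pipeline ends with a set of GFF3, GBK and ASN1 files, and the "Annotation" slide notes that each feature has a position, a type, and some functional information. The sketch below illustrates how those three things might be read from GFF3 text in Python. It is a minimal sketch under stated assumptions: the embedded records, contig name, tool labels and product names are invented for illustration and are not real Prokka output, and the function names `parse_attributes`, `read_features` and `count_feature_types` are hypothetical helpers, not part of Prokka or Galaxy. It relies only on the general GFF3 layout of nine tab-separated columns, with an optional `##FASTA` section holding the sequences.

```python
from collections import Counter

# Invented miniature GFF3 text standing in for an annotation file.
# None of these records come from a real genome.
EXAMPLE_GFF3 = "\n".join([
    "##gff-version 3",
    "##sequence-region contig_1 1 5000",
    "contig_1\tProdigal\tCDS\t100\t400\t.\t+\t0\tID=gene_1;product=hypothetical protein",
    "contig_1\tAragorn\ttRNA\t900\t975\t.\t-\t.\tID=gene_2;product=tRNA-Ala",
    "contig_1\tProdigal\tCDS\t1200\t2000\t.\t+\t0\tID=gene_3;product=delta toxin",
    "##FASTA",
    ">contig_1",
    "ATGCATGC",
])


def parse_attributes(field):
    """Split the ninth GFF3 column ('key=value;key=value') into a dict.

    Percent-encoded characters are left as-is in this sketch.
    """
    attributes = {}
    for pair in field.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[key] = value
    return attributes


def read_features(gff3_text):
    """Return one dict per feature line, stopping at the ##FASTA section."""
    features = []
    for line in gff3_text.splitlines():
        if line.startswith("##FASTA"):
            break  # everything after this marker is sequence, not features
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        columns = line.split("\t")
        if len(columns) != 9:
            continue  # not a well-formed feature line
        seqid, source, ftype, start, end, _score, strand, _phase, attrs = columns
        features.append({
            "seqid": seqid,
            "source": source,
            "type": ftype,
            "start": int(start),
            "end": int(end),
            "strand": strand,
            "attributes": parse_attributes(attrs),
        })
    return features


def count_feature_types(features):
    """Count how many features of each type (CDS, tRNA, ...) were annotated."""
    return Counter(feature["type"] for feature in features)


if __name__ == "__main__":
    parsed = read_features(EXAMPLE_GFF3)
    for feature_type, count in count_feature_types(parsed).items():
        print(feature_type, count)
```

To try this on a real file, the text could be loaded with `open(path).read()` and passed to `read_features`. For anything beyond a quick look, an established GFF3 parsing library is likely a safer choice than this hand-written sketch.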