Before diving into this slide deck, we recommend you to have a look at:
The main reason we end up building a tree is that searching tree space for an optimal one takes too long when the number of taxa gets large.
n | # trees | yes, but how much is that really? |
---|---|---|
3 | 3 | enumerable by hand |
4 | 15 | enumerable by hand |
5 | 105 | enumerable by hand on a rainy day |
6 | 945 | enumerable by computer |
7 | 10395 | still searchable very quickly on computer |
8 | 135135 | a bit more than the number of hairs on your head |
9 | 2027025 | population of Sydney living west of Parramatta |
10 | 34459425 | ≈ upper limit for exhaustive searching; about the number of possible combinations of numbers in the National Lottery |
20 | 8.2×10²¹ | ≈ upper limit for branch-and-bound searching |
48 | 3.21×10⁷⁰ | ≈ number of particles in the universe |
136 | 2.11×10²⁶⁷ | number of trees to choose from in the "Out of Africa" data¹ |
¹ Vigilant et al., 1991
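The counts in the table match the double-factorial formula (2n − 3)!! for rooted, binary, leaf-labelled trees. The sketch below reproduces the exact counts for small n; the function name `num_rooted_trees` is just an illustrative choice.

```python
def num_rooted_trees(n: int) -> int:
    """Number of rooted, binary, leaf-labelled trees on n taxa: (2n - 3)!!"""
    if n < 2:
        raise ValueError("need at least two taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3.
    for k in range(3, 2 * n - 2, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (3, 4, 5, 6, 7, 8, 9, 10, 20):
        print(n, num_rooted_trees(n))
```

Each extra taxon multiplies the count by the next odd number, which is why exhaustive search stops being practical so quickly.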
NGC 2207 and IC 2163 galaxies colliding.
Source: https://commons.wikimedia.org/wiki/Commons:Featured_pictures/Astronomy
Some parts of a phylogeny can be confidently accepted: when there are two species much more similar to each other than they are to any other species, we can confidently say that they are likely to be each other's closest relatives, in the set of species of interest.
If molecular sequences evolve at a nice steady rate - the "molecular clock" hypothesis - and if there's neither too little nor too much change, this can be good enough.
Before we talk about building a tree from distances, we need to think about how distances are reflected by trees.
The most natural way to infer distances from trees is by adding the lengths of branches between each pair of nodes.
Such distances are called patristic distances (I don't know why).
Distance matrix:
  | d | e | f | g | h |
---|---|---|---|---|---|
d | 0 | 0.07 | 0.1* | 0.1 | 0.1 |
e | 0.07 | 0 | 0.1 | 0.1 | 0.1 |
f | 0.1 | 0.1 | 0 | 0.04 | 0.06 |
g | 0.1 | 0.1 | 0.04 | 0 | 0.06 |
h | 0.1 | 0.1 | 0.06 | 0.06 | 0 |
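As a minimal sketch of how patristic distances arise, the code below sums branch lengths along the path between two tips. The tree and its branch lengths are hypothetical: they were chosen here so that the result reproduces the matrix above, and the internal node names (`de`, `fg`, `fgh`, `root`) are made up for illustration.

```python
from itertools import combinations

# Hypothetical rooted tree, chosen so that it reproduces the matrix above.
# Each node maps to (parent, length of the branch leading to that parent).
PARENT = {
    "d": ("de", 0.035), "e": ("de", 0.035),
    "f": ("fg", 0.02), "g": ("fg", 0.02),
    "fg": ("fgh", 0.01), "h": ("fgh", 0.03),
    "de": ("root", 0.015), "fgh": ("root", 0.02),
}


def path_to_root(node):
    """Return {ancestor: distance from node} for node and all its ancestors."""
    dist = 0.0
    seen = {node: 0.0}
    while node in PARENT:
        node, length = PARENT[node]
        dist += length
        seen[node] = dist
    return seen


def patristic(a, b):
    """Sum of branch lengths on the path between tips a and b."""
    up_a, up_b = path_to_root(a), path_to_root(b)
    # The path climbs to the closest shared ancestor and descends again,
    # so the closest shared ancestor gives the smallest total.
    return min(up_a[x] + up_b[x] for x in up_a if x in up_b)


if __name__ == "__main__":
    for a, b in combinations("defgh", 2):
        print(a, b, round(patristic(a, b), 3))
```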
Any ultrametric tree satisfies the three-point condition (for rooted trees): for any three tips x, y, z, the two largest of the pairwise distances D(x, y), D(x, z), D(y, z) are equal.
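The three-point condition can be checked directly on a distance matrix. The sketch below tests every triple of tips in the matrix shown earlier; the function name `is_ultrametric` and the tolerance `tol` are illustrative assumptions.

```python
from itertools import combinations

TIPS = ["d", "e", "f", "g", "h"]
D = [
    [0.00, 0.07, 0.10, 0.10, 0.10],
    [0.07, 0.00, 0.10, 0.10, 0.10],
    [0.10, 0.10, 0.00, 0.04, 0.06],
    [0.10, 0.10, 0.04, 0.00, 0.06],
    [0.10, 0.10, 0.06, 0.06, 0.00],
]


def is_ultrametric(dist, tol=1e-9):
    """True if every triple of tips satisfies the three-point condition."""
    for x, y, z in combinations(range(len(dist)), 3):
        # Sort the three pairwise distances; the two largest must agree.
        _, middle, largest = sorted([dist[x][y], dist[x][z], dist[y][z]])
        if abs(largest - middle) > tol:
            return False
    return True


if __name__ == "__main__":
    print(is_ultrametric(D))
```

Changing a single entry, for example setting D(f, h) to 0.09, breaks the condition for the triple f, g, h.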
This model has a single parameter and assumes that the base frequencies are each 25%: $\pi_A = \pi_G = \pi_C = \pi_T = 0.25$
The rate matrix looks like this:
where the asterisk is a short-hand to make the row-sums equal 0.
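For reference, one common way to write a one-parameter rate matrix with equal base frequencies is sketched here. The symbol $\alpha$ for the single rate is an assumption and may differ from the parameterisation used on the original slide. Rows and columns are ordered A, G, C, T, and each asterisk stands for $-3\alpha$ so that the row sums to 0.

$$
Q = \begin{pmatrix}
* & \alpha & \alpha & \alpha \\
\alpha & * & \alpha & \alpha \\
\alpha & \alpha & * & \alpha \\
\alpha & \alpha & \alpha & *
\end{pmatrix}
$$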
Under this model the expected number of substitutions between two sequences with a p-distance of p is
$$\hat{d} = -\frac{3}{4}\ln\left(1 - \frac{4}{3}p\right)$$
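A small sketch of the correction above: compute the p-distance as the proportion of differing sites between two aligned sequences, then apply the formula. The sequences are made up for illustration, and the function names are not from any particular library.

```python
import math


def p_distance(seq1: str, seq2: str) -> float:
    """Proportion of aligned sites at which the two sequences differ."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be aligned, equal length, non-empty")
    mismatches = sum(a != b for a, b in zip(seq1, seq2))
    return mismatches / len(seq1)


def corrected_distance(p: float) -> float:
    """Expected substitutions per site: -3/4 * ln(1 - 4/3 * p)."""
    if not 0 <= p < 0.75:
        # The logarithm is undefined once p reaches 3/4.
        raise ValueError("p must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


if __name__ == "__main__":
    p = p_distance("ACGTACGTAC", "ACGTTCGTAA")  # hypothetical sequences
    print(p, corrected_distance(p))
```

The corrected distance is always at least as large as p, because some sites will have changed more than once and the raw count misses those hidden substitutions.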
This model allows for variation in nucleotide frequencies $\pi_A, \pi_G, \pi_C, \pi_T$ as well as different transition/transversion rates using parameter $\kappa$:
This model also allows for a correction to turn relative observed numbers of substitutions between the different bases into expected total number of substitutions between two sequences, but it is much more complex.
Neighbo[u]r-Joining solves this problem by accounting for the net divergence of each node from the rest, so if the distances are tree-like, even if they're not ultrametric, it will get the tree right.
The formula for net divergence, with n taxa (i.e., an n × n distance matrix) is
$$r_i = \frac{1}{n-2}\sum_{j \neq i} D(i,j)$$
And the adjusted distance becomes
$$D^{*}(i,j) = D(i,j) - r_i - r_j$$
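The two formulas above can be sketched in a few lines. The four-taxon matrix is hypothetical: it was built here from a tree ((a, b), (c, d)) with long branches leading to b and d, so the smallest raw distance (a to c) is not between true neighbours. Function names are illustrative only.

```python
from itertools import combinations

# Hypothetical tree-like but non-ultrametric distances for taxa a, b, c, d.
TAXA = ["a", "b", "c", "d"]
D = [
    [0, 4, 3, 5],
    [4, 0, 5, 7],
    [3, 5, 0, 4],
    [5, 7, 4, 0],
]


def net_divergence(dist):
    """r_i = (1 / (n - 2)) * sum over j != i of D(i, j)."""
    n = len(dist)
    return [sum(dist[i][j] for j in range(n) if j != i) / (n - 2)
            for i in range(n)]


def adjusted_distances(dist):
    """D*(i, j) = D(i, j) - r_i - r_j for every pair i < j."""
    r = net_divergence(dist)
    return {(i, j): dist[i][j] - r[i] - r[j]
            for i, j in combinations(range(len(dist)), 2)}


def pair_to_join(dist):
    """Pair with the smallest adjusted distance (first one on ties)."""
    adj = adjusted_distances(dist)
    return min(adj, key=adj.get)


if __name__ == "__main__":
    i, j = pair_to_join(D)
    print("join:", TAXA[i], TAXA[j])
```

In this example the raw minimum would pair a with c, while the adjusted distances favour (a, b) and (c, d) equally, which are the two true neighbour pairs of the assumed tree.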
This material is the result of a collaborative work. Thanks to the Galaxy Training Network and all the contributors!
Tutorial Content is licensed under Creative Commons Attribution 4.0 International License.