Phylogenetics
Here we show you how you can select sequences, turn them into a multiple sequence alignment, and then turn this alignment into a tree (well, okay, a diagram that looks like a tree). We show you three methods:
ClustalW: This is an easy method that you must use as a black box. You
cannot really control ClustalW, but it is reassuring to know that it uses
Neighbor Joining, a fairly well-established tree method.
Phylip: A more sophisticated method that enables you to control every
parameter that plays a part in the making of your tree.
PhyML: A very accurate reconstruction method based on maximum likelihood, an approach that specialists often consider to be the most accurate. As its name implies, a maximum likelihood method tries to reconstruct
the tree that is most likely to correspond to your alignment.
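To take some of the mystery out of the black box, here is a minimal sketch of the Neighbor Joining idea in Python. It is not ClustalW's actual code: the function name neighbor_joining, the toy distance matrix, and the taxon labels a to e are all assumptions made for illustration, and the sketch expects at least three taxa and a symmetric matrix of pairwise distances computed beforehand from an alignment.

```python
from itertools import combinations


def neighbor_joining(names, matrix):
    """Build an unrooted tree (as a Newick string) from a distance matrix.

    Illustrative sketch only: assumes at least three taxa and a
    symmetric matrix where matrix[i][j] is the distance between
    names[i] and names[j].
    """
    # Store distances by unordered pair so merged nodes can be added freely.
    d = {frozenset((names[i], names[j])): matrix[i][j]
         for i, j in combinations(range(len(names)), 2)}
    nodes = list(names)

    def dist(a, b):
        return d[frozenset((a, b))]

    while len(nodes) > 3:
        n = len(nodes)
        # r[a] is the total distance from a to every other active node.
        r = {a: sum(dist(a, b) for b in nodes if b != a) for a in nodes}
        # Join the pair that minimises Q = (n - 2) * d(i, j) - r(i) - r(j).
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * dist(*p) - r[p[0]] - r[p[1]])
        dij = dist(i, j)
        # Branch lengths from i and j to their new parent node.
        li = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))
        lj = dij - li
        # The merged node is named by its own Newick subtree.
        u = f"({i}:{li:g},{j}:{lj:g})"
        # Distance from the new node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = (dist(i, k) + dist(j, k) - dij) / 2
        nodes = [k for k in nodes if k not in (i, j)] + [u]

    # Connect the last three nodes at a central three-way branch point.
    x, y, z = nodes
    lx = (dist(x, y) + dist(x, z) - dist(y, z)) / 2
    ly = (dist(x, y) + dist(y, z) - dist(x, z)) / 2
    lz = (dist(x, z) + dist(y, z) - dist(x, y)) / 2
    return f"({x}:{lx:g},{y}:{ly:g},{z}:{lz:g});"


# Toy distances between five hypothetical taxa.
toy_names = ["a", "b", "c", "d", "e"]
toy_matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(neighbor_joining(toy_names, toy_matrix))
```

The result is a Newick string, the parenthesized text format that most tree viewers can read. Real programs add many refinements (distance corrections, bootstrapping, and so on) that this sketch leaves out.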
Finding Out What Phylogenetic Trees Can Do for You
The purpose of phylogeny is to reconstruct the history of life and explain the present diversity of living creatures. This history can be represented as a huge genealogic tree (the tree of life). The underlying principle of phylogeny is to group living creatures according to their level of similarity. In this context, we assume that the more similar two species are (such as human and ape), the more recently they shared a common ancestor. Phylogeny is not a new subject, and whether you trace the birth of modern biology back to Darwin and The Origin of Species or to Aristotle and his notion of categories, you can’t escape this daunting fact: Biology is very much about classifying, and the best means of classification we have is phylogeny.
Phylogenetics is a special kind of phylogeny that relies on the comparison of
equivalent genes coming from several species for reconstructing the genealogic
tree of these species and finding out who is the closest relative of whom in
the family. If necessary, you can also apply phylogenetic methods to the various genes of a gene family to reconstruct the history of the gene family by
the same means. (Take note that these trees make sense only if you believe in
evolution!)
The molecular stories you can uncover with phylogenetics are incredibly
rich. In fact, we find it difficult not to see a parallel between the destiny of
famous families (such as the Medicis, the Borgias, or the Kennedys) and the
fates of gene families. When we unravel the chain of events that makes the
story of a protein family, we find tales of mutations and deletions, duplications or speciation, loss and gain of function, inactivation, and all the other
traumatic events that shaped the world as it is today. Nothing is ever taken
for granted while life evolves!
Phylogenetics is here to let you discover all this. Phylogenetics is not a technique, nor even a discipline; phylogenetics is a science and a major one. The mere task of laying out the general ideas of phylogenetics is way beyond the scope of this chapter, so if this subject is new to you, we urge you to consult some of the excellent textbooks available on this subject.
In the context of bioinformatic analysis, there are three major reasons why
you may want to use phylogenetics:
Determining the closest relatives of the organism that you’re interested in: For instance, if you’re studying a new bacterium, you can
sequence its ribosomal RNA and place it on a phylogenetic tree computed
with all known ribosomal RNAs. This can give you a fairly good
idea of who this bacterium really is.
Discovering the function of a gene: If you’re studying a gene, you can
use phylogenetic trees to be sure that the gene you’re interested in is orthologous (more about that in a minute) to another well-characterized
gene in another species.
Retracing the origin of a gene: Most genes within a genome travel
together through evolutionary time. However, from time to time, individual
genes may jump from one species to another, for instance by piggybacking
on a viral infection. Phylogenetic trees are a great way to reveal
such events, which are called horizontal (or lateral) transfers.
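As a rough illustration of the first use case, the sketch below ranks a few reference sequences by how much they differ from a new one. It is only a first approximation under stated assumptions: the sequences are invented, the names species_A to species_C are hypothetical, all sequences are assumed to come from the same alignment (equal length, with "-" for gaps), and a real analysis would build a proper tree rather than rely on raw distances.

```python
def p_distance(a, b):
    """Fraction of gap-free aligned columns at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    # Ignore any column where either sequence has a gap.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def closest_relative(query, references):
    """Return the name of the reference with the smallest p-distance."""
    return min(references, key=lambda name: p_distance(query, references[name]))


# Invented, already aligned fragments (hypothetical data).
references = {
    "species_A": "ACGTACGTAC",
    "species_B": "ACGTTCGTAC",
    "species_C": "TCGAACGGAC",
}
query = "ACGTTCGTAA"

for name, seq in references.items():
    print(name, p_distance(query, seq))
print("closest:", closest_relative(query, references))
```

The same pairwise distances are the raw material that distance-based tree methods start from.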
Choosing sequences to make either a gene tree or a species tree
Homologous genes are genes that derive from a common ancestor. They can
have three types of relationships:
Orthologs: Orthologs are homologs separated only by a speciation event. Speciation is the phenomenon during which a common ancestor gives birth to two subgroups that
slowly drift away from their common genetic makeup to become distinct
species. Assuming that the genomes are not rearranged in the two new
species, two genes are orthologous when they correspond to the same
ancestral gene in the ancestral genome. Biologists usually expect
orthologs to have similar functions and structure.
Paralogs: Paralogs are homologs separated by a duplication event,
meaning that within a genome, a gene was duplicated. One of the duplicates may have kept the original function while the other duplicate
could have acquired a new function. You can expect paralogs to have different but related functions.
Xenologs: Xeno comes from a Greek word that means “foreigner.” Xenologs result from a lateral transfer between two organisms, that is, a direct DNA transfer between two species. This means that one of the species contains a gene that does not have the same history as the genome in which it is inserted. A typical case of lateral transfer (and therefore of xenologs) is the acquisition of the isoleucyl-tRNA synthetase from their host by several bacteria. The isoleucyl-tRNA synthetase is a protein involved in the synthesis of other proteins, and its acquisition seems to help these bacteria become antibiotic resistant. When this happens, the newly acquired isoleucyl-tRNA synthetase is a xenolog of the other tRNA synthetases contained in the bacteria.
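One common way to put the first two definitions to work is to label each branching point of a gene tree as either a speciation or a duplication, and then look at the branching point where two genes last shared an ancestor. The sketch below does just that. It rests on several assumptions: the Node class, the event labels, and the gene names (human_A, mouse_B, and so on) are hypothetical, the labels are supplied by hand rather than inferred, and xenologs are left out because detecting transfers takes more than a label.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A gene-tree node; internal nodes carry the event that split them."""
    event: str = "leaf"  # "speciation", "duplication", or "leaf"
    name: str = ""       # gene name, used for leaves only
    children: list = field(default_factory=list)


# The event at the last common ancestor decides the relationship.
RELATION = {"speciation": "orthologs", "duplication": "paralogs"}


def leaves(node):
    """Return the names of all genes found below this node."""
    if not node.children:
        return [node.name]
    return [name for child in node.children for name in leaves(child)]


def classify_pairs(node, out=None):
    """Map every pair of genes to 'orthologs' or 'paralogs'."""
    out = {} if out is None else out
    kids = node.children
    # Genes in different child subtrees last met at this node.
    for a in range(len(kids)):
        for b in range(a + 1, len(kids)):
            for x in leaves(kids[a]):
                for y in leaves(kids[b]):
                    out[frozenset((x, y))] = RELATION[node.event]
    for child in kids:
        classify_pairs(child, out)
    return out


# Hypothetical family: one ancient duplication (copies A and B),
# followed by a human/mouse speciation in each copy.
family = Node("duplication", children=[
    Node("speciation", children=[Node(name="human_A"), Node(name="mouse_A")]),
    Node("speciation", children=[Node(name="human_B"), Node(name="mouse_B")]),
])

for pair, relation in classify_pairs(family).items():
    print(sorted(pair), relation)
```

In this toy family, human_A and mouse_A come out as orthologs, while human_A and human_B (and also human_A and mouse_B) come out as paralogs, because their last common ancestor is the duplication.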
When you select a group of homologous genes to make a phylogenetic tree,
you always make what biologists call a gene tree. It is a tree that tells the
story of the genes it contains.
If you select all the paralogous members of a large human gene family, your
gene tree tells the story of this gene family only. You can only use it to reconstruct the chain of duplications that led from one single ancestral gene to the
current situation.
If you select a group of genes that are all orthologous from different species,
the gene tree you get looks very much like a species tree — which lets you
reconstruct the speciations that occurred while the species you’re looking at
(or their ancestors) were diverging. The best example of this type of gene
tree is the ribosomal RNA phylogenetic tree that biologists use to reconstruct
the big tree of life. Ribosomal RNA genes exist in every species and are
clearly orthologous between species.
