Tutorial

Entering Data

The program accepts two data formats: a series of DNA sequences or a matrix containing calculated distances between DNA sequences.

The ‘DNA sequence’ option should be selected when a series of DNA sequences is to be compared without prior knowledge of the distances between them. Enter the data in the format label:sequence, with each label:sequence pair on a separate line, as demonstrated below.

 
a:GATCGTACCG
b:AATCACCGAG
c:GATCGTACCG
d:CATCGCGTAC
e:GATCGTACCA
f:TTACGTACCG
g:GGATCGTACC

Lines must not contain any spaces, and all sequences must be the same length. For the program to work correctly, a colon must separate each label from its sequence.
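As a rough illustration of these rules, the following Python sketch checks sequence input before it is entered. It is not part of the program; the function name parse_sequences and its error messages are hypothetical, and it assumes blank lines can be ignored.

```python
def parse_sequences(text):
    """Parse 'label:sequence' lines into a list of (label, sequence) pairs.

    Hypothetical helper: it only checks the rules stated in this tutorial
    (a colon separator, no spaces, equal-length sequences).
    """
    records = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        if " " in line:
            raise ValueError(f"line {line_no}: spaces are not allowed")
        # partition splits on the first colon only
        label, sep, sequence = line.partition(":")
        if not sep or not label or not sequence:
            raise ValueError(f"line {line_no}: expected label:sequence")
        records.append((label, sequence))

    # Every sequence must have the same length
    lengths = {len(sequence) for _, sequence in records}
    if len(lengths) > 1:
        raise ValueError("all sequences must be the same length")
    return records
```

Applied to the seven example lines above, parse_sequences returns seven (label, sequence) pairs, each with a ten-character sequence.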

The ‘Matrix’ option should be selected when the distance between each pair of taxa is already known. Each line corresponds to one row of the distance matrix and must be entered in the format label distances: the label comes first, followed by the distance values for that row, with the label and each value separated by a space and each row placed on a separate line, as demonstrated below.


a  0  7  7 13 14 11 12
b  7  0  2 12 13 10 11
c  7  2  0 12 13 10 11
d 13 12 12  0  5 10 11
e 14 13 13  5  0 11 12
f 11 10 10 10 11  0  3
g 12 11 11 11 12  3  0

For the program to run properly, labels must not contain spaces; if a space is needed, use an underscore instead.
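The matrix format can be checked in a similar way. The sketch below is a hypothetical helper, not the program's own parser; the name parse_matrix is an assumption, as is the check that each row holds one distance per label (which follows from each line being a row of a square distance matrix).

```python
def parse_matrix(text):
    """Parse 'label d1 d2 ...' lines into (labels, rows).

    Hypothetical helper: rows is a list of lists of floats, one per label.
    """
    labels, rows = [], []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        fields = raw.split()  # splits on any run of whitespace
        if not fields:
            continue  # assumption: blank lines are ignored
        label, values = fields[0], fields[1:]
        try:
            row = [float(value) for value in values]
        except ValueError:
            # A label containing a space ends up here, because its second
            # word is read as a distance and is not a number.
            raise ValueError(
                f"line {line_no}: distances must be numbers "
                "(use an underscore instead of a space in labels)"
            ) from None
        labels.append(label)
        rows.append(row)

    # Assumption: the matrix is square, one distance per label in every row
    for label, row in zip(labels, rows):
        if len(row) != len(labels):
            raise ValueError(
                f"row {label!r}: expected {len(labels)} distances, got {len(row)}"
            )
    return labels, rows
```

Because str.split() with no arguments treats any run of whitespace as a single separator, this sketch also accepts the extra spaces used to align the columns in the example above.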
