|
We asked what the best general way would be to reduce the dimensionality of the data in MSAs. After performing the data reduction, we realised that the two-dimensional representation matches EV plots rather well. |
We decided that the MSAs for the ~27K human sequences in the HSSP database would make a sufficiently large, diverse, and representative dataset.
|
Figure 4. The movie explains the method used to reduce the 20-dimensional alignment data to just two dimensions. (Click on the image to start the movie.) |
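To make the idea of 20-dimensional alignment data concrete, here is a minimal sketch in Python. It assumes that each MSA column is described by the frequencies of the 20 standard amino acids, and it uses a plain PCA (via SVD) as a stand-in for the reduction to two dimensions. The PCA step is an assumption made for illustration only; it is not necessarily the method shown in the movie. The names `column_frequencies`, `project_to_2d`, and `toy_msa` are hypothetical.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def column_frequencies(msa):
    """Return an (n_columns, 20) array of residue frequencies per column.

    Gaps and non-standard characters are ignored. A column without any
    standard residue is left as all zeros.
    """
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    n_cols = len(msa[0])
    counts = np.zeros((n_cols, len(AMINO_ACIDS)))
    for seq in msa:
        for col, aa in enumerate(seq):
            if aa in index:
                counts[col, index[aa]] += 1
    totals = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)


def project_to_2d(freqs):
    """Project frequency vectors onto their first two principal components.

    Assumption: PCA is used here purely as an illustrative stand-in.
    """
    centered = freqs - freqs.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


# Hypothetical toy alignment: four sequences, four columns.
toy_msa = ["ACDA", "ACEA", "AGDA", "A-DC"]
freqs = column_frequencies(toy_msa)   # shape (4, 20)
coords = project_to_2d(freqs)         # shape (4, 2)
```

Each row of `coords` is one alignment column placed in the plane, which is the kind of two-dimensional representation that can then be compared with other per-position plots.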
The whole process of encoding and decoding can be done in two very different ways:
|
Figure 5. The result of encoding frequency vectors sorted by frequency. |
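One possible reading of "frequency vectors sorted by frequency" is that the entries of each vector are reordered from most to least frequent, so that the residue identities are discarded and only the shape of the distribution remains. The sketch below illustrates that reading; it is an assumption, and the function name `sort_by_frequency` and the shortened four-entry vectors are hypothetical.

```python
import numpy as np


def sort_by_frequency(freqs):
    """Sort the entries of each frequency vector in descending order.

    After sorting, position 0 holds the frequency of the most common
    residue, whichever residue that was.
    """
    freqs = np.asarray(freqs, dtype=float)
    return np.sort(freqs, axis=1)[:, ::-1]


# Two hypothetical vectors, shortened from 20 to 4 entries for readability.
example = np.array([
    [0.1, 0.6, 0.0, 0.3],
    [0.3, 0.0, 0.1, 0.6],
])
sorted_example = sort_by_frequency(example)
```

Under this reading, two columns with the same degree of conservation but different residues end up with identical sorted vectors.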
On the next page you will find a typed and a spoken seminar, both of which discuss these results.