|
This section starts with a seminar on the method and results as discussed in the article. On the next page we will start dreaming out loud about evolution, correlated mutations, signalling paths, and how EV analysis might even help predict structures from MSAs. |
|
|
Figure 12. In the second part of this seminar we explain how the deep-learning dimensionality reduction of the variability patterns of MSA columns was performed. |
|
|
Figure 13. When deep learning is mentioned as a method to extract generalisations from large data sets, people tend to think immediately of (very big) neural networks. We used another, simpler method. |
|
|
Figure 18. This vector length reduction is done for 20->15, 15->10, 10->5, and 5->2. To train the ~1000 parameters in the matrices efficiently, we need a few tricks. If all values fall between 0.0 and 1.0, we can use cross-entropy to measure the dissimilarity between the original and reconstructed vectors, which allows for very fast convergence. To make the matrix transformations produce values that fall nicely between 0 and 1, we use two methods: batch normalization and the sigmoid activation function. Normally, the outputs of the matrix operations can be both negative and positive, and they follow bell-curve distributions whose centers and widths depend on the matrix used. Batch normalization transforms these distributions into bell curves with a fixed center and width. The sigmoid activation function, also known as the logistic function, then maps values from these standardized bell curves to values between 0 and 1. Another reason to use the sigmoid function (or any other non-linear function) is that it makes each layer of the neural network non-linear; otherwise the 8 matrices could be collapsed into a single matrix with exactly the same result, which would remove the benefit of having multiple layers. |
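The chain of reductions and re-expansions described above can be sketched as a single forward pass. This is a minimal illustration, not the implementation used in the article. The function names, the random weights, the batch size of 32, and the random input vectors are all assumptions made here, and training (the actual fitting of the matrices) is left out. Batch normalization is applied without learnable scale and shift parameters, which is also an assumption.

```python
import math
import random

# Four reductions followed by four re-expansions, as in the text.
LAYER_SIZES = [20, 15, 10, 5, 2, 5, 10, 15, 20]

def make_matrices(sizes, rng):
    """One (n_in x n_out) weight matrix per layer transition, random entries."""
    return [[[rng.gauss(0.0, 0.5) for _ in range(n_out)] for _ in range(n_in)]
            for n_in, n_out in zip(sizes, sizes[1:])]

def matmul(batch, matrix):
    """Multiply a batch of row vectors by an (n_in x n_out) matrix."""
    columns = list(zip(*matrix))
    return [[sum(x * w for x, w in zip(row, col)) for col in columns]
            for row in batch]

def batch_norm(batch, eps=1e-5):
    """Standardize every column over the batch to mean 0 and variance ~1."""
    cols = list(zip(*batch))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c) + eps)
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in batch]

def sigmoid(x):
    """Logistic function: maps any real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(batch, matrices):
    """Return the activations of every layer, input included."""
    activations = [batch]
    for matrix in matrices:
        z = batch_norm(matmul(activations[-1], matrix))
        activations.append([[sigmoid(v) for v in row] for row in z])
    return activations

def cross_entropy(original, reconstructed, eps=1e-12):
    """Mean per-vector binary cross-entropy; all values must lie in [0, 1]."""
    total = 0.0
    for row_o, row_r in zip(original, reconstructed):
        for o, r in zip(row_o, row_r):
            r = min(max(r, eps), 1.0 - eps)
            total -= o * math.log(r) + (1.0 - o) * math.log(1.0 - r)
    return total / len(original)

def random_frequency_vector(rng, n=20):
    """Hypothetical input: n sorted frequencies that sum to 1."""
    raw = [rng.random() for _ in range(n)]
    total = sum(raw)
    return sorted((v / total for v in raw), reverse=True)

rng = random.Random(0)
matrices = make_matrices(LAYER_SIZES, rng)
batch = [random_frequency_vector(rng) for _ in range(32)]
activations = forward(batch, matrices)
loss = cross_entropy(batch, activations[-1])

n_parameters = sum(len(m) * len(m[0]) for m in matrices)
print(n_parameters)            # 1020
print(len(activations[4][0]))  # 2 (width of the 2D bottleneck)
```

With these layer sizes the eight matrices hold 1020 weights, which is consistent with the ~1000 parameters mentioned above. Training would adjust those weights to lower `loss`; with untrained random weights the reconstruction is of course meaningless.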
|
|
Figure 21. At the end we have four reduction matrices and four re-expansion matrices. And, of course, they will never lead to the perfect re-expansion shown in this figure :-) |
|
|
Figure 22. In this section we discuss the results of the dimensionality reduction to two dimensions if the 20D input vectors are sorted by residue frequencies pi. |
|
|
Figure 24. This 2D-plot is the same as in the previous figure, but now the dots are coloured as a function of the sequence entropy. The result is clear. |
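For readers who want to reproduce the colouring, the sketch below computes a per-column sequence entropy. It assumes the Shannon entropy of the residue frequencies with a natural logarithm, and it ignores gaps and non-standard symbols; the article may use a different base or gap treatment. The function name and the example columns are made up for illustration.

```python
import math
from collections import Counter

def sequence_entropy(column, alphabet="ACDEFGHIKLMNPQRSTVWY"):
    """Shannon entropy (natural log) of the residue frequencies in one MSA column.

    Characters outside `alphabet` (gaps, X, ...) are ignored, which is an
    assumption of this sketch.
    """
    counts = Counter(c for c in column.upper() if c in alphabet)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("column contains no standard residues")
    return sum(-(n / total) * math.log(n / total) for n in counts.values())

conserved = sequence_entropy("AAAAAAAA")              # one residue type only
variable = sequence_entropy("ACDEFGHIKLMNPQRSTVWY")   # all 20 types, once each
reordered = sequence_entropy("YWVTSRQPNMLKIHGFEDCA")  # same residues, other order
```

A fully conserved column gives an entropy of 0 and a column with all 20 residue types at equal frequency gives the maximum, ln 20. Because the entropy does not change when the frequencies are reordered, it depends only on the sorted variability pattern and not on which residue types are involved.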
|
|
Figure 27. In experiment 2 we don't ask what the variability pattern does, but what the amino acids do in 2D. |
|
|
Figure 28. If we are not interested in variability patterns but in patterns of observed exchanges between residue types, we can use almost the same method: we simply don't sort the 20 pi frequencies. |
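The difference between the two kinds of input vector can be sketched as follows. The residue ordering, the descending sort direction, the function names, and the two example columns are assumptions made for illustration; the text only states that the frequencies are sorted in one experiment and left unsorted in the other.

```python
from collections import Counter

# Assumed fixed ordering of the 20 residue types (alphabetical one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def residue_frequencies(column):
    """Unsorted 20D vector: frequency of each residue type, in AMINO_ACIDS order."""
    counts = Counter(c for c in column.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("column contains no standard residues")
    return [counts[a] / total for a in AMINO_ACIDS]

def variability_pattern(column):
    """Sorted 20D vector: same frequencies, largest first, residue identity lost."""
    return sorted(residue_frequencies(column), reverse=True)

# Two made-up MSA columns with the same shape of variability
# but entirely different residue types.
acidic_column = "DDDDDDEEEN"
basic_column = "KKKKKKRRRQ"

same_pattern = variability_pattern(acidic_column) == variability_pattern(basic_column)
same_residues = residue_frequencies(acidic_column) == residue_frequencies(basic_column)
print(same_pattern, same_residues)  # True False
```

Sorting throws away which residue types occur, so the two columns become indistinguishable. Keeping the fixed residue order preserves that information, which is what lets the 2D map say something about the amino acids themselves.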
|
|
Figure 29. Everything then proceeds just as in the previous section: we again go 20->15, 15->10, 10->5, and 5->2, with reduction matrices that upon re-expansion keep us closest to what we started with. |
|
|
Figure 32. The positives cluster, and so do the negatives. His sits a bit in between, as it can be positive and negative (and neutral). |
|
|
Figure 34. The special ones (Cys, Pro, Gly) cluster. |
|
|
Figure 37. Asp, Asn, Glu, and Gln are all somewhat similar. The relations Asn->Asp and Gln->Glu, in which a neutral residue becomes negatively charged, indeed require the same vector in the 2D plot. |
|
|
Figure 38. And so do the relations Asn->Gln and Asp->Glu where in both cases the side chain gains one -CH2- group. |
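The charge relation and the -CH2- relation are two views of the same parallelogram. As a sketch, write v(X) for the 2D position of residue type X (a notation introduced here, not taken from the article) and suppose the two charge-changing displacement vectors were exactly equal, which real data will only approximate. Rearranging terms then gives the equality of the two chain-lengthening vectors:

```latex
\begin{align*}
v(\mathrm{Asp}) - v(\mathrm{Asn}) &= v(\mathrm{Glu}) - v(\mathrm{Gln}) \\
\Longleftrightarrow\quad
v(\mathrm{Gln}) - v(\mathrm{Asn}) &= v(\mathrm{Glu}) - v(\mathrm{Asp})
\end{align*}
```

In other words, under this idealization the four residues sit at the corners of a parallelogram, with one pair of sides corresponding to gaining a negative charge and the other pair to gaining one -CH2- group.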
|
|
Figure 39. And that brings us to the conclusions. |