• CLUSTAL

EU name: CLUSTA

(From: ../EUDIR ) (Date: Jan 27 17:59 ../EUDI)

After completing the "alignment" part of the Tools section you:
Know the definition of the FASTA format
Are capable of aligning two or more sequences

During this course we shall often align sequences. Sequence alignment is a tool to analyse what happened during evolution, but it is also a tool to see how information can be transferred from one molecule to another. During this course we will 'build' our own sequence alignment program, but that program will only align two sequences at a time. If you want to align many sequences at once, you need a special program. In this course we shall use MRS-align, but you can also use any CLUSTAL server you find on the internet (there are dozens of those).

How to use CLUSTAL or MRS-align

We will normally use MRS-align, but there are dozens of CLUSTAL servers on the internet. Occasionally we might want to use the EBI-CLUSTAL. Get help from the assistant when you need or want to use that version, because it has a few more bells and whistles.

Alignment software normally requires that you feed it sequences in the so-called FASTA-format:

Figure 38. Alignment sequence input example. (This window changes its appearance occasionally, and it often differs between software implementations, but in general it looks roughly like this...)
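For reference, a FASTA record is a header line that starts with '>' followed by one or more lines of sequence. The sketch below is only an illustration of that layout, not part of MRS-align or CLUSTAL: the function names `parse_fasta` and `format_fasta` and the two toy sequences are made up for this example.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines carry no information
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines of one record are simply concatenated.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def format_fasta(records, width=60):
    """Write (header, sequence) tuples as FASTA text, wrapping sequence lines."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


# Hypothetical toy input: two invented sequences, the first split over two lines.
example = """>seq1 hypothetical toy sequence
MKTAYIAKQR
QISFVKSHFS
>seq2 another toy sequence
MKTAYLAKQR
"""

records = parse_fasta(example)
for header, seq in records:
    print(header, len(seq))
```

Each sequence you paste into the alignment window needs its own '>' line, otherwise the software cannot tell where one sequence ends and the next one starts.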

The MRS alignment does not allow you to change any parameters (that is why we like it :-). Later, when you are doing your own research project, you might need to do an alignment with modified parameters. For now, the only thing you should not forget is to click the button!

Question 31: Get the three sequences for the first exercise (or get them from the 'Exercise files' listed under 'Miscellaneous') and put them in the right format in the alignment window. After a second you get the output. The diagnostics will be discussed at a later stage. The other options seem trivial and the output shows a poor alignment:

Figure 39. The stupid test alignment's output.

You see that these 'sequences' don't look very much like each other. Perhaps that is because Gert, Gert backwards, and Celia don't look very much like each other, or do you think there is another reason?

The alignment software will be used mostly near the end of this course, so we will not spend much time on it right now.

Question 32: Look up the serine protease with Swissprot-code BSSP4_HUMAN.
Which are the active site residues? Give their numbers and residue types.
Perform a BLAST search on Swissprot with this sequence.
Align the top 4 sequences from the BLAST output (hint: click the check-box for each sequence to be aligned, and after that click the bottom Align button).
Try to read the output. Observe for instance the conservation of the active site residues.
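CLUSTAL-style output marks columns in which all sequences have the same residue with a '*' underneath the alignment. If you want to check your reading of such output, the idea can be sketched in a few lines. This is a minimal illustration, assuming the sequences are already aligned and have equal length; the function name `conservation_line` and the three short toy sequences are invented for this example and are not real BLAST hits.

```python
def conservation_line(aligned):
    """Return a string with '*' under every column where all residues are identical."""
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*aligned):
        residues = set(column)
        # A column counts as conserved only if it holds one residue type and no gap.
        if len(residues) == 1 and "-" not in residues:
            marks.append("*")
        else:
            marks.append(" ")
    return "".join(marks)


# Hypothetical toy alignment ('-' is a gap); not taken from any real search.
toy = ["GDSGGP-LV",
       "GDSGGPALV",
       "GDSGGS-LI"]

for seq in toy:
    print(seq)
print(conservation_line(toy))
```

Active site residues normally sit in columns that carry such a mark, which is what you should look for in the alignment of the four BLAST hits.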

Answer