The scientific question that we address in this tutorial is the following: Does the 3D-"structure" of the amino acid Lysine in aqueous solution depend on the protonation state of the amino group of its side chain? Before you start to work technically on the answer to the question, you may think a bit more about the question and its implications.
Question 48:
1) What is meant by the 3D-"structure" of Lys? Is this notion
reasonable here?
2. What answer to our scientific question would you expect?
3. Could you describe the answer in terms of the two distances dNN
(atom N to atom Nζ ),
and dCN (atom C to
atom Nζ )?
See the figure below.
4. Which factors could influence the answer? How about T, ion
concentration, force field, simulation length, ... ?
5. Do you expect the protonation to be also dynamic?
![]() |
Figure 45: Definition of two distances that can be used to characterize the structure of Lys. |
In this tutorial we will work with the molecular dynamics (MD) program suite gromacs. Gromacs is not a monolithic program, but a large collec- tion of programs for all steps of a computational experiment with MD, including preparation, running MD simulations, and analysis. We use it since it is a fast, versatile and free system of open source tools. It is well-documented and used by a large and lively community. If you are interested in using gromacs, you can access everything (program, docu- mentation) at http://www.gromacs.org. If you are using Linux, there is a good chance that gromacs is part of your Linux distribution and can be installed with a key-click. In this practical, gromacs should have been installed already. In particular, we assume that we are working with gro- macs 4.5 (other versions may require adaptations of the scripts). In this course we will also use YASARA for MD purposes, but not today because YASARA is already too automatic for educational purposes. You will notice that we will not use a graphical interface but control the experiment by shell scripts, i.e. series of commands that can be studied and changed transparently. Additionally, we can use tools such as xmgrace or R for plotting, and any molecular viewer that can read and render PDB files and measure atomic distances. Here we will use vmd as an example. By now you should have already generated MD trajectories of your system (be it Lys with protonated or de-protonated side chain) with gro- macs using one of the shell script runAll.sh in one of the directories protonated or deprotonated. There should be several output files from that MD run in the chosen directory. In the following we assume that you have opened a window shell and have changed to that directory. If that is the case and you type ls you should obtain the list of files, e.g. lys.pdb, solvated.gro, etc.
The following files in your directory protonated or deprotonated make up the input:
When your MD run is complete, you should find the two files plotEnergies.pdf and plotDistances.pdf in the chosen directory, and when you open them with a PDF-viewer, they should show meaningful plots. If the two PDF files are not present, the run has not (yet) finished. If they are present but unreadable or empty, the run has failed.
All steps of the computational experiment are listed in the files runAll.sh in directories protonated and deprotonated, respectively. Most of the steps essentially call a gromacs program with certain input files and op- tions, and generate from this input output files. For instance in file runAll in directory protonated you will find:
pdb2gmx -lys -f lys.pdb -o lys.gro <<EOF >& pdb2gmx.out 9 1 1 EOF |
The gromacs command pdb2gmx translates our initial structure in file lys.pdb into a slightly different format ("gro"-file) needed by gromacs. We have to provide to pdb2gmx also information about the force field we are intending to use ("9" means that we will use the 9th force field from a list of several force fields that are available), and about the water model we are intending to use (the first "1" tells gromacs that we would like to use the first water model from a list). Additionally, we ask with the flag -lys the program pdb2gmx to inquire from the user the protonation state of our Lys side chain; the last "1" says that the Lys should be protonated. If you call pdb2gmx not from a shell script but interactively, you will see these selection lists and have to manually type 9 and 1 and 1 on the keyboard. The <<EOF tells pdb2gmx do not wait for manual input, but take instead the data on the following lines in the script until EOF (=End Of File) occurs. Finally, >& pdb2gmx.out sends output of pdb2gmx, such as error messages or warnings, to the file pdb2gmx.out. We could thus check whether the command has finished successfully.
In the last part of runAll.sh you can see that the production trajectory and the energies are extracted from the respective files. These data are put into .xvg files that can be plotted with the xmgrace program. Here, we have instead used an R script to do all plotting automatically. The results are found in files plotEnergies.pdf and plotDistances.pdf. You should see something as in the next two figures.
Question 49:
What happens to the total energy in the course of the equilibration?
What could be the purpose of the equilibration?
What happens to the total energy in the course of the production run?
Compare your results with your colleagues from the protonated and
deprotonated groups! Where do the average distances dNN and dCN
lie in each group? You can find out by using xmgrace. Type:
xmgrace distCN.xvg
Then choose menu Edit -> Data sets. Then click on the shown name
of the data set so that it is selected. In this way you obtain a list of
values below in the striped area called Statistics.
![]() |
Figure 46: Energy traces (energy over time) from equilibration and produc- tion MDs. Note the differently scaled axes in the two graphs. |
![]() |
Figure 47: Traces of distances dNN and dCN from production run (top panels), and corresponding histograms (bottom panels). |
![]() |
Figure 48: Distance data loaded into xmgrace. The striped area shows statistical measures of the distance trace, e.g. the mean distance. |
In the last point of the previous task we have extracted data from our tra- jectories that can be used to address our initial scientific question: How does protonation affect the distances dNN and dCN? The figure below shows boxplots that summarize the data.
Question 50:
How would you interpret your data or the boxplots?
In which way should we improve our computational experiment(s)?
Do we have ions included in the computational experiments? If yes,
what would be the effect of removing these ions?
The trajectories can be analyzed numerically, but sometimes it is also in- structive to watch what is going on. A trajectory is nothing but a movie, and you can watch that movie with a viewer, such as vmd etc. Alternatively, the tool ngmx is a part of the gromacs suite and can also be used to visualize movies, albeit in a simpler fashion.
![]() |
Figure 50: This should be shown by vmd once you have loaded molecule and trajectory, and picked two atoms (here Nζ and C of Lys) to show their distance. |