Exams
First some examples of the (in)famous amino acid test.
Here is the 2008 version with answers. And here the 2007 version with answers.
And here the 2009 version with answers.
And here the 2010 version, and the 2010 version with answers.
And here the 2014-2015 version, and the 2014-2015 version with answers.
The exam in 2005 (Vriend's part), with answers typed
- Give three examples of stabilising mutations that mainly influence entropy.
To influence entropy and be stabilizing, the mutation should reduce the
freedom of the unfolded chain (the folded chain is mainly immobile
already, anyway, so you cannot do much with the folded chain).
Examples are then Lys -> Arg because Arg contains the highly rigid
guanidinium group. Obviously, Lys -> Arg will only work when the Lys/Arg
are involved in some salt-bridge. If the Lys/Arg side chain is fully exposed
at the surface and highly mobile, then folding/unfolding will not be
very different for the two situations. Other examples are Gly -> Xxx,
or Xxx -> Pro. These mutations tend to work very well in practice. The
concept is that upon folding the main energy gain is the entropy
(and of course also the enthalpy) of water that obtains its freedom because
it is no longer facing hydrophobic residues. But... this comes at the cost
of losing the freedom/mobility of the protein chain. If you reduce
the freedom of the protein chain in the unfolded form (in the folded form the
protein has hardly any freedom/mobility, no matter the sequence) then you lose
less freedom upon folding. Obviously, there are more examples, e.g., if you
mutate a hydrophobic residue in the core so that it nicely fills a cavity
(Val -> Ile or also Gly -> are often used, and the latter even works
'double') then you gain a lot of entropy of water. You can also
insert an Asn or Asp at a place where it can form a hydrogen bond
with the own backbone (see also the Asp versus Glu question three questions
further down). See also the cysteine bridge question a few questions
further down.
- Give three examples of stabilising mutations that mainly influence enthalpy.
To influence the enthalpy, you need more interactions. Examples are the
introduction of an Asp, Glu, Lys, or Arg to form a salt bridge, or mutate a
Val -> Thr to form a hydrogen bond. More examples can be dreamt up
of course, like compensating the dipole moment of a helix, compensating
unbalanced charges at/near ions, etc. The summary is making an extra polar
interaction.
- What is an HSSP file, and why is it useful for stability engineering?
This is a multiple sequence alignment of all sequences in UniProt (and that
includes all sequences from the better known SwissProt) that BLAST picks up
and that are above the famous Sander and Schneider curve that tells you
when an alignment is significant, against the sequence of the PDB file. This
is useful for stability engineering because a) You can get ideas where
to make mutations (don't mutate a conserved residue because if it is
conserved it is important even if you don't know what it is important for);
b) It can make suggestions about what to mutate like introducing a
cysteine bridge; c) It can tell you what not to do. If you want to introduce
a proline somewhere, but there is no proline found in the alignment at that
position, then that proline might not be a good idea.
- Given two sequences: DCAGWLPYTXGE and DCAGWLGHAARTSPAFGLREWPYTXGE. Both are
equally thermostable. In both cases residue X can be mutated to become a
cysteine that bridges with the other Cys at position 2. Which peptide is
most stable after the mutation (neglect side effects like atomic clashes,
strain, etc) and why?
The second one because the introduction of a cysteine bridge improves
stability by reducing the freedom in the unfolded form. The further
the cysteines are away from each other in the sequence, the more entropy
is lost in the unfolded form that is then not lost upon folding.
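The argument can be made concrete by counting how many residues separate the two bridge partners in each peptide. The following is a minimal Python sketch; the helper name `bridge_separation` is hypothetical, and it assumes that X marks the residue to be mutated and that its partner is the Cys at position 2, as stated in the question.

```python
def bridge_separation(sequence: str, partner_position: int = 2, marker: str = "X") -> int:
    """Return the sequence separation between the marked residue and its Cys partner.

    Positions are 1-based, as in the exam question.
    """
    if sequence[partner_position - 1] != "C":
        raise ValueError("expected a Cys at the partner position")
    marker_position = sequence.index(marker) + 1
    return abs(marker_position - partner_position)


short_peptide = "DCAGWLPYTXGE"
long_peptide = "DCAGWLGHAARTSPAFGLREWPYTXGE"

for peptide in (short_peptide, long_peptide):
    # Print each peptide with the number of residues between Cys 2 and X.
    print(peptide, bridge_separation(peptide))
```

The second peptide has by far the larger separation, so its bridge closes the longer loop, which is the basis of the answer given above.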
- Why is a hydrogen bond with the own backbone better for Asp than for Glu?
In both cases you gain enthalpy of the interaction at the cost of losing
the entropic freedom of the side chain. Glu has more freedom than Asp,
so it has more to lose when getting fixed in an H-bond.
- What can we learn from structure comparisons?
The correct answer is "a lot". But at the exam, I would appreciate
if you would elaborate a bit. Some things you can learn are a) how conserved
is the position of a loop; b) Is a water bound in the active site always
there; c) Is the odd rotamer observed at a certain position conserved
and thus important, or the accidental result of crystal packing; d) Do all
ligands always make a contact with a certain (conserved) residue?
- Why do all good secondary structure prediction methods use multiple sequence
alignments as input?
Best explained by an example. If you see something
like DALWAMPKLLELMLQ then you think "Wow, what a beautiful helix, except
for that shitty Pro in the middle". If you now see that most homologs have
an equally beautiful helix pattern without that Pro, then you know that that
Pro is an exceptional Pro in an otherwise nice helix.
- What is the major problem when using force fields?
Several answers are correct here. One is determining what is the null-model,
i.e., how do you define the situation in which everything is random? This
often turns out to be an intellectual challenge. Another problem is that
using a force field, like for example in molecular dynamics, takes very, very
much CPU time to do it right. Sometimes there isn't enough data to properly
design the force field from. (Don't worry, in 2011 I would formulate
such a question a bit less 'open-ended'.)
- What types of motion are important for enzyme activity?
A whole lot. Almost in order of appearance in the plot: a) The protein
happily wobbles around with a bit of freedom here and there (most in the
side chains at the surface); b) Protein and substrate swim around due to
the Brownian motions (sometimes supported by gradients or even active
transport processes); c) The protein meets the substrate and this normally
leads to a reduction of freedom in the ligand, to water from the active site
pocket gaining entropy when it is replaced by the ligand, to protein active
site residues 'locking in' on the ligand, and sometimes to an induced fit
of residues and even loops in the vicinity of the active site; d) In some
enzymes whole domains move with respect to each other, often in a hinge
like motion that opens and closes the active site a bit. In the presence
of the ligand the closed form is frozen in; e) The motions at or near the
quantum level (vibrations, electrons not being at the most likely position,
etcetera) that always take place now become important as they make sure that
the enzymatic activity actually takes place; f) In most enzyme actions some
water is split, and some hydrogen and/or some electron moves from one place
to another (and another, and another...); g) after the action took place the
whole process goes in reverse, so the product(s) leave the active site, water
gets back in, induced fits are uninduced, domain motions start again, etc.
- Give two significantly different classification schemes for membrane proteins.
One scheme would be by function (transporters, receptors, defense proteins,
etc), but you can also use their structure: Classify proteins by their
transmembrane structure that either is a big circular sheet of strands,
or a bunch of helices. Both groups can then be sub-classified by the number
of strands in the sheet, and the number of helices in the bundle.
- Mention the two major classes of molecules that transfer information through
a membrane.
G protein-coupled receptors (or GPCRs), and receptor tyrosine kinases.
Especially the GPCRs are a very important target in the drug design
industry.
- Mention the five most important computational tools for a bioinformatician
(and what can you do with them)?
The clear winner is Google, especially in combination with the Wikipedia.
PubMed (where you get access to the literature) is important too.
Number 3 is BLAST. And after these first three, one can start defending
different options. Linux has once been answered (and defended) with success.
MRS, or, more general, database lookup software belongs high on the list,
as does molecular visualisation for which we used YASARA during the course.
Multiple sequence alignment software must be among your top-5 too.
- BRIEFLY describe all steps in homology modeling and mention the most serious
problems encountered at each step.
See the homology modelling seminar, sorry, I am not going to type the whole
story again.
- You get the freshly determined coordinates of an endo-glycanase. Mention
several possible ways in which a bioinformatician can determine the active
site (residues). (And which active site residues do you expect?)
I would first run BLAST against SwissProt and see if the sequence or any of
its close homologs has the active site annotated by the SwissProt experts.
But if that doesn't work, make a multiple sequence alignment and look for the
conserved residues. If there are too many conserved residues, see if the
structure is known or build a model. The conserved residues at the bottom
of the biggest surface dent are your best bet (because ligand binding
goes best if water is freed-up to gain entropy of water).
- What can you tell about this peptide:
MNNSAKALTRRGGALTLLAIVLLTLWAIVFMLLLIAFFGGSADA A proteomics experiment
indicates that this peptide is 79.9 daltons too heavy (i.e. phosphorylated).
Which residue holds this PO4 group (and why)?
The stretch ALTLLAIVLLTLWAIVFMLLLIA is a transmembrane helix. There are
three positive residues in KALTRR, so KALTRR is at the inside (cytosolic
side of the membrane) because of the positive-in rule. PO4 normally is
bound to a serine (albeit that it can also bind to Thr or Tyr). The only
two candidates are the serines in NNSAK and GGSAD. As phosphorylation
is a cytosolic process, the serine you are looking for is the first one.
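The reasoning in this answer (find the helix, apply the positive-in rule, keep only the serines on the cytosolic side) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prediction method: the helix segment is taken directly from the answer above rather than predicted, only K and R are counted as positive, and the function name `cytosolic_serines` is hypothetical.

```python
POSITIVE = set("KR")  # residues counted for the positive-in rule (assumption: K and R only)


def cytosolic_serines(sequence: str, helix: str) -> list[int]:
    """Return 1-based positions of serines on the flank that holds more K/R residues."""
    start = sequence.find(helix)
    if start < 0:
        raise ValueError("helix segment not found in sequence")
    end = start + len(helix)
    n_flank, c_flank = sequence[:start], sequence[end:]
    n_positive = sum(residue in POSITIVE for residue in n_flank)
    c_positive = sum(residue in POSITIVE for residue in c_flank)
    if n_positive == c_positive:
        raise ValueError("positive-in rule gives no clear answer")
    if n_positive > c_positive:
        # The N-terminal flank is taken to be cytosolic.
        return [i + 1 for i, residue in enumerate(n_flank) if residue == "S"]
    # Otherwise the C-terminal flank is taken to be cytosolic.
    return [end + i + 1 for i, residue in enumerate(c_flank) if residue == "S"]


peptide = "MNNSAKALTRRGGALTLLAIVLLTLWAIVFMLLLIAFFGGSADA"
helix = "ALTLLAIVLLTLWAIVFMLLLIA"
print(cytosolic_serines(peptide, helix))
```

For this peptide the only serine left is the one in NNSAK, in line with the answer above.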
- A weird bacterium lives a normal life at pH 4.8 in a lake with pH 7.2. This
bacterium uses the antibiotic peptide:
NNGLLLAILMLSLLLAAIVVLLGDGDGNPPP
to kill other bacteria that compete for its food. It stores these peptides
in a peptosome that, if need arises, quickly presents these peptides to a
transfer system that (one by one) brings the peptides across the membrane.
a) Guess how much energy is (minimally) needed per peptide transfer.
b) Describe in some detail how you would calculate this delta-G with a
molecular dynamics program.
In 2011 this is no longer an aspect in the course, but you should still know
that the rule of 10 exists (see the video on this topic in the video section).
If you pump a charge into a gradient, you pay 1 kcal/mol per pH unit
difference per charged residue (i.e. per charge that goes into the gradient;
charges that go with the gradient can pay back that energy).
The MD story was dropped from the curriculum (albeit that the scheikos
should remember from their version of bioinformatics 1...).
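The rule of thumb quoted above turns into a one-line estimate once you have decided how many charges go into the gradient. The sketch below only does that arithmetic; the function name `transfer_cost` is hypothetical, and the choice of two charges (the two Asp side chains in GDGDG) is an assumption made purely for illustration. Deciding which charges count is the actual exam question.

```python
KCAL_PER_PH_UNIT_PER_CHARGE = 1.0  # rule of thumb quoted in the answer above


def transfer_cost(ph_inside: float, ph_outside: float, charges_against_gradient: int) -> float:
    """Estimate the minimal transfer cost in kcal/mol from the rule of thumb."""
    delta_ph = abs(ph_outside - ph_inside)
    return KCAL_PER_PH_UNIT_PER_CHARGE * delta_ph * charges_against_gradient


# Illustration only: assumes exactly two charges are pumped against the gradient.
cost = transfer_cost(ph_inside=4.8, ph_outside=7.2, charges_against_gradient=2)
print(f"{cost:.1f} kcal/mol")
```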
This was Vriend's theoretical part of the exam in 2006.
Feel free to answer in Dutch, English or German. I can only give points for
an answer if I can read it. I can only overlook errors when I can read it
very easily. When in doubt, the shorter answer is virtually guaranteed better.
The amount of white space is indicative for the amount of words I think you
need for the answer. When an explanation of the answer is not explicitly
given, it seems wise to only give an explanation when you are not sure
about your answer.
- A threonine is buried deeply inside the protein protalionase. Its Oγ
doesn't make a hydrogen bond. Which is the best mutation for improving the
stability of protalionase? Briefly describe why.
- How do you make an antibody against the toxin of the Texan desert snake?
Describe which bioinformatics tools are needed in the process.
- Why do all good secondary structure prediction methods use multiple
sequence alignments as input?
- Why does a salt bridge care (much) less about the inter-atomic distance
than a Van der Waals interaction?
- What types of motion are important for enzyme activity?
- Which is the driving force to keep membranes intact?
- The bacterial extracellular paravilon receptor has a sequence that
starts with GKNRSKTLLLAILWYLSLLALIMLFFACWLLAINGDSDNG....
This is the major fragment that is always found in proteomics experiments.
Sometimes, in those proteomics experiments, this same fragment is found, but
about 80 Daltons too heavy. That must mean phosphorylation. But which
Serine is phosphorylated? Briefly explain why it is not any of the other two.
- The following sequence fragment was found to contain the active site of
Cyclomaltodextrin glucanase. Which are the (two most important) active site
residues? Explain briefly.
....LVGGNTSGDVTIKVESGNSPDLALRAALELAGGSNSEVTVEVTGDSGNRTK....
- Why are active sites always located at the bottom of a dent or cave?
- Why do many transmembrane helix prediction programs often predict
too short helices?
- Which are the preferred residues (side chains) to bind Zinc in a protein?
And which for Calcium?
- Why do small proteins often have more cysteine bridges than big proteins?
Think of ALL aspects of this problem (this problem might have more angles
and viewpoints than you might initially think).
- Your boss wants you to write a secondary structure prediction program. He
suggests you use the Chou and Fasman method (you know, the one that relies
on one parameter per amino acid per secondary structure type). How would
you proceed? Don't skip the details!
- Why are recognition sites in DNA often more AT-rich than CG-rich?
- Why is Asp a 'better' active site residue than Asn?
- Mention two ways to computationally detect cysteine bridges.
Gert Vriend's theoretical part of the exam in 2008
- What does the active site of the average Zn-protease look like, and how does
a Zn-protease work?
The Zn is normally bound by more than one histidine and occasionally also 'something
with an oxygen' (Glu, Asp, Tyr, etc). The Zn tends to be involved in activating the
water molecule (splitting it and stabilizing the 'half' waters).
- Mention several ways to find-out which cysteines are bridged in a protein.
1) Look it up in SwissProt; 2) Load its structure in Yasara and look for
bridges; 3) Build a homology model and proceed as in step two; 4) Do a multiple sequence
alignment and look for pairs of cysteines that show correlated behaviour (i.e.
that are present together and absent together); 5) Search the internet for cys-cys
bridge predictors; 6) I know it is dirty, labour intensive, smelling, and painful,
but you could of course try to measure it in the lab.
- A cysteine in the following peptide is myristoylated, and an asparagine that
is very far away from it in the sequence is glycosylated. Underline the myristoylated
cysteine and the glycosylated asparagine.
TLSNATCSGLWILAMVLLAMILSLAMVVLAMTRKAACGNATAQAG
The bit LWILAMVLLAMILSLAMVVLAM is a transmembrane helix. The RK after this helix are
positive charges and thus cytosolic. Glycosylation is an extracellular process, and
thus at the other side of the helix from RK. So, the N in SNA is glycosylated and
the C in ACG is myristoylated.
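The same bookkeeping can be done by a small script. The sketch below takes the helix from the answer above, uses the K/R count to decide which flank is cytosolic, and then lists the Asn candidates on the extracellular flank and the Cys candidates on the cytosolic flank. Two things are assumptions that do not come from this page: the N-x-S/T pattern (x not Pro) used to pick Asn candidates, and the hypothetical function name `assign_sites`.

```python
import re

# Assumption: N-glycosylation candidates follow the N-x-S/T pattern, with x not Pro.
SEQUON = re.compile(r"N(?=[^P][ST])")


def assign_sites(sequence: str, helix: str) -> dict[str, list[int]]:
    """Return 1-based candidate positions for glycosylation and myristoylation."""
    start = sequence.find(helix)
    if start < 0:
        raise ValueError("helix segment not found in sequence")
    end = start + len(helix)
    n_flank, c_flank = range(0, start), range(end, len(sequence))
    n_positive = sum(sequence[i] in "KR" for i in n_flank)
    c_positive = sum(sequence[i] in "KR" for i in c_flank)
    if n_positive == c_positive:
        raise ValueError("positive-in rule gives no clear answer")
    # The flank with more K/R is taken to be cytosolic, the other extracellular.
    cytosolic, extracellular = (n_flank, c_flank) if n_positive > c_positive else (c_flank, n_flank)
    sequon_starts = [match.start() for match in SEQUON.finditer(sequence)]
    return {
        "glycosylated_N": [i + 1 for i in sequon_starts if i in extracellular],
        "myristoylated_C": [i + 1 for i in cytosolic if sequence[i] == "C"],
    }


peptide = "TLSNATCSGLWILAMVLLAMILSLAMVVLAMTRKAACGNATAQAG"
helix = "LWILAMVLLAMILSLAMVVLAM"
print(assign_sites(peptide, helix))
```

Both NAT motifs match the pattern, but only the one before the helix sits on the extracellular side, and only the Cys in ACG sits on the cytosolic side.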
- If you want to stabilize a protein by the introduction of a hydrogen bond
you must make a much more precise prediction than when you want to stabilize a
protein by the introduction of a salt bridge. Why?
Hydrogen bonds require proper orbital overlap, so the geometry must be very precise, while
a salt bridge works with q1 * q2 / r in which
r is the distance between the charged groups q1 and q2.
A 1/r relation is much more fault tolerant than orbital overlap that deals with
fractions of Ångströms.
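To get a feeling for how forgiving the 1/r term is, the sketch below evaluates q1 * q2 / r for a pair of unit charges at an intended distance and at a distance that is half an Ångström off. The distances of 3.0 and 3.5 Å are arbitrary illustrative values, the conversion constant of about 332 kcal·Å/(mol·e²) is the one commonly used to express Coulomb energies in kcal/mol, and the optional `epsilon` argument is an assumption added here so the same function can be reused with a dielectric constant.

```python
COULOMB_CONSTANT = 332.06  # kcal*Angstrom/(mol*e^2); turns q1*q2/r into kcal/mol


def coulomb_energy(q1: float, q2: float, r: float, epsilon: float = 1.0) -> float:
    """Coulomb energy in kcal/mol for charges in units of e and r in Angstrom."""
    return COULOMB_CONSTANT * q1 * q2 / (epsilon * r)


intended = coulomb_energy(+1.0, -1.0, r=3.0)  # the distance you aimed for
misplaced = coulomb_energy(+1.0, -1.0, r=3.5)  # prediction off by 0.5 Angstrom

# Fraction of the interaction energy that survives the 0.5 Angstrom error.
print(f"fraction retained: {misplaced / intended:.2f}")
```

Because the energy scales as 1/r, the retained fraction is simply 3.0/3.5, so most of the interaction survives an error that would ruin a hydrogen bond.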
- Mention a few terms (formulas not really needed) that are used in the force
field of a molecular dynamics software package. And mention a few terms that
generally are not yet in use in such a force field.
In: Bond lengths, bond angles, torsion angles, VdW and charge-charge interactions;
Sometimes also in: H-bonds, planarities. Out: pi-pi stacking, induced polarities,
quantum related effects.
- A metabotropic glutamate receptor has seven transmembrane helices and a big
extra-cellular N-terminal domain of a few hundred amino acids. Which is the
dominant driving force that keeps that extra-cellular domain together? And
which is the dominant force to keep the transmembrane helices together?
The entropy of water and the entropy of lipids, respectively.
- The four cysteines in this protein are bridged 1-3 and 2-4. One of the
four cysteines must be mutated so that a free cysteine is left to which a label
can be attached. From a protein stability point of view, which cysteine bridge
will you destroy and which one will you leave intact? And Why?
LVGGNCSGDVTEVTVEVTGDSIKVESGNSPDLACRCALECAGGSNSGNRTK
.....1...........................2.3...4...........
I added the bar and the numbers as part of the answer. The stabilizing effect
of a cysteine bridge is bigger when the cysteines are further away from each other in
the sequence. So, don't touch the bridge 1-3.
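The sequence separations behind this answer are easy to compute. The sketch below locates the four cysteines and, as a rough illustration of "further apart means more entropy", also evaluates a textbook polymer-theory term in which the loop-closure entropy grows as (3/2) R ln n for a loop of n residues. That scaling is an assumption brought in here for illustration and is not taken from the course material; the function names are hypothetical.

```python
import math

GAS_CONSTANT = 1.987  # cal/(mol*K)


def cysteine_positions(sequence: str) -> list[int]:
    """Return the 1-based positions of all cysteines in the sequence."""
    return [i + 1 for i, residue in enumerate(sequence) if residue == "C"]


def loop_entropy_term(n_residues: int) -> float:
    """Length-dependent part of the loop-closure entropy, (3/2) R ln n, in cal/(mol*K)."""
    return 1.5 * GAS_CONSTANT * math.log(n_residues)


protein = "LVGGNCSGDVTEVTVEVTGDSIKVESGNSPDLACRCALECAGGSNSGNRTK"
cys1, cys2, cys3, cys4 = cysteine_positions(protein)

for name, (first, second) in {"1-3": (cys1, cys3), "2-4": (cys2, cys4)}.items():
    separation = second - first
    # Print the bridge, its sequence separation, and the illustrative entropy term.
    print(name, separation, round(loop_entropy_term(separation), 1))
```

Bridge 1-3 spans far more residues than bridge 2-4, so it is the one to leave intact.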
- Mention at least five (different) roles for water in living cells.
The entropy of water is the driving 'force' for many biological processes including
keeping proteins folded; Water is the solvent in which everything happens; Water is
involved in nearly all enzymatic reactions if not as a substrate or product, then as
a part of the catalysis; Water can be part of recognition like bridging hydrogen bonds
between protein and DNA; Water often also sits inside proteins to keep hydrogen-bonding
groups as satisfied as is possible.
- The protein blablase has the sequence:
L E A L M L G P V T I T V T I
1 2 3 4 5 6 7 8 9 0 a b c d e
H H H H H H L L S S S S S S S
Draw a fancy Ramachandran plot, and place the digits that I listed underneath
the sequence at plausible locations in the plot.
First predict its secondary structure (I added that as part of the answer...).
The Hs get crosses at around -50,-50 in the helix area. The Strands at around
-150,150, and the turn/loop GP gets two crosses in any of the three areas (Helix,
Strand, Left-handed helix). However the G cannot be helical, and the P not
strand.
- Why do we find more A and T in promoter binding sites than C and G?
In promoter regions DNA has to wind/unwind/open-up. AT has only two H-bonds, and thus
can more easily undergo structural changes than GC.
- The 20 normal amino acids together have in total 163 atoms. If we want
to make a very fancy force field to determine the all-atom contact energy, we
can do that by making for each of the 163*163 possible inter-atomic interactions
(actually we need to do roughly half of those, of course) a histogram in which
we represent how often two atoms were found at a distance between 2.5 and 2.6
Angstrom, how often they were found at a distance between 2.6 and 2.7 Angstrom,
etc.
If we really were to do this and count all the distances in 25000 proteins in
the PDB, what would the plots look like for the contacts between backbone
carbonyl oxygens and the O-gamma of serine? And what would the plot look like
for the contacts between The Alanine C-beta and the Methionine C-epsilon?
I want you to draw both distributions as curves (not as histograms, that is too
much work and becomes too unclear). Draw them both for the distance range 2-10
Angstrom in the same plot so that the lines (essentially) overlap at higher
distances. Draw the carbonyl-serine line solid and the alanine-methionine
line with a finely dashed line.
This is no longer part of the course in 2011.
Gert Vriend's practical part of the exam in 2008
(There will be no practical part in 2011)
Below two questions are listed. Make only one of the two.
1) Uracil-DNA Glycosylase
a) What is the function of this molecule?
b) Describe its structure?
c) Are you surprised by the location of the active site? And why, or why not?
d) Many Uracil-DNA Glycosylases contain an iron-sulfur cluster. What is the role of this cluster?
e) Describe and schematically draw the iron-sulfur cluster.
2) ZIF268 ZINC FINGER-DNA COMPLEX
a) What is the function of this molecule?
b) Why is the DNA recognition of this protein more specific than that of many other DNA binding proteins?
c) What is the role of the Zinc atoms in this molecule?
d) Mention at least ten protein-DNA contacts that ZIF268 'uses' to achieve specificity.
e) Describe and schematically draw the direct environment of the middle of the 3 Zn ions.
And here is the 2009 exam, with video answers.
Here one question from 2009
that is likely to be used again one day...
2011 version with answers.
2013 version with answers.
Some hints for the infamous amino acid test
This section is mainly meant to help the Scheikos who had the misfortune of
never doing the amino acid test in their life yet, but perhaps others might
like this page too.
The main concept is that:
If you understand the amino acids,
you understand everything.
First the infamous amino acid exam (that is guaranteed to be part of the amino
acid test for the scheikos):
If you want to know more about the amino acids, you might want to take a
look at the following twenty (short) videos that each tell you a few things
about one amino acid. These are the special characteristics (that is the
right-hand side column in the amino acid test):
Alanine
Cysteine
Aspartic acid
Glutamic acid
Phenylalanine
Glycine
Histidine
Isoleucine
Lysine
Leucine
Methionine
Asparagine
Proline
Glutamine
Arginine
Serine
Threonine
Valine
Tryptophan
Tyrosine
The other columns
Amino acid names
The names of the amino acids you will need to learn by heart, I am sorry,
I know how bad this feels, but trust me, it will help you a lot
during the rest of the course.
Amino acid size
The amino acids can be sorted by size as:
In this list we simply counted non-hydrogen atoms. Obviously, the S atoms are
much heavier than the C, O, and N atoms, and we don't count protons, but I think
this is precise enough to understand what you are looking at when you see a protein,
and intelligently looking at and computing on protein structures are the
main topics of the course. You can also look at the masses of the
residues, which will give a slightly different picture.
I personally prefer to remember the word LIND (coloured red in the
amino acid series GASCTVPLINDMQEKHRFYW) for the
intermediately large ones. As it is easy for a scheiko to at least remember
the approximate structures of the amino acids, it should be easy to remember
which amino acids are small and which are big.
Obviously, it would also be OK to use:
GASTVPCLINDMQEKHRFYW,
GASTVPCLINDMQEKHRFYW, or other variants. But the
solution GASTVPCLINDMQEKHRFYW, although correct, will
not give you many points.
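If you want to check an ordering yourself, counting non-hydrogen atoms is simple to script. The sketch below uses side-chain heavy-atom counts taken from the standard covalent structures of the amino acids (these numbers are supplied here and are not listed on this page) and groups the series GASCTVPLINDMQEKHRFYW by that count. The function name `size_classes` is hypothetical.

```python
from itertools import groupby

# Side-chain non-hydrogen atom counts from the standard amino acid structures.
SIDE_CHAIN_HEAVY_ATOMS = {
    "G": 0, "A": 1, "S": 2, "C": 2, "T": 3, "V": 3, "P": 3,
    "L": 4, "I": 4, "N": 4, "D": 4, "M": 4, "Q": 5, "E": 5, "K": 5,
    "H": 6, "R": 7, "F": 7, "Y": 8, "W": 10,
}


def size_classes(order: str) -> list[tuple[int, str]]:
    """Group consecutive residues in the given order by side-chain heavy-atom count."""
    return [
        (count, "".join(group))
        for count, group in groupby(order, key=SIDE_CHAIN_HEAVY_ATOMS.get)
    ]


for count, residues in size_classes("GASCTVPLINDMQEKHRFYW"):
    # Print one line per size class, smallest first.
    print(count, residues)
```

Residues with the same count can be swapped freely, and LIND (together with M) ends up in the middle class of four side-chain atoms.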
Amino acid hydrophobicity
Hydrophobicity is a difficult concept, related to the entropy of water.
But for the amino acid test you need to remember that FCVIPWALM are hydrophobic, YST have
intermediate hydrophobicity, and the rest (DENQHRK) are hydrophilic. The only exception is G that has no
side chain, and thus has a hard to define hydrophobicity (undetermined).
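These four classes are easy to put in a lookup table, which is handy when you want to see at a glance how hydrophobic a stretch of sequence is. A minimal sketch follows; the function name `hydrophobic_fraction` is hypothetical, and the two example fragments are taken from the 2005 exam question elsewhere on this page.

```python
# Classes exactly as listed in the text above.
HYDROPHOBICITY_CLASS = {}
for residues, label in (
    ("FCVIPWALM", "hydrophobic"),
    ("YST", "intermediate"),
    ("DENQHRK", "hydrophilic"),
    ("G", "undetermined"),
):
    for residue in residues:
        HYDROPHOBICITY_CLASS[residue] = label


def hydrophobic_fraction(sequence: str) -> float:
    """Fraction of residues in the sequence that fall in the hydrophobic class."""
    hydrophobic = sum(HYDROPHOBICITY_CLASS[residue] == "hydrophobic" for residue in sequence)
    return hydrophobic / len(sequence)


helix = "ALTLLAIVLLTLWAIVFMLLLIA"  # transmembrane helix from the 2005 exam
flank = "MNNSAKALTRRGG"            # the stretch that precedes it

print(f"helix: {hydrophobic_fraction(helix):.2f}")
print(f"flank: {hydrophobic_fraction(flank):.2f}")
```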
Amino acid charge
The charges of the negative residues are known once you know their names (aspartic and glutamic ACID).
The other three you will have to learn by heart: RK are always positive, and H can be positive, neutral,
and negative. At physiological pH histidine is normally neutral (90%), often positively charged (9%),
and occasionally negatively charged (1%) in your body.
Amino acid secondary structure preference
There exists an audio seminar on this topic. But you can also try to remember
Helix : AMELK
Strand: VITWYF
Bturn : PSDNG
And those are pronounceable words...
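To practise with the three words, you can write the preferred class under each residue of a sequence. The sketch below does only that lookup; it is not a secondary structure prediction method. The one-letter codes H, S, and T for helix, strand, and turn, the dash for residues that appear in none of the three words, and the function name `preference_string` are choices made here. The example sequence is the helix from the 2005 exam answer elsewhere on this page.

```python
# Preferences exactly as in the three words above: H = helix, S = strand, T = turn.
PREFERENCE = {
    **dict.fromkeys("AMELK", "H"),
    **dict.fromkeys("VITWYF", "S"),
    **dict.fromkeys("PSDNG", "T"),
}


def preference_string(sequence: str) -> str:
    """Return one preference letter per residue, '-' when no preference is listed."""
    return "".join(PREFERENCE.get(residue, "-") for residue in sequence)


sequence = "DALWAMPKLLELMLQ"
print(sequence)
print(preference_string(sequence))
```

The Pro in the middle shows up as a turn-preferring residue in an otherwise helix-preferring stretch, which is exactly the situation discussed in the multiple sequence alignment question.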
Closing remarks
Please be aware that your biochemistry books were written by non-bioinformaticians. I know there are
many books going around at the science faculty that list, for example, cysteine as hydrophilic. And
that is wrong.
And if you are interested, feel free to poke around at:
Amino acid background material
The table below lists 5 items. You should now briefly look at this material, and study it either
at your leisure at home, or when you need it while going through the questions later today or in
the coming weeks. The five items are:
- Once more the amino acids. This is just one picture with the
covalent structures of the amino acids shown. You can use this picture if you forgot to take the
orange NBIC amino acid sheet.
- Physico-chemical properties. This is
a pointer to a long, long list of tables with amino acid properties.
In this physico-chemical properties section you can for now
skip most parts, but you should look at the parts about solvent
accessibility (1.5), chemical classification (3.1), hydrophobicity scales
(3.2), and the genetic code (5.1). The other sections will be discussed later.
- The bedtime story about the amino acids
explains the functional characteristics of the amino acids. This part overlaps with the seminar, but here
you get it explained by somebody else, and sometimes that helps clarify things.
- Information for the real fanatics
The information for the real fanatics forms nice reading, but does not
contain material required to do well in the examination.
- The infamous amino acid test is listed so that you can print
it when needed/desired.
Some old exams
Putative exam questions
This is a list of questions that I can imagine to end up in future exams...
Mutations and stability
- Give me five very different reasons why somebody might want to mutate a protein.
- When we want to make a protein more stable, we can use the concepts entropic
stabilisation and enthalpic stabilisation. a) Explain what is meant with these
two terms. b) What are the differences, and what do they have in common?
- I want to make my protein more stable, so I decide to mutate a very exposed isoleucine into
an aspartic acid, to make the surface of my protein more hydrophilic. Explain why this
is a stupid plan.
- What is helix capping? And how can it be used to make a protein more stable by mutagenesis?
- When working on increasing the stability of my protein by mutagenesis, I often consult a table
that holds all the (backbone) torsion angles. Why?
- What is a rotamer? And what makes that certain residues in certain secondary structures have
only a limited number of rotamers available to them?
- The stability of a protein normally is defined as the ΔG of the U<-->F process in which
U stands for fully unfolded and F stands for folded protein. For industrial protein applications
this often is not a good definition. Why not?
- If I want to make a protein more stable I can try to make mutations that add extra hydrogen bonds. I can also
try to make mutations that add more saltbridges, and that is easier than doing it by making
hydrogen bonds. Why?
DNA and RNA
- DNA normally has a major groove and a minor groove. What is the difference between those grooves?
- Describe some protein-DNA interactions (in detail, including the names of the atoms involved) that
can contribute to protein - DNA binding specificity.
- When transcription factors bind to DNA, they need to do that with some specificity. Explain, for example
for the TATA-box binding protein, how this specificity is achieved.
- Which atoms are involved in H-bonding in Watson Crick base pairing?
- A protein specifically recognizes the human DNA sequence ACCAC (and it counter strand TGGTG). a) At how many
places can this protein bind to the human DNA? b) This protein nevertheless binds the DNA rather specifically.
How is that possible?
Amino acids structure and function
- Aquaporins will let water through in two directions. a) What determines in which direction the water
will go through? b) How come aquaporin lets water through at a high speed, and potassium and sodium, for example,
almost not at all. c) Part of the selectivity of aquaporin is caused by interaction with the water dipole; which
residue (type) has the crucial role in this dipole interaction? d) Ps, what is a dipole? e) Draw the dipole of water
and draw the dipole of this one special residue.
- a) Why do we normally see two aspartates involved in binding Ca2+, and ten times less often two glutamates? b) How often
do you expect to see a Ca2+ bound by two glutamates? c) Why? (Ps, obviously there are more atoms involved in
the binding, but I only look at the negatively charged ones for this question).
- Nature often wants to temporarily store an electron on a metal ion. That doesn't just happen like that. Proteins
that hold the metal ions on which the electron sits for a while need to do something to compensate. Which
tricks do the proteins have up their sleeve to accommodate electron storage?
- Why does nature often use copper or iron ions to 'store' electrons in electron transport processes, and why not sodium
or calcium?
Protein details
- What is the difference between a bond angle and a torsion angle?
- Why do we have the term n in the energy term for torsion angles V = K*(1+cos(n*φ-φ0)),
and which values can it have when we do MD on a normal protein?
- In the Ramachandran plot for all aspartic acids in the PDB I see more crosses outside the contour lines than
in the Ramachandran plot for all glutamic acids. Why?
- Draw the backbone of gly-gly-gly. Indicate φ, ψ, and ω, and the atoms involved in each of them.
- Why is a saltbridge between residue 17 and 43 more important for stability than a saltbridge between
residues 23 and 34?
- What formula is used in molecular dynamics and energy minimisation calculations
to calculate the energy contribution of a saltbridge to the total energy of the molecule?
- Why does the contribution of a saltbridge to the stability of the protein get less when we
dissolve the protein in 1 M NaCl?
- The formula for the electrostatic term in molecular dynamics and energy calculation software
contains some ε (epsilon) terms.
What do these terms stand for? If I add salt to the protein (in the simulation), will that
make saltbridges stronger or weaker? And what should happen to the ε term(s) to account
for the salt? And does this all make sense in terms of what you know from the physical aspects of things?
- If an enzyme cleaves a dipeptide, many things are mobile and many things move. Can you describe the whole
cleavage process mentioning every type of motion along the path?
Interactions
- Give me a series of very different roles for water in a cell.
- What are the roles of sugars in a cell?
- Why do we know very much about the structure(s) of proteins, much less about the structures of RNA
molecules, and barely anything about the structures of sugars and membranes?
- When I see a sodium ion in a PDB file, I get worried that it might not be a sodium at all. Why do I get worried?
- When I see an Rh, Pd, Ta, or Os ion in a PDB file I get worried. Why?
- Which trick do ion-channels apply to make sure they only let through their own ion, and not just anything?
- Why are many potassium channels actually tetrameric molecules?
- Why do we know sodium pumps and potassium pumps, but not water pumps? We need to transfer not only sodium
and potassium over membranes, but also water, don't we?
- I see electron density for an ion that is bound by Asp 17 Oδ1, Ala 123 C=O, Gly 64 C=O, and
three waters. I see density for another ion that
is bound by His 34 Nδ1, His 38 Nε2, Ser 111 Oγ, Glu 117 Oε2, and one
water. And I see density for a third
ion that is bound by Asp 8 Oδ1, three waters, Thr 14 Oγ, and Glu 22 Oε1.
I know from proteomics measurements that the ions must be sodium, magnesium, and zinc. Which of the three
sites holds which of the three ions?
- If I see an ion in an active site, it normally is a Zn, Fe, or Cu, and essentially never a Na, K, or Ca. Why?
Protein structure and validation
- How do X-ray crystallography and NMR work? What kind of data is collected? How is that data converted
into structure coordinates?
- What are the advantages and disadvantages of NMR and X-ray when determining protein structures?
- When we use either NMR or X-ray to determine the structure of a protein, we make different types
of experimental errors (both different types of systematic errors and different types of random errors). Why?
And which are the more significant errors that differ radically between the two methods?
- What is an R-factor? And what is a B-factor?
- The WHAT_CHECK software checks if ions are correct or actually should be another ion. Why? I.e. how come
something as important as an ion can be measured incorrectly by X-ray or NMR?
- Several structure validation programs provide lists of residues that should be flipped. What is meant by a flip,
and how come such errors are made when structure coordinates are solved experimentally?
- What is a helix dipole? Is a helix dipole good for the stability of the folded protein or bad? What does nature
use the helix dipole for?
- What is helix capping?
Bioinformatics techniques
(In this section numbers do not need to be correct, but they must be plausible).
- Mention five different computational techniques that people have used over the years to predict the
secondary structure of proteins.
- What is the positive-inside rule? And when does a bioinformatician use it?
- Which forces are commonly applied in molecular dynamics software?
- What is the 'time step' in molecular dynamics? And what is a good value for the time step?
- What is the average speed of a Leucine Cδ atom in a protein (in m/sec)? So, how far does this atom
move in one MD time step?
- How does molecular dynamics work? Explain what the computer program does per time step, and explain what happens
from time step to time step.
- What is the definition of a force field?
- How would you make a force field to predict transmembrane helices in protein sequences? (If the answer does not
contain the words 'null-model' and 'validation or calibration of the method' it will be wrong...).
- If I want to predict the secondary structure of water soluble proteins, I normally use a machine
learning technique called an Artificial Neural Network. If I want to predict whether a helix is transmembrane or
not, I prefer using a Support Vector Machine. Why do I use such different machine learning techniques
for two seemingly similar problems?
- Try to think of three biological questions that require a machine learning based approach to get them answered.
- What is the difference between a Decision Tree and a Random Forest?
- How would you predict the secondary structure of proteins with an Artificial Neural Network? I.e. what
would the input look like? What would the neural network look like? How would you train the neural network?
How would you test the neural network?
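For the two time-step questions above, a plausible number can be obtained with a few lines of arithmetic. The sketch below is only a back-of-the-envelope estimate: it assumes that the instantaneous speed of the atom follows from equipartition (v = sqrt(3kT/m)), that the temperature is 300 K, and that the atom is a bare carbon of 12.011 u. The function names and the 1 fs time step are illustrative choices, not something prescribed by the course.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant in J/K
AMU = 1.66053907e-27  # atomic mass unit in kg


def rms_speed(mass_u, temperature_k=300.0):
    """Root-mean-square thermal speed in m/s, from equipartition (assumption)."""
    return math.sqrt(3.0 * K_B * temperature_k / (mass_u * AMU))


def distance_per_step(speed_m_s, time_step_fs):
    """Distance in Angstrom covered in one time step at constant speed."""
    seconds = time_step_fs * 1e-15
    return speed_m_s * seconds * 1e10  # metres -> Angstrom


# Assumed inputs: a carbon atom (12.011 u), 300 K, and a 1 fs time step.
speed = rms_speed(12.011)
step = distance_per_step(speed, 1.0)
print(f"speed: {speed:.0f} m/s, distance per 1 fs step: {step:.4f} Angstrom")
```

Under these assumptions the speed comes out at several hundred m/s, and the distance per step is around a hundredth of an Ångström, i.e. small compared to a bond length.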
Other questions
- A protein's structure and function are often causally related. Can you think of
three very different examples that illustrate such a causal relation?
- a) What is the average distance between the active site of one thermolysin and
the nearest surface-exposed loop in a neighbouring thermolysin in a 10 Molar solution?
b) How fast does thermolysin swim in water at room temperature?
c) How often do two thermolysins meet in that 10 Molar solution?
d) So why does it still take hours for thermolysin to cannibalistically clear itself from the solution?
e) Does that make sense evolutionarily?
- In the active site of enzymes I often find a serine but hardly ever a threonine. Why?
- If I make one Ramachandran plot for 50 proteins, but I only show one residue type, then it looks as if
the Ramachandran plot for all the aspartic acids is a bit 'worse' (i.e. more residues fall outside the
contour lines for good phi-psi combinations) than the plot for all glutamic acids. Why?
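For the thermolysin distance question above, the first step is to convert a concentration into an average spacing between molecules. The sketch below assumes the molecules sit on a simple cubic grid, so the centre-to-centre spacing is the cube root of the volume available per molecule; the size of the protein itself still has to be subtracted to get an active-site-to-loop distance. The function name and the example concentrations are illustrative assumptions.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def mean_spacing_nm(concentration_molar):
    """Average centre-to-centre distance in nm for a given molarity.

    Assumes one molecule per cube of volume 1/n (simple cubic grid).
    """
    per_m3 = concentration_molar * AVOGADRO * 1000.0  # mol/L -> molecules/m^3
    volume_per_molecule = 1.0 / per_m3                # m^3
    return volume_per_molecule ** (1.0 / 3.0) * 1e9   # m -> nm


# Illustrative concentrations only; plug in the one from the question.
for conc in (1.0, 1e-3, 1e-6):
    print(f"{conc:g} M -> {mean_spacing_nm(conc):.1f} nm")
```

Because of the cube root, every factor of 1000 in concentration changes the spacing by only a factor of 10.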
Non-exam questions that you should be able to answer anyway...
How many virus particles flow under the Waal-bridge per day? And how big is the total volume of those particles?
Despite all these viruses, we can swim in the Waal in the summer without (normally) getting sick. Why?