Output explanation

The servers use WHAT IF, and thus you get WHAT IF-like output. For regular WHAT IF users that output makes sense, but that might not be the case for you. We therefore explain here some of the output formats that are often used.

  1. Atomic information.
  2. Atomic interactions.
  3. PDB file content.
  4. Residue numbers.
  5. Atom numbers/names.
  6. Residue number input.
  7. Output in TeX format.

Explanation for the atomic output per residue

The WHAT IF option for displaying all possible information is called LISTA. A typical LISTA output is given below.

The first line gives the information for one residue. Prp is the residue property value; this value is zero in the output of most servers. A few servers calculate a value for every residue, and that result is stored in this so-called residue property value.

The second line is just a header. Between these first two lines more information can be given in case this residue is a member of a family or a cluster (and if this server uses families or clusters), or in case WHAT IF has corrected or mutated this residue.

Residue:    37 ASP  (  37 ) E     (Prp= 0.00)
Atom    X     Y     Z   Acc   B   WT   VdW  Colr   AtOK  Val
 N    18.2  59.6  -5.1  0.0 16.7  1.0  1.7  340     +    0.00
 CA   17.0  58.8  -5.2  1.7 16.0  1.0  1.8  240     +    0.00
 C    16.9  57.7  -4.1  1.6 23.4  1.0  1.8  240     +    0.00
 O    16.1  56.9  -4.2  2.7 19.6  1.0  1.4  120     +    0.00
 CB   16.8  58.2  -6.6  3.5 16.8  1.0  1.8  240     +    0.00
 CG   16.6  59.3  -7.6  4.0 43.8  1.0  1.8  240     +    0.00
 OD1  16.0  60.4  -7.1  7.6 41.3  1.0  1.4  120     +    0.00
 OD2  17.0  59.2  -8.7  6.0 42.4  1.0  1.4  120     +    0.00
  *1    *2    *2    *2   *3   *4   *5   *6   *7    *8      *9

The last line (the one with *1 *2 etc., on it) is not part of the output but added here to guide you to the column by column explanation given below.

  1. The atom names.
  2. The coordinates in Ångström.
  3. The accessible molecular surface area. If only zeros are listed, the atoms are either buried or the accessibility has not been calculated yet; which of the two applies depends on the server you used.
  4. The crystallographic B-factor. A value above 60 means that this atom is surely not at the listed position.
  5. Weight. This is almost always 1.0. If 0.0 the coordinates were modeled. If between 0.0 and 1.0, alternative conformations have been observed.
  6. The Van der Waals radius for this atom, using the WHAT IF defaults: C: 1.8 Ångström; O: 1.4 Ångström; N: 1.7 Ångström; S: 2.0 Ångström.
  7. The colour for this atom. (Only used by servers that also produce graphics output).
  8. Is-atom-OK flag. Atoms that are wrong (or missing) according to WHAT IF get a minus in this column.
  9. The atomic value. Several servers calculate values for each atom. Those values are displayed in this so-called atomic value column.
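
If you want to process LISTA output with a script, the column layout above can be read with a few lines of Python. The following is only a sketch: the function name parse_lista_atoms and the dictionary keys are made up for this example, and it assumes that the columns are always separated by white space, which may not hold for very large coordinate values.

```python
# Hypothetical helper; WHAT IF itself does not provide this function.
LISTA_COLUMNS = ("x", "y", "z", "acc", "b", "wt", "vdw")


def parse_lista_atoms(text):
    """Return one dict per atom line found in a LISTA residue block."""
    atoms = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 11:
            continue  # residue line, blank line, or other extra line
        try:
            numbers = [float(field) for field in fields[1:8]]
            colour = int(fields[8])
            value = float(fields[10])
        except ValueError:
            continue  # the "Atom X Y Z ..." header line
        atom = dict(zip(LISTA_COLUMNS, numbers))
        atom.update(name=fields[0], colour=colour,
                    ok=(fields[9] == "+"), value=value)
        atoms.append(atom)
    return atoms


SAMPLE = """\
Residue:    37 ASP  (  37 ) E     (Prp= 0.00)
Atom    X     Y     Z   Acc   B   WT   VdW  Colr   AtOK  Val
 N    18.2  59.6  -5.1  0.0 16.7  1.0  1.7  340     +    0.00
 CA   17.0  58.8  -5.2  1.7 16.0  1.0  1.8  240     +    0.00
 OD2  17.0  59.2  -8.7  6.0 42.4  1.0  1.4  120     +    0.00
"""

for atom in parse_lista_atoms(SAMPLE):
    print(atom["name"], atom["acc"], atom["b"], atom["ok"])
```

The residue line and the header line are skipped because they do not contain eleven fields that convert to numbers in the expected positions.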

Sometimes some columns are added to the type of output described above. For example, the vacuum accessibility server produces output like:

Residue:    46 ASN  (  46 )       (Prp= 0.00)
 Phi=-112.9 Psi= 162.3 Omega= 178.4
Atom    X     Y     Z   Acc   B   WT   VdW Colr   OK  Use  Vac.   %
  N    14.0   6.5  13.7  0.0  5.8  1.0  1.7 340    +   -   0.8   0.0
  CA   13.5   5.4  12.9  1.0  6.2  1.0  1.8 240    +   -   1.2  85.7
  C    13.3   5.9  11.5  2.3  6.6  1.0  1.8 240    +   -  12.9  17.6
  O    13.7   6.9  11.0  0.7  7.2  1.0  1.4 120    +   -   8.8   8.4
  CB   12.3   4.8  13.5  0.5  7.3  1.0  1.8 240    +   -   9.4   5.6
  CG   12.5   4.3  14.9  0.0  8.0  1.0  1.8 240    +   -   2.4   0.0
  OD1  12.0   4.8  15.9  3.4 11.0  1.0  1.4 120    +   -   8.9  38.1
  ND2  13.4   3.3  15.0  9.7 10.3  1.0  1.7 340    +   -  15.4  62.6
                        17.6                              59.9  29.4

In such cases the extra output is straightforward. Here the two rightmost columns give the accessibility in vacuum and the ratio between normal and vacuum accessibility as a percentage. The extra numbers at the bottom are residue-wide summaries.
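
The residue-wide summary can be reproduced from the per-atom columns. The sketch below assumes that the percentage is 100 times Acc divided by Vac., which agrees with the summary line of the example (17.6 / 59.9 gives 29.4). The function name is made up for this example. Because the listed values are rounded to one decimal, the numbers you recompute can differ slightly from the ones the server prints.

```python
# Acc and Vac. columns for residue 46 ASN, copied from the example output.
ACC = [0.0, 1.0, 2.3, 0.7, 0.5, 0.0, 3.4, 9.7]
VAC = [0.8, 1.2, 12.9, 8.8, 9.4, 2.4, 8.9, 15.4]


def accessibility_percentage(acc, vac):
    """Normal accessibility as a percentage of the vacuum accessibility."""
    # Assumption: a vacuum accessibility of zero is reported as 0 percent.
    return 100.0 * acc / vac if vac > 0.0 else 0.0


total_acc = sum(ACC)
total_vac = sum(VAC)
residue_percentage = accessibility_percentage(total_acc, total_vac)
print(f"{total_acc:.1f} {total_vac:.1f} {residue_percentage:.1f}")
```

The rounded Vac. values add up to 59.8 rather than the 59.9 in the example, presumably because the server sums the unrounded values.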

Explanation for the atomic interaction output

Often WHAT IF produces output for contact events at the atomic level. Contact analysis is the most trivial example, but also hydrogen bond calculations and several error detection options do this. Such output typically looks like:

.....
   7    1 THR  (   1  ) A     N   <>  35 ILE  (  35  ) A     N    D=  4.36  H-ene=  -- 0  Sym=1 (B-B)
   8    1 THR  (   1  ) A     N   <>  35 ILE  (  35  ) A     C    D=  4.09  H-ene=  -- 0  Sym=1 (B-B)
   9    1 THR  (   1  ) A     N   <>  35 ILE  (  35  ) A     O    D=  2.89  H-ene=  0.66  Sym=1 (B-B)
  10    1 THR  (   1  ) A     CA  <>  35 ILE  (  35  ) A     O    D=  3.93  H-ene=  -- 0  Sym=3 (B-B)
  11    1 THR  (   1  ) A     C   <>  35 ILE  (  35  ) A     N    D=  4.08  H-ene=  -- 0  Sym=4 (B-B)
.....
  *1   *2  *3    *4    *5     *6  *7  *8  *9    *10   *11   *12     *13          *14       *15   *16

The last line (the one with *1 *2 etc., on it) is not part of the output but added here to guide you to the column by column explanation given below.

  1. Number of the contact, simply runs from 1 to N.
  2. Residue number of first residue partner in the contact event.
  3. Residue type of first residue partner in the contact event.
  4. The number in brackets is the residue number found in the PDB file.
  5. Chain identifier of first residue partner in the contact event.
  6. The atom of the first residue partner that makes the actual contact.
  7. The pair of arrows (<>) indicates a contact...
  8. Residue number of second residue partner in the contact event.
  9. Residue type of second residue partner in the contact event.
  10. The number in brackets is the residue number found in the PDB file.
  11. Chain identifier of second residue partner in the contact event.
  12. The atom of the second residue partner that makes the actual contact.
  13. The distance between the centers of the two atoms that make the contact. So even if you ask for contacts where the contact distance is, for example, < 1.0 Ångström, you still get the distances listed between the atom centers.
  14. If the contact is also a hydrogen bond, then the hydrogen bond "energy" is listed. This "energy" can range from 0.01 for a very poor hydrogen bond to 1.0 for a perfect hydrogen bond.
  15. A few servers use symmetry contacts in the crystal too. If so, the matrix used to arrive at this contact is listed in this column. All matrices are listed at the bottom of the output for those servers where symmetry is switched on.
  16. Indicator of the type of contact. B stands for residue backbone; S for residue side chain; C for carbohydrate or sugar; W for water; D for ligand, drug, or ion. DNA, RNA, and amino acids count as residues in this option. A period is used for atoms that don't fall in any of the aforementioned categories.
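
The sixteen columns can be picked apart with a regular expression. This is a sketch under a few assumptions: the names CONTACT and parse_contact are made up for this example, both partners are assumed to have a one-character chain identifier, and the H-ene column is assumed to hold either a number or the '-- 0' placeholder seen in the example. Lines that do not fit return None.

```python
import re

# Hypothetical pattern; group names follow the column explanation above.
CONTACT = re.compile(
    r"^\s*(?P<contact>\d+)\s+"
    r"(?P<seq1>\d+)\s+(?P<type1>\S+)\s+\(\s*(?P<pdb1>[^\s)]+)\s*\)\s+"
    r"(?P<chain1>\S)\s+(?P<atom1>\S+)\s+<>\s+"
    r"(?P<seq2>\d+)\s+(?P<type2>\S+)\s+\(\s*(?P<pdb2>[^\s)]+)\s*\)\s+"
    r"(?P<chain2>\S)\s+(?P<atom2>\S+)\s+"
    r"D=\s*(?P<dist>\d+\.\d+)\s+H-ene=\s*(?P<hene>--\s*0|\d+\.\d+)\s+"
    r"Sym=\s*(?P<sym>\d+)\s+\((?P<kind>.-.)\)"
)


def parse_contact(line):
    """Parse one contact line; return a dict, or None if it does not fit."""
    match = CONTACT.match(line)
    if match is None:
        return None
    record = match.groupdict()
    for key in ("contact", "seq1", "seq2", "sym"):
        record[key] = int(record[key])
    record["dist"] = float(record["dist"])
    hene = record.pop("hene")
    # No hydrogen bond: keep None instead of inventing an energy.
    record["h_energy"] = None if hene.startswith("--") else float(hene)
    return record


LINES = [
    "   8    1 THR  (   1  ) A     N   <>  35 ILE  (  35  ) A     C    D=  4.09  H-ene=  -- 0  Sym=1 (B-B)",
    "   9    1 THR  (   1  ) A     N   <>  35 ILE  (  35  ) A     O    D=  2.89  H-ene=  0.66  Sym=1 (B-B)",
]

for line in LINES:
    record = parse_contact(line)
    print(record["contact"], record["atom1"], record["atom2"],
          record["dist"], record["h_energy"])
```

The PDB residue numbers are kept as text because they are not guaranteed to be plain integers.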

Hydrogen bond related output

There are two ways to calculate hydrogen bonds with WHAT IF. The first, and oldest, method returns all potential hydrogen bonds, including hydrogen bonds that might be mutually exclusive. The second method tries to find the energetically optimal set of hydrogen bonds. With the first method, all hydrogen bonds are listed whose parameters fall within the cut-off parameters. With the second method you will not get to see all putatively possible hydrogen bonds, and not even always the best one for each atom, because the whole hydrogen bond network is optimised rather than individual hydrogen bonds.

All potential hydrogen bonds

The potential hydrogen bonds option typically returns:

Hydrogen bond related parameters *1
Maximal donor - acceptor distance ............ 3.50
Maximal hydrogen - acceptor distance ......... 2.50
Maximal angular error donor - H - acceptor ... 60.00
Maximal angular error H - acceptor - xxx ..... 90.00

   1 THR  (   1 ) A      N   <-->  35 ILE  (  35 ) A      O     *2
 D(DA)=  2.89 D(HA)=  1.94 A(H)= 21.7 A(A)= 25.3 H= 17.04 14.84  4.29
  *3           *4          *5          *6        *7
  1. The cut-off parameters are explained in our courses (http://swift.cmbi.ru.nl/teach/), or, for example, in http://swift.cmbi.ru.nl/teach/HOMMOD/seminars/secStruc.pdf.
  2. These are the two atoms that share the hydrogen. Their format is as explained under 'atom numbers/names'.
  3. Actually observed donor-acceptor distance (thus between the heavy atoms involved).
  4. Actual distance between proton and acceptor.
  5. Observed angular error over the proton.
  6. Observed angular error over the acceptor.
  7. Coordinates for the proton as used in the calculations.

This option returns hydrogen positions, but please do not try to use those positions for follow-up scientific work. These are merely listed as a sanity check. If you want protons added, please use the 'Add Protons to the Structure' server.
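
The geometry line of a potential hydrogen bond can be compared with the cut-off parameters printed at the top of the output. The sketch below is an illustration only: the names CUTOFFS, parse_hbond_geometry and within_cutoffs are made up for this example, and the cut-off values are simply copied from the example header.

```python
import re

# Cut-offs as printed in the example header (distances and angular errors).
CUTOFFS = {"D(DA)": 3.50, "D(HA)": 2.50, "A(H)": 60.0, "A(A)": 90.0}

TERM = re.compile(r"(D\(DA\)|D\(HA\)|A\(H\)|A\(A\))=\s*(-?\d+(?:\.\d+)?)")
PROTON = re.compile(r"\bH=\s*(-?\d+\.\d+)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)")


def parse_hbond_geometry(line):
    """Return (geometry terms, proton coordinates) for one geometry line."""
    values = {name: float(number) for name, number in TERM.findall(line)}
    proton = PROTON.search(line)
    coordinates = tuple(float(c) for c in proton.groups()) if proton else None
    return values, coordinates


def within_cutoffs(values, cutoffs=CUTOFFS):
    """True if every listed term is at or below its cut-off."""
    return all(values[name] <= limit for name, limit in cutoffs.items())


GEOMETRY = " D(DA)=  2.89 D(HA)=  1.94 A(H)= 21.7 A(A)= 25.3 H= 17.04 14.84  4.29"

values, proton_xyz = parse_hbond_geometry(GEOMETRY)
print(values, proton_xyz, within_cutoffs(values))
```

The proton coordinates are returned only so that they can be inspected; as stated above, they are a sanity check and not meant for follow-up work.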

Optimal hydrogen bonding network

For an explanation of the scores and penalties used see: Positioning hydrogen atoms by optimizing hydrogen-bond networks in protein structures. R.W.W.Hooft, C.Sander, G.Vriend, PROTEINS (1996) 26, 363-376.

The output of these options typically is a list of hydrogen bonds like:

   1 THR  (   1 ) A      N   ->    1 THR  (   1 ) A      OG1 Sym=   1 Val=  0.350  DA=  2.75  DHA= 79.48
All terms are similar to, or the same as, the correspondingly named terms in the previous section.

File content

The WHAT IF program uses the famous 'SHOSOU' command to analyze the contents of a PDB entry.
A typical result from the SHOSOU command looks like:

    Contents of the SOUP:                                      *1   
 
Protein .................... : 2                               *2
Drug, ligand or co-factor .. : 1
DNA or RNA ................. : 0
Single atom entity ......... : 7
(Groups of) water .......... : 1
Drug with known topology ... : 0
 
 Molecule      Range              Type              Set name   *3
     1    1 (    1)  316 (  316)E Protein           set        *4
     2  317 (  322)  318 (  323)D Protein           set        *4
     3  319 (  O2 )  319 (  O2 )E K O2 <-           set        *5
     4  320 (  317)  320 (  317)   CA               set        *6
     5  321 (  318)  321 (  318)   CA               set
     6  322 (  319)  322 (  319)   CA               set
     7  323 (  320)  323 (  320)   CA               set
     8  324 (  321)  324 (  321)   ZN               set
     9  325 (  324)  325 (  324)  DMS               set        *7
    10  326 (  O2 )  326 (  O2 )D L O2 <-           set        *8
    11  327 ( HOH )  327 ( HOH )  water   ( 157)    tnl        *9
   *10  *11   *12    *13    *14   *15               *16
  1. This is the header of the SHOSOU output
  2. First the contents of the soup is counted
  3. This is the header of the main part of the SHOSOU output. The set name is the name the user gave to the ensemble of molecules added to the soup with one single GETMOL, GETGRO, etc., command.
  4. Molecule one is a protein with chain identifier E. This protein has 316 amino acids. The second protein is a two residue peptide with chain identifier D.
  5. The third molecule is the C-terminal oxygen of chain E. It is attached to a Lysine (that is indicated by the character K) and the arrow indicates that it is bound to something.
  6. Molecules 4 to 8 are single atomic entities (together with the two C-terminal oxygens they form the seven single atomic entities mentioned in the top half of the output).
  7. DMS probably stands for DMSO, and is a drug, ligand or co-factor. For WHAT IF drug, ligand, and co-factor are all the same thing.
  8. This is the C-terminal oxygen of the second molecule. You can see that because the O2 indicates that it is a C-terminal oxygen. The D indicates that it is part of the D chain and the arrow indicates that it is bound to something. The L indicates that it is bound to a Leucine.
  9. This is a group of 157 water molecules.
  10. The 'molecule' number.
  11. The WHAT IF number of the first residue in this molecule.
  12. The PDB number of the first residue in this molecule.
  13. The WHAT IF number of the last residue in this molecule.
  14. The PDB number of the last residue in this molecule.
  15. A short description of this molecule.
  16. The so-called set-name is only relevant when WHAT IF is used interactively.
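
The counts in the top half of the SHOSOU output are easy to read with a script. The sketch below uses a made-up function name, parse_soup_counts, and assumes that every count line consists of a label, a run of dots, a colon, and an integer, as in the example. In this example the counts add up to 11, which is also the number of rows in the molecule table.

```python
import re

# Hypothetical pattern: label, at least two dots, colon, integer count.
COUNT_LINE = re.compile(r"^\s*(?P<label>.+?)\s+\.{2,}\s*:\s*(?P<count>\d+)")


def parse_soup_counts(text):
    """Return {label: count} for the count lines of a SHOSOU listing."""
    counts = {}
    for line in text.splitlines():
        match = COUNT_LINE.match(line)
        if match:
            counts[match.group("label")] = int(match.group("count"))
    return counts


SHOSOU_TOP = """\
    Contents of the SOUP:

Protein .................... : 2
Drug, ligand or co-factor .. : 1
DNA or RNA ................. : 0
Single atom entity ......... : 7
(Groups of) water .......... : 1
Drug with known topology ... : 0
"""

counts = parse_soup_counts(SHOSOU_TOP)
print(counts)
print(sum(counts.values()))
```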

Residue numbers

When WHAT IF lists a residue number, it gives a lot of information. E.g.:

3 LYS  (   5 ) A 12 
Means from left to right:
  1. This is the third residue in the PDB file.
  2. It is a lysine
  3. The number in brackets is the number found in the PDB file. This example strongly suggests that the first two residues could not be seen by the crystallographer or NMR spectroscopist.
  4. The character A is the chain identifier.
  5. The number 12 indicates that this residue sits in the 12th NMR model.
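
A residue identifier in this format can be split into its parts as sketched below. The names RESIDUE and parse_residue are made up for this example. The sketch assumes that the chain identifier, when present, is a single letter, and that the model number, when present, is an integer; the PDB number is kept as text because it is not guaranteed to be a plain integer.

```python
import re

# Hypothetical pattern for "3 LYS  (   5 ) A 12"; chain and model optional.
RESIDUE = re.compile(
    r"^\s*(?P<seq>\d+)\s+(?P<type>\S+)\s+\(\s*(?P<pdb>[^\s)]+)\s*\)"
    r"(?:\s+(?P<chain>[A-Za-z]))?(?:\s+(?P<model>\d+))?\s*$"
)


def parse_residue(text):
    """Split a WHAT IF residue identifier; return None if it does not fit."""
    match = RESIDUE.match(text)
    if match is None:
        return None
    parts = match.groupdict()
    parts["seq"] = int(parts["seq"])
    if parts["model"] is not None:
        parts["model"] = int(parts["model"])
    return parts


print(parse_residue("3 LYS  (   5 ) A 12 "))
print(parse_residue("504 SO4 ( 256)"))
```

Missing parts, such as the chain identifier of a ligand, are returned as None.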

Atom numbers/names

When atoms are given, they normally are listed in the context of their residue. For example, atomic contact servers tend to give output like:

69 LYS ( 70) A  CA  <> 504 SO4 ( 256)   O3  D=  3.34 (B-D)
69 LYS ( 70) A  CB  <> 504 SO4 ( 256)   S   D=  3.94 (S-D)
69 LYS ( 70) A  CB  <> 504 SO4 ( 256)   O3  D=  3.30 (S-D)
70 SER ( 71) A  N   <> 504 SO4 ( 256)   O3  D=  2.70 (B-D)
*1  *2   *3 *4  *5  *6  *7  *8    *9   *10 *11  *12   *13
The line with asterisks relates to the explanations (*1 - *5 relates to the one atom of the contact pair; *7 - *10 to the other):
  1. Sequential number of the residue of the first atom of the contact pair
  2. Type of the residue of the first atom of the contact pair
  3. The number in brackets is the residue number found in the PDB file. (This example strongly suggests that the first residue could not be seen by the crystallographer).
  4. The A is the chain identifier of the molecule in which 69 LYS (70) sits. The second partner in the contact (*7 -*10) has no chain identifier...
  5. The atom name of the first atom of the contact pair
  6. <> stands for "makes a contact with"
  7. Sequential number of the "residue" of the second atom of the contact pair. Clearly the word "residue" is used loosely in this help file, because it normally refers to amino acid, nucleic acid, sugar, lipid, co-factor, or sometimes even ion, or group of water molecules.
  8. Type of the "residue" of the second atom of the contact pair.
  9. The number in brackets is the residue number found in the PDB file. In the case of ligands, these numbers can get any weird value...
  10. The atom name of the second atom of the contact pair
  11. D= simply stands for Distance=
  12. This number is the distance between the atoms.
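  13. This code indicates the kind of contact. (B-D) indicates that a Backbone atom in the first partner of the contact pair contacts a Drug atom in the second. The remaining codes are the same as those listed for the contact type indicator in the atomic interaction output: S for side chain, C for carbohydrate or sugar, W for water, and a period for anything else.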
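
The contact type codes can be decoded with a small lookup table. The sketch below uses the letters explained for the contact type indicator of the atomic interaction output; the names CONTACT_CLASSES and describe_contact_type are made up for this example.

```python
# Letters as explained for the contact type indicator in this help file.
CONTACT_CLASSES = {
    "B": "backbone",
    "S": "side chain",
    "C": "carbohydrate or sugar",
    "W": "water",
    "D": "ligand, drug, or ion",
    ".": "other",
}


def describe_contact_type(code):
    """Translate a code such as '(B-D)' into two readable class names."""
    first, second = code.strip("()").split("-")
    return CONTACT_CLASSES[first], CONTACT_CLASSES[second]


print(describe_contact_type("(B-D)"))
print(describe_contact_type("(S-D)"))
```

An unknown letter raises a KeyError, so unexpected codes are noticed rather than silently ignored.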

Output in TeX format

A few servers produce output in TeX format. You can use LaTeX to convert this output type to nicely printable output. LaTeX will need a file called supertab.sty. If you don't have this file around, please use this copy of supertab.sty and store it in the directory where you work with the TeX style output, or put it somewhere LaTeX searches for style files (for example a directory listed in TEXINPUTS).


Mail Vriend@cmbi.ru.nl if you have questions about these servers.
(C) G.V.
Last updated: April 14 2012