Definition of secondary structure of proteins given a set of 3D coordinates.
The DSSP program defines secondary structure, geometrical features and solvent exposure of proteins, given atomic coordinates in Protein Data Bank format. The program does NOT PREDICT protein structure. According to the Science Citation Index (July 1995), the program has been cited in the scientific literature more than 1000 times.
Wolfgang Kabsch and Chris Sander, MPI MF, Heidelberg, 1983.
Reference: Kabsch,W. and Sander,C. (1983) Biopolymers 22, 2577-2637
dssp [-na] [-v] pdb_file [dssp_file]
dssp [-na] [-v] -- [dssp_file]
dssp [-h] [-?] [-V]
Command line options:
In this example verbose mode was turned on to see the progress of execution for the large photoreaction center (1prc) input file.
unix% dssp -v 1prc.pdb 1prc.dssp
 !!! Backbone incomplete for residue ALA 333 C
     residue will be ignored !!!
 !!! Residue SER 273 L has 3 instead of expected 2 sidechain atoms.
     last sidechain atom name is OXT
     calculated solvent accessibility includes extra atoms !!!
 !!! Residue LYS 323 M has 6 instead of expected 5 sidechain atoms.
     last sidechain atom name is OXT
     calculated solvent accessibility includes extra atoms !!!
 !!! Residue LEU 258 H has 5 instead of expected 4 sidechain atoms.
     last sidechain atom name is OXT
     calculated solvent accessibility includes extra atoms !!!
 !!! Polypeptide chain interrupted !!!
Inputcoordinates done 1189
Flagssbonds done
Flagchirality done
Flaghydrogenbonds done
Flagbridge done
Flagturn done
Flagaccess done
Printout done
Output file is 1prc.dssp
In this example the coordinates of avian pancreatic polypeptide (1ppt) were first converted from star format to pdb format and then piped into dssp.
unix% star2pdb 1ppt.star | dssp -- > 1ppt.dssp
 !!! Residue TYR 36 has 9 instead of expected 8 sidechain atoms.
     last sidechain atom name is OXT
     calculated solvent accessibility includes extra atoms !!!
Output file is 1ppt.dssp
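The piped invocation can also be driven from a script. The following Python sketch assumes an executable named dssp on the PATH that accepts the options shown in the synopsis (-v for verbose mode, -- to read coordinates from standard input). The helper names build_dssp_args and run_dssp_on_text are hypothetical and exist only for this illustration.

```python
import subprocess


def build_dssp_args(executable="dssp", verbose=False):
    """Build the argument list for the stdin form of the synopsis: dssp [-v] --"""
    args = [executable]
    if verbose:
        args.append("-v")
    args.append("--")  # read coordinates from standard input
    return args


def run_dssp_on_text(pdb_text, executable="dssp", verbose=False):
    """Pipe PDB-format text into dssp and return what it writes to stdout.

    Assumption: with no dssp_file argument the result goes to standard
    output, as in the shell example that redirects it with '>'.
    """
    result = subprocess.run(
        build_dssp_args(executable, verbose),
        input=pdb_text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

Keeping the argument list in its own function makes it possible to check the command without having the program installed.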
The DSSP output file (for example myprotein.dssp) contains secondary structure assignments and other information, one line per residue. Extract from 1est.dssp (simplified):
HEADER    HYDROLASE (SERINE PROTEINASE)           17-MAY-76   1EST
...
  240  1  4  4  0 TOTAL NUMBER OF RESIDUES, NUMBER OF CHAINS, NUMBER OF SS-BRIDGES(TOTAL,INTRACHAIN,INTERCHAIN) .
 10891.0   ACCESSIBLE SURFACE OF PROTEIN (ANGSTROM**2)
  162 67.5   TOTAL NUMBER OF HYDROGEN BONDS OF TYPE O(I)-->H-N(J) ; PER 100 RESIDUES
    0  0.0   TOTAL NUMBER OF HYDROGEN BONDS IN PARALLEL BRIDGES; PER 100 RESIDUES
   84 35.0   TOTAL NUMBER OF HYDROGEN BONDS IN ANTIPARALLEL BRIDGES; PER 100 RESIDUES
...
   26 10.8   TOTAL NUMBER OF HYDROGEN BONDS OF TYPE O(I)-->H-N(I+2)
   30 12.5   TOTAL NUMBER OF HYDROGEN BONDS OF TYPE O(I)-->H-N(I+3)
   10  4.2   TOTAL NUMBER OF HYDROGEN BONDS OF TYPE O(I)-->H-N(I+4)
...
  #  RESIDUE AA STRUCTURE BP1 BP2  ACC   N-H-->O  O-->H-N  N-H-->O  O-->H-N
    2   17   V  B 3   +A  182   0A   8  180,-2.5 180,-1.9   1,-0.2 134,-0.1

   TCO  KAPPA ALPHA   PHI   PSI  X-CA  Y-CA  Z-CA
-0.776  360.0   8.1 -84.5 125.5 -14.7  34.4  34.8

....;....1....;....2....;....3....;....4....;....5....;....6....;....7..
    .-- sequential resnumber, including chain breaks as extra residues
    |    .-- original PDB resname, not nec. sequential, may contain letters
    |    |   .-- amino acid sequence in one letter code
    |    |   |  .-- secondary structure summary based on columns 19-38
    |    |   |  | xxxxxxxxxxxxxxxxxxxx recommended columns for secstruc details
    |    |   |  | .-- 3-turns/helix
    |    |   |  | |.-- 4-turns/helix
    |    |   |  | ||.-- 5-turns/helix
    |    |   |  | |||.-- geometrical bend
    |    |   |  | ||||.-- chirality
    |    |   |  | |||||.-- beta bridge label
    |    |   |  | ||||||.-- beta bridge label
    |    |   |  | |||||||   .-- beta bridge partner resnum
    |    |   |  | |||||||   |   .-- beta bridge partner resnum
    |    |   |  | |||||||   |   |.-- beta sheet label
    |    |   |  | |||||||   |   ||   .-- solvent accessibility
    |    |   |  | |||||||   |   ||   |
  #  RESIDUE AA STRUCTURE BP1 BP2  ACC
    |    |   |  | |||||||   |   ||   |
   35   47   I  E     +     0   0    2
   36   48   R  E >  S- K   0  39C  97
   37   49   Q  T 3  S+     0   0   86    (example from 1EST)
   38   50   N  T 3  S+     0   0   34
   39   51   W  E <   -KL  36  98C   6
Line length of output is 13x characters. Lines end in a number or a period.
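Because the residue lines are fixed-width, they can be read by column position rather than by splitting on whitespace. The Python sketch below follows the column layout shown in the annotated extract (columns 1-5, 6-10, 14, 17, 19-25, 26-29, 30-33, 34, 35-38). The names DsspResidue and parse_residue_line are hypothetical, and the sketch does not handle chain-break lines or the columns to the right of ACC.

```python
from dataclasses import dataclass


@dataclass
class DsspResidue:
    seq_number: int        # DSSP sequential number (columns 1-5)
    pdb_number: str        # original PDB residue number (columns 6-10)
    amino_acid: str        # one-letter code (column 14)
    summary: str           # secondary structure summary (column 17)
    details: str           # columns 19-25: turns, bend, chirality, bridge labels
    bridge_partner_1: int  # columns 26-29
    bridge_partner_2: int  # columns 30-33
    sheet_label: str       # column 34
    accessibility: int     # columns 35-38


def parse_residue_line(line: str) -> DsspResidue:
    """Slice one residue line by fixed columns (1-based columns, 0-based slices)."""
    line = line.ljust(38)  # pad short lines so every slice is in range
    return DsspResidue(
        seq_number=int(line[0:5]),
        pdb_number=line[5:10].strip(),
        amino_acid=line[13],
        summary=line[16],
        details=line[18:25],
        bridge_partner_1=int(line[25:29]),
        bridge_partner_2=int(line[29:33]),
        sheet_label=line[33],
        accessibility=int(line[34:38]),
    )


# Residue 36 from the 1EST extract, with its original column spacing.
example = "   36   48   R  E >  S- K   0  39C  97"
residue = parse_residue_line(example)
print(residue.amino_acid, residue.summary, residue.bridge_partner_2)
```

Slicing by position keeps blank fields (such as an empty bridge label) from shifting the fields that follow, which would happen with a whitespace split.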
Histograms:
The number 2 under column '8' in the line 'residues per alpha helix' means that this data set contains 2 alpha helices, each 8 residues long.
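To make the reading of such a histogram concrete, the following Python sketch counts runs of one structure code in a per-residue summary string and tallies them by run length. It is an illustration only, not the program's own counting procedure: the function name helix_length_histogram and the example string are made up, and 'H' is assumed to be the alpha-helix code, following the Kabsch and Sander notation.

```python
from collections import Counter
from itertools import groupby


def helix_length_histogram(summary: str, code: str = "H") -> Counter:
    """Map run length -> number of uninterrupted runs of `code` with that length."""
    lengths = (
        sum(1 for _ in run)              # length of one uninterrupted run
        for symbol, run in groupby(summary)
        if symbol == code
    )
    return Counter(lengths)


# Hypothetical summary string: two helices of 8 residues and one of 4.
summary = "  HHHHHHHH  EEE HHHHHHHH TT HHHH"
histogram = helix_length_histogram(summary)
print(histogram[8])  # number of helices that are 8 residues long
```

Here histogram[8] plays the role of the entry under column '8' in the 'residues per alpha helix' line.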
For definitions, see the Biopolymers article cited above.
In addition, note:
# RESIDUE: two columns of residue numbers. The first column is DSSP's sequential residue number, starting at the first residue actually in the data set and including chain breaks; this number is used to refer to residues throughout. The second column gives the crystallographers' 'residue sequence number', 'insertion code' and 'chain identifier' (see the Protein Data Bank file record format manual), given for reference only.
AA: one letter amino acid code, lower case for SS-bridge CYS.
STRUCTURE (summary, column 17): compromise summary of secondary structure, intended to approximate crystallographers' intuition, based on columns 19-38, which are the principal result of the DSSP analysis of the atomic coordinates.
BP1 BP2: residue number of first and second bridge partner, followed by one letter sheet label.
ACC: number of water molecules in contact with this residue * 10, or the residue's water-exposed surface in Angstrom**2.
N-H-->O, O-->H-N: hydrogen bonds; e.g. -3,-1.4 means: if this residue is residue I, then N-H of I is H-bonded to C=O of I-3 with an electrostatic H-bond energy of -1.4 kcal/mol. There are two columns for each type of H-bond, to allow for bifurcated H-bonds.
TCO: cosine of the angle between C=O of residue I and C=O of residue I-1. For alpha-helices, TCO is near +1; for beta-sheets, TCO is near -1. Not used for structure definition.
KAPPA: virtual bond angle (bend angle) defined by the three C-alpha atoms of residues I-2, I, I+2. Used to define bend (structure code 'S').
ALPHA: virtual torsion angle (dihedral angle) defined by the four C-alpha atoms of residues I-1, I, I+1, I+2. Used to define chirality (structure code '+' or '-').
PHI PSI: IUPAC peptide backbone torsion angles.
X-CA Y-CA Z-CA: echo of C-alpha atom coordinates.
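The hydrogen bond fields described above store a residue offset and an energy, so the partner residue has to be computed from the current residue's DSSP sequential number. A minimal Python sketch follows; the names HBond and parse_hbond are hypothetical, and an offset of 0 is assumed here to mean that no bond was assigned.

```python
from typing import NamedTuple, Optional


class HBond(NamedTuple):
    partner: Optional[int]  # DSSP sequential number of the partner, None if absent
    energy: float           # electrostatic H-bond energy in kcal/mol


def parse_hbond(field: str, residue_number: int) -> HBond:
    """Decode a field such as '-3,-1.4' for the residue with the given number."""
    offset_text, energy_text = field.split(",")
    offset = int(offset_text)      # partner position relative to this residue
    energy = float(energy_text)
    # Assumption: offset 0 marks an empty slot rather than a bond to itself.
    partner = residue_number + offset if offset != 0 else None
    return HBond(partner, energy)


# First N-H-->O field of residue 2 in the 1EST extract: offset 180 gives partner 182.
print(parse_hbond("180,-2.5", 2))
```

In the 1EST extract, residue 2 lists 182 as its first bridge partner, which agrees with the partner computed from the offset 180.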
The values for solvent exposure may not mean what you think!
Unknown or unusual residues are named X on output and are not checked for standard number of sidechain atoms. All explicit water molecules, like other hetatoms, are ignored.
Coordinate file in PDB format.
© June 21 2000 G Vriend