The CMBI provided a series of facilities for protein structure bioinformatics. This page provides a brief summary of these facilities and gives pointers to the pages that hold the information (and often also the data, the software, etc.) related to these facilities.
Oops...
After the retirement of G Vriend it became increasingly difficult to keep a website of a few hundred gigabytes running at the Radboudumc. It was therefore decided to move these facilities to a cloud server at SurfSARA. Just when everything was working fine in the cloud, SurfSARA decided to stop providing this service. After that G Vriend gave up. Some of the services are still there, but others (most notably the Lists) are gone. If you need any of the missing facilities, feel free to contact G Vriend at vriendgert @ gmail.com and he will see what he can do for you.
At the bottom of the page we also list a few facilities that might be useful when studying disease-causing mutations in the human genome.
Most of the CMBI PDB-facilities have been described in:
Facilities that make the PDB data collection more powerful
The WHAT IF software package was a powerful, albeit hard-to-use, tool for a wide variety of protein structure calculations and visualization. The more than 2,000,000 lines of code that make up WHAT IF allowed you to predict mutations, find errors in structures, superpose structures, determine cavities, and calculate all the usual things such as H-bonds, accessibilities, salt-bridges, surfaces, interactions, etc. WHAT IF was top-of-the-line 1995 software, and has now, 32 years after the first few lines of its code were typed, passed its expiration date. Nevertheless, WHAT IF can still jump through some hoops that are too difficult for most of its much more modern competitors. These aspects of WHAT IF have been (and some still will be) made available as web servers and/or web services.
The WHAT_CHECK software is a (also free) subset of the WHAT IF software package. WHAT_CHECK can be used to determine the 'quality' of macromolecular structures. The input of the software is a file in PDB format, and the output is a comprehensive report with hundreds of notes, warnings, and (unfortunately) also errors. Although the WHAT_CHECK source code and databases are available, we suggest you use it via the WHAT IF web server page.
PDBREPORT
Protein structures in the PDB were determined experimentally. They thus contain experimental errors. The PDBREPORT databank lists millions and millions of errors and anomalies in WHAT_CHECK reports for each PDB entry.
B-factors in crystallographic structure models are important for many applications, especially in protein engineering, explaining the results of diagnostic genetic testing, and some fundamental aspects of protein structure bioinformatics (e.g. Molecular Dynamics).
Unfortunately, B-factors can be given in several different formats in PDB entries, and it will not be easy for non-crystallographers to figure out which are the 'real' (in crystallographic terms 'full isotropic') B-factors. The BDB databank contains PDB entries with their B-factors consistently presented in full isotropic format.
BDB access, download options, and help
Many PDB entries are either old, or created with old software. So, in principle, it should be possible to optimise them if the crystallographic reflection data are available. We have done this. More than two-thirds of all crystallographic PDB entries have been improved by re-refinement and rebuilding with more modern software. You can use the WHY_NOT server to see if the entry you are interested in has also been redone. One warning though: the PDB_REDO database was produced 100% automatically, so there is a fair chance that here and there something has gone terribly wrong and remained unnoticed.
DSSP is probably the oldest software available in the entire field of protein structure bioinformatics. It determines the secondary structure from the three-dimensional coordinates of a protein; it does not predict secondary structure. The DSSP software is freely available, and the CMBI makes sure that for every PDB entry and for every mmCIF entry a corresponding DSSP file exists soon after the release of that entry.
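To illustrate what "determining" secondary structure from coordinates means, the sketch below implements the electrostatic hydrogen-bond criterion published by Kabsch and Sander for DSSP: a backbone C=O and N-H pair counts as hydrogen bonded when the computed energy is below -0.5 kcal/mol. This is only the first building block; the real DSSP program goes on to recognise patterns of such bonds as helices and strands. The function names and the example coordinates are illustrative assumptions, not taken from the DSSP source code.

```python
import math

# Constants of the Kabsch & Sander electrostatic hydrogen-bond model.
Q1 = 0.42      # partial charge on the C=O group (electron charge units)
Q2 = 0.20      # partial charge on the N-H group (electron charge units)
F = 332.0      # factor that yields kcal/mol for distances in Angstrom
CUTOFF = -0.5  # kcal/mol; energies below this count as a hydrogen bond


def hbond_energy(c, o, n, h):
    """Electrostatic energy (kcal/mol) between a C=O group and an N-H group.

    Each argument is an (x, y, z) tuple in Angstrom.
    """
    r_on = math.dist(o, n)
    r_ch = math.dist(c, h)
    r_oh = math.dist(o, h)
    r_cn = math.dist(c, n)
    return Q1 * Q2 * F * (1 / r_on + 1 / r_ch - 1 / r_oh - 1 / r_cn)


def is_hbond(c, o, n, h):
    """True if the C=O ... H-N pair satisfies the energy cutoff."""
    return hbond_energy(c, o, n, h) < CUTOFF


# Made-up, perfectly linear geometry: C=O ... H-N along the x axis.
close = is_hbond((0, 0, 0), (1.23, 0, 0), (4.13, 0, 0), (3.13, 0, 0))
# Same groups, but with O and H six Angstrom apart.
far = is_hbond((0, 0, 0), (1.23, 0, 0), (8.23, 0, 0), (7.23, 0, 0))
```

In this toy geometry the first pair falls below the cutoff and the second does not. Because the result depends only on atomic positions, the same structure always gives the same assignment, which is the difference from a prediction method.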
There is much information to be gained from a structure, but even more from the combination of structure and a multiple sequence alignment (MSA). HSSP files provide MSAs for all PDB entries. They are a bit cumbersome to read, but good software exists to do that for you.
Reading PDB file headers is very cumbersome. That is mainly because the PDB was designed without ontology, without schema, and without a view for the future. Additionally, most information in the PDB file headers is entered by hand without validation by software. The PDBFINDER is our favorite way to deal with such problems. The PDBFINDER software has a large series of modules that each take care of one typical PDB file annotation problem. Additionally, the PDBFINDER contains data such as missing R-factors, etc., that we looked up by hand.
The PDB can hold hundreds of copies of nearly the same molecule. That is not bad, because the hundreds of lysozyme mutant structures, for example, teach us a lot about protein structure and stability. Bioinformaticians, however, often want to work with a representative dataset. It wouldn't be wise to train a computational method on a dataset in which more than ten percent of the entries are lysozyme structures, because a method trained on that dataset tells you little about proteins in general and mostly shows what lysozyme looks like. This is where PDB_SELECT comes in. PDB_SELECT holds representative datasets of sequence-unique PDB files that satisfy given R-factor and resolution criteria.
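The idea of a representative dataset can be sketched with a simple greedy filter: discard entries that fail the quality thresholds, sort the rest from best to worst, and keep an entry only if it is not too similar to anything already kept. This is a minimal sketch under stated assumptions, not the procedure actually used to build PDB_SELECT. The `Entry` class, the threshold values, the toy identifiers, and in particular `naive_identity` (which compares sequences position by position without any alignment) are all hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    pdb_id: str        # toy identifier, not a real PDB code
    sequence: str      # one-letter amino acid sequence
    resolution: float  # Angstrom; lower is better
    r_factor: float    # lower is better


def naive_identity(a, b):
    """Toy similarity: fraction of identical positions over the shorter
    sequence. A real pipeline would use a proper sequence alignment."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


def select_representatives(entries, max_identity=0.25,
                           max_resolution=2.5, max_r_factor=0.25):
    """Greedy selection of mutually dissimilar, good-quality entries."""
    good = [e for e in entries
            if e.resolution <= max_resolution and e.r_factor <= max_r_factor]
    # Best structures first, so each cluster is represented by its best member.
    good.sort(key=lambda e: (e.resolution, e.r_factor))
    chosen = []
    for e in good:
        if all(naive_identity(e.sequence, c.sequence) <= max_identity
               for c in chosen):
            chosen.append(e)
    return chosen


entries = [
    Entry("toy1", "KVFGRCELAA", 1.8, 0.19),
    Entry("toy2", "KVFGRCELAA", 1.5, 0.18),  # same sequence, better data
    Entry("toy3", "MSTNPQWYHD", 2.0, 0.20),
    Entry("toy4", "GGGGGGGGGG", 3.5, 0.30),  # fails the quality thresholds
]
selected_ids = [e.pdb_id for e in select_representatives(entries)]
```

With these made-up entries, the two identical sequences collapse onto the better-resolved one, and the low-quality entry is dropped before similarity is even considered.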
Sometimes a PDB entry exists but the corresponding entry in the facilities described above does not. There can be several reasons for this. For instance, DSSP and HSSP files can only be made for structures that contain protein, and PDB_REDO only works for crystallographic PDB entries with deposited experimental data. Severe problems in PDB entries, either technical or scientific, may also stop us from providing the data for the facilities. Whatever the reason, you can use the WHY_NOT server to ask WHY a particular file does NOT exist.
Protein structure bioinformaticians often need some simple information like symmetry contacts, solvent accessibility, torsion angles, etc, for all files in the PDB. The Lists section of these facilities contains a series of directories that each contains one entry per PDB entry, and these entries contain those simple data types in easy-to-parse formats. These are not the results of highly sophisticated calculations but just simple lists like the accessible surface areas of residues, torsion angle tables, lists of contacts with ions, etc.
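As an example of how simple the underlying calculations are, a torsion angle needs nothing more than the coordinates of four consecutive atoms. The sketch below uses the standard vector formula for a signed dihedral angle; the function names and the test coordinates are illustrative assumptions and are not taken from the software that produced the Lists.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def torsion_angle(p1, p2, p3, p4):
    """Signed dihedral angle in degrees (-180 to 180) for four points.

    The angle is measured around the p2-p3 bond between the planes
    (p1, p2, p3) and (p2, p3, p4).
    """
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b3 = _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal of the first plane
    n2 = _cross(b2, b3)  # normal of the second plane
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Made-up coordinates: cis, trans, and a quarter turn around the y axis.
cis = torsion_angle((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0))
trans = torsion_angle((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0))
quarter = torsion_angle((1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, 1))
```

For a backbone phi angle the four points would be C of the previous residue followed by N, CA, and C of the current residue; for psi they would be N, CA, C, and N of the next residue.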
YASARA scenes are available for some lists to provide a quick-and-easy visualization of the information in lists entries.
YASARA is the best macromolecular viewer, so we use it to look at structures, and we have created YASARA scenes that show highlighted structural features in the best visualisation style and from the best viewpoint. These scenes can be inspected by everybody with access to YASARA. In addition to the commercial version of YASARA, which comes with molecular modeling and simulation functionality, there is a freely available viewer called YASARA_View, and this free viewer can deal with all scenes that we prepared for you.
Strictly speaking, MRS is not specifically a protein structure bioinformatics tool. But as it is very useful for finding things in the protein structure world, we list it here anyway. MRS is a fast, and smart, data retrieval tool.
HOPE, which stands for Have Our Protein Explained, will fully automatically analyze a point mutation in a human protein. (Although meant for disease-causing human mutations, it will work for all proteins from all species.)
HOPE works best if the 3D structure is available for the protein in which the mutation has taken place. Often experimentally determined structure information is not yet available. For those cases we produced all homology models that we believed could be modelled reliably in a fully automatic fashion. To be made available soon