Loop Modelling 1.0


WHAT IF has an option that scans the entire PDB (or PDB_REDO, or any subset of either) for loops that could fit in your protein.

By fitting I mean that the loop has the appropriate length, and that the first N and last N residues of the loop found in the PDB match the last N residues before and the first N residues after the point of insertion in your protein, with an RMSd after optimal superposition below your chosen cut-off. N is typically 2, 3, or 4. The algorithm is conceptually explained in the figure to the left.


Unfortunately, the WHAT IF option to search for loops easily takes 6-12 hours of CPU time to scan 120K PDB entries for just one loop. We therefore wrote dedicated filter software, called LoopFinder, which typically scans the whole PDB in minutes rather than hours. LoopFinder runs over all PDB files and produces several output files that describe loops in the PDB that fit in your protein at a location where a loop is missing or should be replaced by protein engineering. One of the output files lists the names of the PDB files that WHAT IF should examine to recover all loops that LoopFinder found. LoopFinder is thus a fast screening filter for WHAT IF, but it can, of course, equally well serve as a fast filter for your own software.

The output files

LoopFinder produces two output files (and a log-file; see below):

The input file

LoopFinder needs some information from you about what it must do. You have to provide that information in the file


The log file

LoopFinder writes a log file called LoopFinder.LOG. Its only purpose is diagnostic: send it to G Vriend when there is something you do not understand, or something you believe is going wrong.

The database

To make its predictions, LoopFinder scans three of the files from its data directory. These files are big. LoopFinder starts by estimating how many blocks of 100,000 lines need to be read from those files. During the run it reports, after every hit and after every 100,000 lines, how many blocks it has already processed. The database is explained separately.
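If you want a rough idea beforehand of how many blocks to expect for a given file, the arithmetic can be sketched in a few lines of shell. This is only an illustration: the function names are hypothetical, the rounding up of a partial last block is an assumption, and LoopFinder's own estimate may be computed differently.

```shell
# Number of 100,000-line blocks needed to cover a file with the given
# number of lines. Rounding up is an assumption: a partial last block
# is counted as one block.
blocks_for_lines() {
    echo $(( ($1 + 99999) / 100000 ))
}

# Count the lines of one file and print its block count.
# Note that wc -l has to read the whole file, which takes a while
# for multi-gigabyte files.
blocks_in_file() {
    blocks_for_lines "$(wc -l < "$1")"
}
```

After defining these functions, call blocks_in_file with the name of one database file as its only argument.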

Installing software and databases

LoopFinder was designed to facilitate research by others. We therefore did everything possible to keep the set-up simple and flexible. Consequently, there are no installers, no environment variables, et cetera: just one source file and a set of files that we call the database.

To get your own LoopFinder, you must first obtain the file LoopFinder.f and compile it with the Linux command:

gfortran -O2 -o LoopFinder LoopFinder.f

Second, you need the database(s). I suggest you start with just the database for anchor length 2 and obtain the others at a later stage. You can obtain the database files by executing the following command 26 times:

wget http://swift2.cmbi.umcn.nl/gv/loops/data2/fort.xxx

in which xxx should take all values from 105 through 130. Be aware that most of these files are several gigabytes in size, so do not sit and wait for the files to arrive. (You can try to use this script to obtain the database. Make sure you give the script execute permission with chmod 777 Get_LoopFinder_Database and that you run it in the directory where the database files should finally reside. Use ./Get_LoopFinder_Database to run the script. Running multiple wget commands in parallel will slow down the download rather than speed it up.)
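If you prefer to see what such a download loop looks like, here is a minimal sketch. It is not the Get_LoopFinder_Database script itself, and the function names are made up for this example. The base URL and the range 105 to 130 come from the wget command above. The files are fetched one at a time, because parallel downloads do not help.

```shell
# Base URL taken from the wget command above (anchor length 2 database).
BASE_URL="http://swift2.cmbi.umcn.nl/gv/loops/data2"

# Print one URL per database file, fort.105 up to and including fort.130.
loopfinder_urls() {
    i=105
    while [ "$i" -le 130 ]; do
        echo "$BASE_URL/fort.$i"
        i=$((i + 1))
    done
}

# Download the files sequentially into the current directory.
# wget -c continues a partially downloaded file, so the function can
# simply be run again after an interruption. Stops at the first failure.
fetch_loopfinder_db() {
    for url in $(loopfinder_urls); do
        wget -c "$url" || return 1
    done
}
```

Defining the functions downloads nothing. Run loopfinder_urls first to check the list of 26 URLs, then run fetch_loopfinder_db in the directory where the database files should finally reside.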

Results and discussion

We tested the software extensively on a series of published studies in which loops were designed or transplanted. These examples are explained in the associated article.

Gert Vriend