Material linked from bioinformatics course


The following text is borrowed from the website about the Murthy fraud.

Visual inspection suggests that the Murthy B-factors were selected randomly from a series of numbers that is not normally distributed and lies between some reasonable upper and lower limit. These limits are typically about 20 and 40.
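To make this observation concrete, here is a minimal sketch of what "randomly selected between two limits" could look like numerically. It is not the detection algorithm described on this page. It assumes that B-factors of neighbouring atoms in an experimentally refined structure tend to be similar, while independently drawn values are not. The uniform distribution, the atom count, the smoothing window and the helper name lag1_correlation are all assumptions made purely for illustration.

```python
import random
import statistics


def lag1_correlation(values):
    """Pearson correlation between each value and the next one in the list."""
    a, b = values[:-1], values[1:]
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


rng = random.Random(0)   # fixed seed so the sketch is reproducible
n_atoms = 500            # hypothetical number of atoms
low, high = 20.0, 40.0   # the approximate limits mentioned in the text

# Fabricated-looking profile: every B-factor drawn independently
# (uniform is an assumption; the text only says "not normally distributed").
random_b = [rng.uniform(low, high) for _ in range(n_atoms)]

# Toy "smooth" profile: a moving average, so consecutive atoms share most
# of their value. This only stands in for a real refined structure.
window = 10
noise = [rng.uniform(low, high) for _ in range(n_atoms + window - 1)]
smooth_b = [statistics.fmean(noise[i:i + window]) for i in range(n_atoms)]

random_corr = lag1_correlation(random_b)
smooth_corr = lag1_correlation(smooth_b)

print(f"lag-1 correlation, independent draws: {random_corr:.2f}")
print(f"lag-1 correlation, smooth profile:    {smooth_corr:.2f}")
```

Independent draws should give a correlation near zero, while the smooth toy profile should give a value close to one. Real files would need a proper parser and a less naive statistic.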

Visual inspection clearly shows that nine Murthy files have fabricated B-factors, one or two are likely to have borrowed B-factors, while another two or three files perhaps have been solved experimentally (but not very skillfully from a crystallographic point of view).

So I need an algorithm that detects the B-factor selection method used by Murthy. I have tried a few things (a Fourier transform of the B-factor distribution, etc.), but they failed on the non-Gaussian distributions.

The algorithm, which is fully objective (not written explicitly to detect Murthy files), goes as follows: