Answer:


For FF1: get a set of proteins that have at least one TM helix and a set of proteins that do not have any TM helices. These proteins must NOT have been used in the design of FF1! Predict TM helices and count the correctly predicted TM helices, the missed TM helices, and the over-predicted TM helices. Find a statistical method that makes your method look better than anybody else's, and publish.

And for FF2, it sounds rather similar, except that you now need only one test set.
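
The counting step for FF1 could be sketched roughly as below. This is only an illustration, not how FF1 itself works: the names `overlap` and `score_protein` are made up, helices are assumed to be given as inclusive (start, end) residue ranges, and the rule that a prediction counts as correct when it shares at least 5 residues with an observed helix is an assumption you would need to replace with whatever criterion you settle on.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # (start, end), inclusive residue positions


def overlap(a: Segment, b: Segment) -> int:
    """Number of residues shared by two inclusive segments."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def score_protein(
    observed: List[Segment],
    predicted: List[Segment],
    min_overlap: int = 5,  # assumed threshold, not taken from the question
) -> Tuple[int, int, int]:
    """Return (correct, missed, over_predicted) for one protein."""
    unmatched = list(predicted)
    correct = 0
    for obs in observed:
        # Take the first still-unmatched prediction that overlaps enough,
        # so each predicted helix can match at most one observed helix.
        hit = next((p for p in unmatched if overlap(obs, p) >= min_overlap), None)
        if hit is not None:
            unmatched.remove(hit)
            correct += 1
    missed = len(observed) - correct
    over_predicted = len(unmatched)
    return correct, missed, over_predicted


# A protein with two observed TM helices, one found and one spurious prediction.
print(score_protein([(10, 30), (45, 65)], [(12, 31), (80, 100)]))  # (1, 1, 1)

# A protein without TM helices: every prediction is an over-prediction.
print(score_protein([], [(5, 25)]))  # (0, 0, 1)
```

Summing the three counts over both protein sets gives the raw numbers; which statistic you then compute from them is up to you.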