Secondary structure counting table

In a few tens of thousands of protein-containing PDB files (each with at least one helix of length 9 and one helix of length 11), I counted the frequencies of residues in relation to secondary structure. I stopped counting when I saw that the total number of amino acids analyzed was much larger than 10 million.
        ------------------- counts --------------------    freq    -- helix freq --    -- helix pref --    -- class freq --    -- class pref --
Type    Total      N  Centr      C      H      S   Rest    Glob       N Centr     C       N Centr     C       H     S  Rest       H     S  Rest
ALA   1094323 149448 311988 143147 604583 150206 339534    0.09    0.10  0.13  0.12    0.11  0.41  0.37    0.12  0.07  0.06    0.32 -0.27 -0.30
CYS    167962  13257  30789  14240  58286  40890  68786    0.01    0.01  0.01  0.01   -0.44 -0.04 -0.07    0.01  0.02  0.01   -0.15  0.30 -0.02
ASP    741569 123923  91581  42797 258301  72921 410347    0.06    0.08  0.04  0.04    0.31 -0.43 -0.45    0.05  0.03  0.08   -0.15 -0.60  0.28
GLU    865155 194994 176603  94276 465873 103633 295649    0.07    0.12  0.07  0.08    0.61  0.07  0.18    0.09  0.04  0.06    0.29 -0.40 -0.20
PHE    519113  55085 104085  47379 206549 131493 181071    0.04    0.04  0.04  0.04   -0.14  0.05  0.01    0.04  0.06  0.03   -0.01  0.34 -0.18
GLY    965213  81623  86311  28624 196558 117922 650733    0.08    0.05  0.04  0.02   -0.37 -0.75 -1.12    0.04  0.05  0.12   -0.68 -0.38  0.48
HIS    305960  33913  51773  27419 113105  53449 139406    0.02    0.02  0.02  0.02   -0.10 -0.12 -0.01    0.02  0.02  0.03   -0.09 -0.03  0.09
ILE    736743  60516 177647  63898 302061 236045 198637    0.06    0.04  0.07  0.05   -0.40  0.24 -0.05    0.06  0.10  0.04    0.02  0.58 -0.44
LYS    753027  89389 153885  99505 342779 103377 306871    0.06    0.06  0.06  0.09   -0.03  0.07  0.38    0.07  0.04  0.06    0.12 -0.27 -0.02
LEU   1185510 118346 320359 165602 604307 236992 344211    0.09    0.08  0.13  0.14   -0.20  0.35  0.43    0.12  0.10  0.06    0.23  0.11 -0.36
MET    275263  26486  72242  33344 132072  52691  90500    0.02    0.02  0.03  0.03   -0.24  0.32  0.29    0.03  0.02  0.02    0.17  0.06 -0.24
ASN    539453  52621  75724  40604 168949  58016 312488    0.04    0.03  0.03  0.03   -0.23 -0.30 -0.19    0.03  0.03  0.06   -0.25 -0.51  0.33
PRO    588379 105992  16083   5365 127440  45505 415434    0.05    0.07  0.01  0.00    0.39 -1.94 -2.30    0.02  0.02  0.08   -0.62 -0.84  0.53
GLN    467155  70052 110170  53358 233580  63274 170301    0.04    0.04  0.05  0.05    0.20  0.22  0.23    0.05  0.03  0.03    0.22 -0.28 -0.14
ARG    654959  71957 152548  80759 305264 101138 248557    0.05    0.05  0.06  0.07   -0.11  0.20  0.31    0.06  0.04  0.05    0.15 -0.15 -0.10
SER    719767  90264  93944  55652 239860 106411 373496    0.06    0.06  0.04  0.05    0.02 -0.38 -0.16    0.05  0.05  0.07   -0.19 -0.19  0.22
THR    699522  73993 101686  46335 222014 150384 327124    0.05    0.05  0.04  0.04   -0.15 -0.27 -0.31    0.04  0.07  0.06   -0.24  0.18  0.11
VAL    922987  81742 190914  63310 335966 324090 262931    0.07    0.05  0.08  0.05   -0.32  0.09 -0.28    0.06  0.14  0.05   -0.10  0.67 -0.38
TRP    184806  28418  36800  16613  81831  40298  62677    0.01    0.02  0.02  0.01    0.23  0.05 -0.01    0.02  0.02  0.01    0.09  0.19 -0.21
TYR    452117  48455  83349  43004 174808 115240 162069    0.04    0.03  0.03  0.04   -0.13 -0.03  0.05    0.03  0.05  0.03   -0.04  0.35 -0.15
In this table you first find seven columns of counts: the total number of each residue type (Total); its number in the N-terminal, central, and C-terminal parts of helices (N, Centr, C); and its number in helices, in strands, and in everything else (H, S, Rest). The next columns are all relative frequencies and preference parameters. Relative frequencies are simple counts divided by the total number of residues in that column, while preference parameters are calculated as log(Pobs/Ppred), in which Ppred is what one would expect in that column if all residues were divided randomly over all classes, and Pobs is what was actually observed. In a sense, such a preference parameter is -ΔG.
Wikipedia includes several of these tables, but the one shown most prominently there is only for a surface position in the middle of a water-soluble helix. My table is simply for the PDB in general, and that is a better background for a statement about helices in general.
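As a rough illustration of the log(Pobs/Ppred) calculation, the sketch below recomputes the helix, strand, and rest preference parameters from the H, S, and Rest counts in the table. It assumes that the logarithm is the natural logarithm, that Pobs is the fraction of a class made up by one residue type, and that Ppred is that residue type's overall fraction; the names COUNTS, CLASSES, and preference are made up for this example. Under those assumptions Ala in a helix comes out at roughly 0.32.

```python
import math

# Counts per residue type, copied from the H, S, and Rest columns of the table.
COUNTS = {
    "ALA": (604583, 150206, 339534),
    "CYS": (58286, 40890, 68786),
    "ASP": (258301, 72921, 410347),
    "GLU": (465873, 103633, 295649),
    "PHE": (206549, 131493, 181071),
    "GLY": (196558, 117922, 650733),
    "HIS": (113105, 53449, 139406),
    "ILE": (302061, 236045, 198637),
    "LYS": (342779, 103377, 306871),
    "LEU": (604307, 236992, 344211),
    "MET": (132072, 52691, 90500),
    "ASN": (168949, 58016, 312488),
    "PRO": (127440, 45505, 415434),
    "GLN": (233580, 63274, 170301),
    "ARG": (305264, 101138, 248557),
    "SER": (239860, 106411, 373496),
    "THR": (222014, 150384, 327124),
    "VAL": (335966, 324090, 262931),
    "TRP": (81831, 40298, 62677),
    "TYR": (174808, 115240, 162069),
}
CLASSES = ("H", "S", "Rest")


def preference(residue, cls):
    """Return ln(Pobs/Ppred) for one residue type in one class (assumed natural log)."""
    j = CLASSES.index(cls)
    class_total = sum(row[j] for row in COUNTS.values())
    grand_total = sum(sum(row) for row in COUNTS.values())
    p_obs = COUNTS[residue][j] / class_total        # share of this class that is `residue`
    p_pred = sum(COUNTS[residue]) / grand_total     # share of all residues that is `residue`
    return math.log(p_obs / p_pred)


for res in ("ALA", "PRO", "VAL"):
    print(res, [round(preference(res, c), 2) for c in CLASSES])
```

A positive value means the residue type occurs in that class more often than its overall frequency would predict, and a negative value means it occurs less often.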

The first relative-frequency column (Glob) is the simple frequency of each amino acid. So, the 0.09 for Ala means that 9% of all amino acids in the whole dataset are alanine. The table continues with four blocks of three columns. The first two blocks are the counting frequencies and the preference parameters for the twenty amino acid types in the N-terminal 3, central, and C-terminal 3 positions of a helix. The last two blocks are the counting frequencies and the preference parameters for residues in Helix, Strand, and neither of those two, respectively. The right-hand three columns are the parameters that Chou and Fasman would have obtained if they had lived 30 years later.

We can draw a series of conclusions. Pro and Gly are the worst residues for a helix, followed by Asn, Asp, Ser, and Thr. Several residue types show strong asymmetries between the first turn of the helix and the rest of the helix. E.g., Cys looks bad for a helix overall (-0.15), but it is essentially helix-neutral in the central and C-terminal positions (-0.04 and -0.07) and very bad in the first helical turn (-0.44). The opposite is seen for Asp. You can analyse the numbers for yourself.
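One possible way to start analysing the asymmetries is sketched below. It takes the helix N-terminal, central, and C-terminal preference parameters from the table and ranks the residue types by the difference between the N-terminal and the central value. This difference is just an illustrative measure chosen for this example, and the names HELIX_PREF and n_term_asymmetry are made up here.

```python
# Helix preference parameters copied from the table: (N-terminal, central, C-terminal).
HELIX_PREF = {
    "ALA": (0.11, 0.41, 0.37),
    "CYS": (-0.44, -0.04, -0.07),
    "ASP": (0.31, -0.43, -0.45),
    "GLU": (0.61, 0.07, 0.18),
    "PHE": (-0.14, 0.05, 0.01),
    "GLY": (-0.37, -0.75, -1.12),
    "HIS": (-0.10, -0.12, -0.01),
    "ILE": (-0.40, 0.24, -0.05),
    "LYS": (-0.03, 0.07, 0.38),
    "LEU": (-0.20, 0.35, 0.43),
    "MET": (-0.24, 0.32, 0.29),
    "ASN": (-0.23, -0.30, -0.19),
    "PRO": (0.39, -1.94, -2.30),
    "GLN": (0.20, 0.22, 0.23),
    "ARG": (-0.11, 0.20, 0.31),
    "SER": (0.02, -0.38, -0.16),
    "THR": (-0.15, -0.27, -0.31),
    "VAL": (-0.32, 0.09, -0.28),
    "TRP": (0.23, 0.05, -0.01),
    "TYR": (-0.13, -0.03, 0.05),
}


def n_term_asymmetry(residue):
    """N-terminal minus central preference (an illustrative measure, not from the table)."""
    n_term, central, _c_term = HELIX_PREF[residue]
    return n_term - central


# Residues that favour the first turn over the helix centre come first.
ranked = sorted(HELIX_PREF, key=n_term_asymmetry, reverse=True)
for res in ranked:
    print(f"{res} {n_term_asymmetry(res):+.2f}")
```

Residue types at the top of the list are relatively better in the first turn than in the middle of the helix; those at the bottom show the reverse pattern.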