Abstract:
Proteins are important building blocks of life and play a vital role by
performing a wide variety of functions inside the cell. The structure of a protein
is an important determinant of its function, and is largely dependent on its amino
acid sequence. Therefore, structure prediction from the sequence can help us design
novel proteins that may be useful in medicine (e.g. therapeutic proteins) as well
as in industry (e.g. antibodies with lower aggregation propensity). Prediction of
protein structures from sequence is a major challenge, and methods for modelling
protein structures require good structure evaluation criteria, both for evaluating
initial models and for refining them further.
In this study, we discuss the development of a novel protein structure evaluation
method that evaluates local regions in structures by comparing them to known
regions in the Protein Data Bank (PDB). It then calculates how well the amino acid
environment of the region being evaluated, and the conformation of its atoms in
3D, are represented in the PDB. We have demonstrated here that the method may
be used to differentiate between local regions from obsolete structures in the
PDB and their refined versions with a high level of confidence. We also compared
proteins from thermophilic and mesophilic organisms and could successfully
differentiate between them approximately 70% of the time. We noted a significant
correlation between our evaluation of the protein structures and their melting
temperatures. Since the method directly compares against known native structures
and evaluates local regions, it may be used for identifying regions that need to be
targeted first for structure refinement.