Experimental techniques for determining the 3D structure of biological macromolecules have progressed significantly in recent years. As a consequence, the number of known 3D structures is increasing steadily, with about 35,000 structures currently available.
However, this still covers only a small fraction of the proteome. It is therefore of major interest to use in silico approaches to build theoretical models of protein structures, which can then be used to study a protein's structure/function relationships and to guide further experimental work.
One class of methods that can generate an atomic structural model of a protein from its amino acid sequence is called homology modeling. This technique is based on the observation that protein tertiary structure is better conserved than amino acid sequence. Consequently, proteins that share significant sequence similarity can also be expected to share significant structural similarity.
The homology modeling procedure can be broken down into several steps. First, template structures are selected. These templates are proteins that share significant sequence similarity with the target protein (ideally more than 30% sequence identity) and for which experimental 3D structures are available. Second, the sequences of the target protein and the templates are aligned. Third, geometric criteria are derived from the sequence alignments and the 3D structures of the templates, and these criteria are used to build a 3D structural model of the target protein. Finally, the structural model is assessed using statistical potentials or physics-based energy calculations.
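
The template selection step can be illustrated with a minimal sketch, assuming that pairwise alignments between the target and each candidate template have already been computed by some alignment tool. The function names `percent_identity` and `select_templates`, the template identifiers, and the toy sequences are hypothetical and only serve the illustration. Percent identity is computed here over columns where both sequences have a residue; other conventions for the denominator exist and give different values.

```python
from typing import Dict, List, Tuple

GAP = "-"


def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over columns where neither sequence has a gap.

    Assumption: both strings come from the same pairwise alignment,
    so they have equal length and use '-' as the gap character.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    aligned_columns = 0
    identical = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == GAP or b == GAP:
            continue  # skip columns with a gap in either sequence
        aligned_columns += 1
        if a == b:
            identical += 1
    if aligned_columns == 0:
        return 0.0
    return 100.0 * identical / aligned_columns


def select_templates(
    alignments: Dict[str, Tuple[str, str]],
    threshold: float = 30.0,
) -> List[Tuple[str, float]]:
    """Keep templates whose identity to the target exceeds the threshold.

    `alignments` maps a (hypothetical) template identifier to the pair
    (aligned target sequence, aligned template sequence). The result is
    sorted from the most to the least similar template.
    """
    scored = [
        (template_id, percent_identity(target, template))
        for template_id, (target, template) in alignments.items()
    ]
    kept = [(tid, pid) for tid, pid in scored if pid > threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy alignments; real sequences would be far longer.
    candidates = {
        "template_A": ("ACDEFGHIK", "ACDEYGHIK"),
        "template_B": ("ACDEFGHIK", "WWWWWWWWW"),
        "template_C": ("ACD-FGHIK", "ACDEFG-IK"),
    }
    for template_id, identity in select_templates(candidates):
        print(f"{template_id}: {identity:.1f}% identity")
```

The 30% default mirrors the rule of thumb quoted above and is not a hard limit. In practice, template choice would also weigh alignment coverage and the quality of the experimental structure, which this sketch ignores.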