Protein structural alignment

Protein structural alignment is a form of alignment that tries to establish equivalences between two or more protein structures based on their three-dimensional fold. In contrast to simple structural superposition, where at least some equivalent residues of the two structures are known, structural alignment requires no a priori knowledge of equivalent positions. Structural alignment is a valuable tool for comparing proteins in the so-called "twilight zone" and "midnight zone" of homology, where relationships between proteins cannot be detected by sequence alignment methods. The method can therefore be used to establish evolutionary relationships between proteins that share little or no common primary structure. This is especially important in light of structural genomics and proteomics projects. The result of a structural alignment of two proteins is a superposition of their atomic coordinate sets that minimizes the root mean square deviation (RMSD) between the equivalent positions of the two structures.


Visualization of Structural Alignment

Figure (image missing): Structural alignment/superimposition schematic by MAMMOTH (Ortiz et al.) of two immunoglobulin fold structures.

How similar are two immunoglobulin structures? Any one of the available structural alignment algorithms (see Packages) can be used to superimpose the two protein structures.

From a structural alignment, you can extract the percent of structural identity (PSI), the structurally implied sequence alignment, the root mean square deviation (RMSD), and an alignment score.

The PSI is calculated by normalizing the number of aligned residues by the length of the shortest structure: PSI = 100 × N / norm, where N is the number of corresponding residues that lie within a Cartesian distance of 4 Å of each other and "norm" is the normalization factor. In the immunoglobulin example, the number of aligned residues within 4 Å is 57 and the norm is 83, so the PSI is 57/83 = 68.67%.
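The arithmetic can be checked with a short Python sketch. The function name `percent_structural_identity` and its arguments are illustrative only and do not come from any of the packages listed in this article; the count of aligned residues is assumed to have been filtered by the 4 Å cutoff already.

```python
def percent_structural_identity(n_aligned: int, norm: int) -> float:
    """Return PSI in percent: aligned residues divided by the normalization length."""
    if norm <= 0:
        raise ValueError("norm must be positive")
    if not 0 <= n_aligned <= norm:
        raise ValueError("n_aligned must lie between 0 and norm")
    return 100.0 * n_aligned / norm


# Immunoglobulin example from the text: 57 aligned residues, norm of 83.
psi = percent_structural_identity(57, 83)
print(f"PSI = {psi:.2f}%")  # PSI = 68.67%
```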

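The structurally implied sequence alignment is a one-dimensional representation of the structural alignment. In the marker row below, each "|" marks a position at which residues of the two structures are structurally aligned (the two sequence rows are not shown):

          ||||||||||||     |||||||||||           |||||                   |||||||||||||||  ||||||||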
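As a rough illustration of how such a marker row could be produced, the Python sketch below marks a column with "|" when both structures have a residue there and the residues lie within a distance cutoff. The function `marker_row`, the gapped sequences, and the distances are all made up for this example; the 4 Å default is borrowed from the PSI definition and is an assumption, not the rule used by any particular program.

```python
from typing import Optional, Sequence


def marker_row(seq_a: str, seq_b: str,
               distances: Sequence[Optional[float]],
               cutoff: float = 4.0) -> str:
    """Build a row with '|' under every structurally equivalent column.

    seq_a and seq_b are gapped sequences of equal length ('-' is a gap);
    distances holds the residue-residue distance per column, or None at gaps.
    """
    if not (len(seq_a) == len(seq_b) == len(distances)):
        raise ValueError("sequences and distances must have the same length")
    row = []
    for a, b, d in zip(seq_a, seq_b, distances):
        paired = a != "-" and b != "-" and d is not None
        row.append("|" if paired and d <= cutoff else " ")
    return "".join(row)


# Toy data, invented for illustration only.
seq_a = "KVTL-SCQA"
seq_b = "EVKLTSC-A"
dists = [5.2, 1.1, 0.9, 1.4, None, 2.0, 3.8, None, 6.5]

print(seq_a)
print(marker_row(seq_a, seq_b, dists))
print(seq_b)
```

Note that the first and last columns stay unmarked even though both residues are present, because their distances exceed the cutoff.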

The RMSD is then calculated from the distances between the corresponding residues in the alignment.
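For N corresponding residue pairs separated by distances d_1 ... d_N, the RMSD is sqrt((d_1^2 + ... + d_N^2) / N). A minimal Python sketch follows; it assumes the two coordinate lists are already superimposed and paired in alignment order, and the function name `rmsd` and the toy coordinates are illustrative only.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root mean square deviation over paired, already superimposed coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for p, q in zip(coords_a, coords_b):
        # Squared Euclidean distance between one pair of equivalent residues.
        total += sum((x - y) ** 2 for x, y in zip(p, q))
    return math.sqrt(total / len(coords_a))


# Toy coordinates (in Å): every pair is exactly 1 Å apart, so the RMSD is 1.0.
a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
b = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
print(rmsd(a, b))
```

The sketch does not search for the best superposition; finding the rotation and translation that minimize this value is a separate step.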


To date, there is no definitive algorithmic solution to protein structural alignment. The alignment problem has been shown to be NP-hard, so all current algorithms employ heuristic methods. As a result, different algorithms may not produce exactly the same results for the same alignment problem.

Representation of structures

Protein structures have to be represented in some coordinate-independent space to make them comparable. One possible representation is the so-called distance matrix, a two-dimensional matrix containing the pairwise distances between all Cα atoms of the protein backbone. This can also be represented as a set of overlapping sub-matrices, each spanning only fragments of the protein. Another possible representation reduces the protein structure to the level of secondary structure elements (SSEs), which can be represented as vectors and can carry additional information about relationships to other SSEs, as well as about certain biophysical properties.
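A minimal Python sketch of the distance matrix representation is given below. The helper names `distance_matrix` and `sub_matrix` and the three toy coordinates are assumptions made for illustration. The last line checks that translating the structure leaves the matrix unchanged, which is what makes the representation coordinate-independent.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def distance_matrix(ca_coords: Sequence[Point]) -> List[List[float]]:
    """Return the symmetric matrix of all pairwise C-alpha distances."""
    n = len(ca_coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2
                              for a, b in zip(ca_coords[i], ca_coords[j])))
            matrix[i][j] = matrix[j][i] = d
    return matrix


def sub_matrix(matrix: List[List[float]], row_start: int,
               col_start: int, size: int) -> List[List[float]]:
    """Extract the size x size block relating two fragments of the chain."""
    return [row[col_start:col_start + size]
            for row in matrix[row_start:row_start + size]]


# Toy C-alpha positions, chosen so that the distances are whole numbers.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
shifted = [(x + 10.0, y - 2.0, z + 5.0) for x, y, z in coords]

dm = distance_matrix(coords)
print(dm[0][1], dm[1][2], dm[0][2])  # 5.0 12.0 13.0
print(distance_matrix(shifted) == dm)  # the translated copy gives the same matrix
```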

Comparison and Optimization

With the distance matrix representation, the comparison algorithm breaks the distance matrices down into regions of overlap (similar sub-matrices), which are then recombined where adjacent fragments overlap, thereby extending the alignment. If the SSE representation is chosen, there are several possibilities. One can search for the largest set of mutually consistent equivalent SSE pairs using algorithms that solve the maximum clique problem from graph theory. Other approaches employ dynamic programming or combinatorial simulated annealing.
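The maximum clique formulation can be sketched in Python as follows. This is a simplified illustration, not the method of any package named in this article: each SSE is reduced to a type ("H" for helix, "E" for strand) and a midpoint, two SSE pairs are called compatible when the distance between the SSEs in one structure matches the distance in the other within a tolerance `tol`, and the largest clique is found with the Bron-Kerbosch algorithm with pivoting. All names, the tolerance, and the toy coordinates are assumptions.

```python
import math
from itertools import combinations
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float, float]
SSE = Tuple[str, Point]   # (type, midpoint): a deliberate simplification
Pair = Tuple[int, int]    # (index in structure A, index in structure B)


def _dist(p: Point, q: Point) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def compatibility_graph(a: List[SSE], b: List[SSE],
                        tol: float = 1.0) -> Dict[Pair, Set[Pair]]:
    """Nodes are same-type SSE pairs; edges join pairs with matching distances."""
    nodes = [(i, j) for i, (ta, _) in enumerate(a)
             for j, (tb, _) in enumerate(b) if ta == tb]
    adj: Dict[Pair, Set[Pair]] = {n: set() for n in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue  # each SSE may be matched at most once
        if abs(_dist(a[i][1], a[k][1]) - _dist(b[j][1], b[l][1])) <= tol:
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))
    return adj


def max_clique(adj: Dict[Pair, Set[Pair]]) -> List[Pair]:
    """Return one largest clique (Bron-Kerbosch with pivoting)."""
    best: List[Pair] = []

    def expand(r: Set[Pair], p: Set[Pair], x: Set[Pair]) -> None:
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = sorted(r)
            return
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return best


# Toy structures: B is A translated by (1, 1, 1) plus one extra helix.
struct_a = [("H", (0.0, 0.0, 0.0)), ("E", (10.0, 0.0, 0.0)), ("E", (10.0, 12.0, 0.0))]
struct_b = [("H", (1.0, 1.0, 1.0)), ("E", (11.0, 1.0, 1.0)),
            ("E", (11.0, 13.0, 1.0)), ("H", (40.0, 1.0, 1.0))]

print(max_clique(compatibility_graph(struct_a, struct_b)))
```

For this toy input the largest clique pairs the three SSEs of A with the first three SSEs of B and leaves the extra helix unmatched. Because only distances are compared, the sketch cannot distinguish a structure from its mirror image; real programs use richer SSE descriptions such as vectors with direction.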


Packages

Several tools for pairwise and multiple structural alignments are available on the web:

NAME | Description | Class | Type | Link | Author | Year
MAMMOTH | MAtching Molecular Models Obtained from Theory | Cα | Pair | server | AR. Ortiz | 2002
CE/CE-MC | Combinatorial Extension -- Monte Carlo | Cα | Multi | server | I. Shindyalov | 2000
DaliLite | Distance Matrix Alignment | Contact Map | Pair | server | L. Holm | 1993
VAST | Vector Alignment Search Tool | SSE | Pair | server | S. Bryant | 1996
PrISM | Protein Informatics Systems for Modeling | SSE | Multi | server | B. Honig | 2000
SSAP | Sequential Structure Alignment Program | SSE | Multi | server | C. Orengo | 1989
SARF2 | Spatial Arrangements of Backbone Fragments | SSE | Pair | server | D. Fischer | 1996
KENOBI/K2 | NA | SSE | Pair | server | Z. Weng | 2000
STAMP | STructural Alignment of Multiple Proteins | Sequence | Pair | server | G. Barton | 1992
MASS | Multiple Alignment by Secondary Structure | SSE | Multi | server | R. Nussinov | 2003
MALECON | NA | Geometry | Multi | NA | S. Wodak | 2004
MultiProt | NA | Geometry | Multi | server | R. Nussinov | 2004
SCALI | Structural Core ALIgnment of proteins | Sequence | Pair | server | C. Bystroff | 2004
DEJAVU | NA | SSE | Pair | server | GJ. Kleywegt | 1997
SSM | Secondary Structure Matching | SSE/Cα | Pair | NA | E. Krissinel | 2003
SHEBA | Structural Homology by Environment-Based Alignment | Sequence | Pair | server | B. Lee | 2000
LGA | Local-Global Alignment | Sequence | Pair | server | A. Zemla | 2003
POSA | Partial Order Structure Alignment | Cα | Multi | server | A. Godzik | 2005
Matras | MArkovian TRAnsition of protein Structure | Cα & SSE | Pair | NA | K. Nishikawa | 2000
MAMMOTH-mult | MAMMOTH-multiple structure alignment | Cα | Multi | server | D. Lupyan | 2005

Key map:

  • Cα -- Backbone Atom (Cα) Alignment;
  • SSE -- Secondary Structure Elements Alignment;
  • Pair -- Pairwise Alignment (2 structures *only*);
  • Multi -- Multiple Structure Alignment (MStA);

References


  • Bourne, P.E. & Shindyalov, I.N. (2003): Structure Comparison and Alignment. In: Bourne, P.E., Weissig, H. (Eds): Structural Bioinformatics. Hoboken NJ: Wiley-Liss. ISBN 0-471-20200-2
  • Olmea O, Straus CE, Ortiz AR. (2002) MAMMOTH (matching molecular models obtained from theory): an automated method for model comparison. Protein Sci 11, 2606-21
  • Yuan X, Bystroff C. (2004) "Non-sequential Structure-based Alignments Reveal Topology-independent Core Packing Arrangements in Proteins", Bioinformatics, Nov 5, 2004
  • E. Krissinel and K. Henrick, Protein structure comparison in 3D based on secondary structure matching (SSM) followed by C-alpha alignment, scored by a new structural similarity function. In: A.J. Kungl and P.J. Kungl, Editors, Proceedings of the Fifth International Conference on Molecular Structural Biology, Vienna, September 3-7 (2003), p. 88.
  • Jung, J. and Lee, B.: Protein structure alignment using environmental profiles. Protein Engineering, 13:535-543, 2000.
  • Zemla A., "LGA - a Method for Finding 3D Similarities in Protein Structures", Nucleic Acids Research, 2003, Vol. 31, No. 13, pp. 3370-3374.
  • Y. Ye, A. Godzik, "Multiple flexible structure alignment using partial order graphs", Bioinformatics, 2005.
  • T. Kawabata, K. Nishikawa, "Protein structure comparison using the Markov transition model of evolution", Proteins, 41, 1, pp. 108-122.
  • D. Lupyan, A. Leo-Macias, AR. Ortiz, "A new progressive-iterative algorithm for multiple structure alignment", Bioinformatics, 2005, Epub Jun 7th.
