When multiple sequences are submitted, each sequence must be in FASTA format: a single-line description followed by lines of sequence data. The first character of the description line is a ">" symbol in the first column. For example:
When two sequences are submitted, both must be in FASTA format. For example:
When a single sequence is submitted, it must be in FASTA format. For example:
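As an illustration of the FASTA layout described above, the following Python sketch splits FASTA text into description and sequence pairs. The function name `parse_fasta` and the two records are made-up placeholders, not real proteins and not the server's own parser.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (description, sequence) pairs."""
    records = []
    description = None
    chunks = []
    for raw_line in text.splitlines():
        line = raw_line.rstrip()
        if not line.strip():
            continue  # skip blank lines
        if line.startswith(">"):
            # A ">" in the first column starts a new record and
            # closes the previous one, if any.
            if description is not None:
                records.append((description, "".join(chunks)))
            description = line[1:].strip()
            chunks = []
        elif description is None:
            raise ValueError("sequence data found before a '>' description line")
        else:
            # Sequence data may span several lines; join them later.
            chunks.append(line.strip())
    if description is not None:
        records.append((description, "".join(chunks)))
    return records


# Hypothetical input: two short placeholder records.
example = """>query_A hypothetical protein A
MKTAYIAKQR
QISFVKSHFS
>query_B hypothetical protein B
GSHMLEDPVA
"""

records = parse_fasta(example)
```

Each record keeps its description without the leading ">" and its sequence joined into one string, so the same function covers the one-sequence, two-sequence, and multi-sequence inputs.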
A PDB (Protein Data Bank) identifier is the 4-character code assigned by the PDB.
Users must enter a PDB code for an entry that was available in the PDB before Dec 25, 2009;
otherwise, they need to upload a protein 3D structure file in PDB format.
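A format check on the identifier can catch typing errors before submission. The sketch below assumes the classic layout of a PDB code, one digit followed by three letters or digits; the pattern and the name `looks_like_pdb_id` are illustrative assumptions. It checks the shape of the code only and cannot tell whether the entry exists or was released before the cutoff date.

```python
import re

# Assumption: a classic PDB code is a digit followed by three
# alphanumeric characters. This is a format check only.
PDB_ID_PATTERN = re.compile(r"[0-9][A-Za-z0-9]{3}")


def looks_like_pdb_id(code):
    """Return True if the text has the shape of a 4-character PDB code."""
    return PDB_ID_PATTERN.fullmatch(code.strip()) is not None


# "1abc" is a placeholder, not a reference to a specific entry.
ok = looks_like_pdb_id("1abc")
too_long = looks_like_pdb_id("1abcd")
```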
The E-value measures the statistical significance of an alignment and so indicates how reliable a search result is.
This setting is the threshold for reporting protein sequences matched against the sequence database.
Following previous work (Matthews et al., 2001;
Yu et al., 2004), we use 10^-10 as the default value.
Operationally, homologous proteins are defined as those with a BLAST E-value of at most 10^-10.
If the E-value is greater than 10^-10, the match is not reported.
A lower E-value threshold is more stringent, so fewer matches are reported.
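The thresholding rule can be sketched in a few lines of Python. The hit names and E-values below are invented for illustration, and `filter_hits` is a hypothetical helper, not part of BLAST or of this server.

```python
E_VALUE_THRESHOLD = 1e-10  # default threshold stated in the text


def filter_hits(hits, threshold=E_VALUE_THRESHOLD):
    """Keep (name, e_value) hits whose E-value is at most the threshold."""
    return [hit for hit in hits if hit[1] <= threshold]


# Invented example hits as (name, E-value) pairs.
hits = [("hit_1", 3e-42), ("hit_2", 1e-10), ("hit_3", 5e-6)]

reported = filter_hits(hits)                # default threshold
strict = filter_hits(hits, threshold=1e-30) # lower, more stringent threshold
```

With the default threshold, `hit_3` is dropped because its E-value exceeds 10^-10; lowering the threshold to 10^-30 also drops `hit_2`, which shows why a lower threshold reports fewer matches.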
The residues of the homologous proteins in each protein complex are modeled with a binomial distribution.
For a random set of residues in a homologous protein, the binomial distribution approaches a normal distribution.
The interacting residues can therefore be estimated from the Z-score of the homologous proteins.
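The normal approximation to a binomial count gives a Z-score of the form (k - n*p) / sqrt(n*p*(1 - p)). The sketch below shows only this generic calculation; what the server counts as a trial or a success, and how it chooses p, are not stated here, so the parameter names and the numbers are assumptions.

```python
import math


def binomial_z_score(successes, trials, p):
    """Z-score of a binomial count under the normal approximation.

    successes: observed count k (assumed meaning: matching residues)
    trials:    number of residues considered, n
    p:         per-residue success probability under the random model
    """
    if trials <= 0 or not 0.0 < p < 1.0:
        raise ValueError("need trials > 0 and 0 < p < 1")
    mean = trials * p
    std = math.sqrt(trials * p * (1.0 - p))
    return (successes - mean) / std


# Illustrative numbers only: mean is 50, standard deviation is 5.
z = binomial_z_score(successes=60, trials=100, p=0.5)
```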
The joint Z-score is a quantitative measure of the similarity between two protein complexes.
We define the joint complex similarity as
where Z1 denotes the Z-score of protein 1 and its homolog 1', Zn denotes the Z-score of protein n and its homolog n', and so on.
Homologous Complexes in Each Species (Ranked by Joint Z-score)
The server returns all homologous complexes for each species, ranked by the joint Z-score of each homologous complex.
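The grouping and ranking of results can be sketched as follows. The sketch takes joint Z-scores as already computed and assumes that a higher joint Z-score ranks first; the species labels, complex identifiers, and scores are placeholders, and `rank_by_species` is a hypothetical helper.

```python
from collections import defaultdict


def rank_by_species(complexes):
    """Group (species, complex_id, joint_z) rows by species and sort
    each group by joint Z-score, highest first (assumed ranking order)."""
    grouped = defaultdict(list)
    for species, complex_id, joint_z in complexes:
        grouped[species].append((complex_id, joint_z))
    return {
        species: sorted(items, key=lambda item: item[1], reverse=True)
        for species, items in grouped.items()
    }


# Placeholder rows: (species, complex identifier, joint Z-score).
rows = [
    ("species_A", "complex_1", 4.2),
    ("species_B", "complex_2", 7.5),
    ("species_A", "complex_3", 9.1),
]

ranking = rank_by_species(rows)
```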