Scores for assessing predicted structures

Scores, so many scores

It’s worth recapping the main scores that AlphaFold (and related structure prediction tools) use to assess confidence, since the same scores are used for generating and filtering de novo binder designs.

pLDDT - predicted “local distance test”

  • based on the lDDT-Cα score for comparing local Cα distances between predicted and experimental structures
  • ‘Real’ lDDT-Cα scores are the fraction of local Cα-Cα distances that are preserved in the prediction, averaged over four tolerance thresholds of 0.5 Å, 1 Å, 2 Å, and 4 Å
  • AlphaFold2 was trained to predict lDDT-Cα as a measure of local confidence, using real lDDT-Cα values computed between its predictions and ground truth experimental structures.
Note: pLDDT Interpretation

Range: 0 → 100 (or sometimes expressed as 0.0 → 1.0).

Higher is more confident.
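
As a rough illustration of the ‘real’ lDDT-Cα calculation described above, the following Python/NumPy sketch scores each residue by the fraction of its local Cα-Cα distances that are preserved within each threshold. It is a simplified sketch, not the reference implementation: the function name `lddt_ca` and its arguments are hypothetical, and the 15 Å inclusion radius is an assumption taken from the usual lDDT definition rather than from the text above. Scores come out on the 0.0 → 1.0 scale; multiply by 100 for the 0 → 100 scale.

```python
import numpy as np

def lddt_ca(ref_ca, pred_ca, cutoff=15.0, thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Per-residue lDDT-Ca sketch (hypothetical helper, values in 0.0-1.0).

    ref_ca, pred_ca: (N, 3) arrays of C-alpha coordinates for the same residues.
    cutoff: inclusion radius in Angstrom (assumed default of 15).
    """
    ref_ca = np.asarray(ref_ca, dtype=float)
    pred_ca = np.asarray(pred_ca, dtype=float)
    n = len(ref_ca)

    # All-vs-all C-alpha distance matrices; no superposition is needed.
    d_ref = np.linalg.norm(ref_ca[:, None] - ref_ca[None, :], axis=-1)
    d_pred = np.linalg.norm(pred_ca[:, None] - pred_ca[None, :], axis=-1)

    # Only score pairs that are local in the reference, excluding self-pairs.
    mask = (d_ref < cutoff) & ~np.eye(n, dtype=bool)

    # For each pair, the fraction of the four thresholds it satisfies.
    diff = np.abs(d_ref - d_pred)
    preserved = np.mean([diff < t for t in thresholds], axis=0)

    # Average over each residue's local neighbours (0.0 if it has none).
    counts = mask.sum(axis=1)
    totals = (preserved * mask).sum(axis=1)
    return np.where(counts > 0, totals / np.maximum(counts, 1), 0.0)
```

Because only internal distances are compared, the score does not depend on how the two structures are superimposed, which is what makes it a local measure.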

pAE - predicted aligned error

  • the confidence in the relative position of each residue pair in the predicted structure, measured in Ångströms (Å)
  • the expected positional error at residue X, if the predicted and ground truth structures were aligned on residue Y.
  • Usually visualized as a 2D heatmap of the predicted aligned error for each residue pair.
Note: pAE Interpretation

Range: 0.0 → ~30 Å (Ångströms)

Not a single value, but often averaged over groups of residue pairs (intra or inter-chain).

Lower suggests a more accurate prediction.

pae_interaction - inter-chain predicted aligned error

  • sometimes called iPAE, or i_pAE
  • The average PAE score for inter-chain residue pairs in the PAE matrix, between target and binder.
  • E.g., the average pAE over every residue pair that is not an intra-chain (intra-binder or intra-target) pair.
  • a measure of confidence in the relative arrangement of the target and binder.
  • the ‘other’ iPAE
    • others have used the name iPAE defined as the median PAE of interface residues (where interface residues are those < 3.5 Å from the binding partner). The tools in the workshop don’t use this score.
Note: pae_interaction Interpretation

Range: 0.0 → ~30 Å (Ångströms), the same scale as the pAE values it averages.

Lower is more confident.

[Figure: PAE matrix for pae_interaction. Rows and columns are residue indices (starting at 0), ordered binder then target. The diagonal blocks are the intra-binder PAE and intra-target PAE; the two off-diagonal blocks are pae_interaction1 and pae_interaction2.]
pae_interaction = mean(pae_interaction1, pae_interaction2)
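
The block averaging shown in the figure can be sketched in a few lines of Python/NumPy. This is a minimal sketch under stated assumptions: the binder residues come first in the PAE matrix (as in the figure), and the function name `pae_interaction` and the `binder_len` argument are hypothetical rather than taken from any particular tool.

```python
import numpy as np

def pae_interaction(pae, binder_len):
    """Mean inter-chain PAE (Angstrom) for a binder-then-target PAE matrix.

    pae: (N, N) predicted aligned error matrix.
    binder_len: number of binder residues (assumed to occupy the first rows/columns).
    """
    pae = np.asarray(pae, dtype=float)

    # Off-diagonal blocks: binder rows vs target columns, and target rows vs binder columns.
    pae_interaction1 = pae[:binder_len, binder_len:].mean()
    pae_interaction2 = pae[binder_len:, :binder_len].mean()

    # Average of the two inter-chain blocks; the intra-chain blocks are ignored.
    return float((pae_interaction1 + pae_interaction2) / 2.0)
```

The same slicing with `pae[:binder_len, :binder_len]` or `pae[binder_len:, binder_len:]` would give the intra-binder or intra-target average instead.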


pTM - predicted template modelling (TM) score

  • an overall measure of structure accuracy: AlphaFold2 (ptm) models and AlphaFold-Multimer models are trained to predict the TM-score between the predicted and experimental structures.

  • the pTM score is derived from the values in the pAE matrix

  • The original non-predicted TM-score is based on the C\(\alpha\) distances between the predicted and experimental structures, with poorly superimposing C\(\alpha\) atoms downweighted, and a scaling factor to normalise for the length of the protein (for the best superimposition, that maximises the TM-score).

  • TM-score is less sensitive to outliers than RMSD (flexible loops and tails have a smaller impact), and independent of protein length

Note: pTM Interpretation

Range: 0.0 → 1.0

Higher is more confident

  • \(< \sim 0.17\) is completely random

  • \(> 0.5\) is likely the same fold

  • \(1.0\) is an identical structure
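
To make the downweighting and length normalisation concrete, here is a small Python/NumPy sketch of the (non-predicted) TM-score for a single, already chosen superposition. It makes several assumptions: the scaling distance \(d_0 = 1.24 (L - 15)^{1/3} - 1.8\) comes from the standard TM-score definition rather than from the text above, the clamp of the length at 19 is only there to keep \(d_0\) positive for short chains, and the function names are hypothetical. It does not search for the superposition that maximises the score, so it gives a lower bound on the true TM-score.

```python
import numpy as np

def tm_d0(length):
    """Length-dependent scaling distance d0 (Angstrom); length clamped at 19 (assumption)."""
    clipped = max(int(length), 19)
    return 1.24 * (clipped - 15) ** (1.0 / 3.0) - 1.8

def tm_score_from_distances(distances, length=None):
    """TM-score sketch for one fixed superposition.

    distances: per-residue C-alpha distances (Angstrom) between the superimposed structures.
    length: normalisation length; defaults to the number of distances given.
    """
    d = np.asarray(distances, dtype=float)
    length = len(d) if length is None else int(length)
    d0 = tm_d0(length)

    # Each residue contributes between 0 and 1; large deviations are downweighted.
    per_residue = 1.0 / (1.0 + (d / d0) ** 2)
    return float(per_residue.sum() / length)
```

A residue that deviates by exactly \(d_0\) contributes 0.5, while one that deviates by many times \(d_0\) contributes almost nothing, which is why flexible loops and tails affect TM-score much less than RMSD.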

ipTM - interface predicted template modelling (TM) score

  • like the pTM score, but considers only pairs of residues between chains, not within chains (inter-chain pairs only)

  • a measure of confidence in the relative arrangement of the target and binder.

  • a variant, ipSAE, proposed by Roland Dunbrack, is calculated in a similar way to ipTM but only considers the most confident pAE pairs in the calculation (e.g. pAE < 10). This reduces the impact of low confidence ‘disordered’ regions on the score.

Note: ipTM Interpretation

Range: 0.0 → 1.0

Higher is more confident

  • \(> 0.6\), might be a real complex

  • \(> 0.8\), probably a real complex
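
The relationship between pTM and ipTM can be sketched as follows. This is an illustrative Python/NumPy sketch only, loosely modelled on how such scores are commonly computed from a binned aligned-error distribution; it is not the code of any specific tool. The inputs `probs` (per-pair probabilities over error bins), `bin_centers` (bin midpoints in Å) and `chain_ids` are hypothetical names, and the \(d_0\) formula with its clamp at 19 residues is an assumption carried over from the TM-score definition.

```python
import numpy as np

def predicted_tm(probs, bin_centers, chain_ids=None, interface=False):
    """Sketch of pTM (interface=False) or ipTM (interface=True).

    probs: (N, N, B) probabilities over B aligned-error bins for each residue pair.
    bin_centers: (B,) bin midpoints in Angstrom.
    chain_ids: (N,) chain label per residue; required when interface=True.
    """
    probs = np.asarray(probs, dtype=float)
    bin_centers = np.asarray(bin_centers, dtype=float)
    n = probs.shape[0]

    # TM-score scaling distance, with the length clamped to keep d0 positive.
    d0 = 1.24 * (max(n, 19) - 15) ** (1.0 / 3.0) - 1.8

    # Expected TM contribution of each residue pair under its error distribution.
    tm_per_bin = 1.0 / (1.0 + (bin_centers / d0) ** 2)
    expected_tm = probs @ tm_per_bin  # shape (N, N)

    # pTM uses every pair; ipTM keeps only pairs from different chains.
    pair_mask = np.ones((n, n), dtype=float)
    if interface:
        if chain_ids is None:
            raise ValueError("chain_ids is required when interface=True")
        chain_ids = np.asarray(chain_ids)
        pair_mask = (chain_ids[:, None] != chain_ids[None, :]).astype(float)

    # Average over the kept pairs for each alignment residue, then take the best one.
    weights = pair_mask / (1e-8 + pair_mask.sum(axis=1, keepdims=True))
    per_alignment = (expected_tm * weights).sum(axis=1)
    return float(per_alignment.max())
```

With confident intra-chain pairs but uncertain inter-chain pairs, this gives a moderate pTM and a very low ipTM, which is the situation ipTM is meant to flag for binder designs.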

C\(\alpha\)-RMSD - root mean square deviation of C\(\alpha\) atoms

  • calculated between two superimposed structures - the square root of the mean squared distance (Å) between all selected C\(\alpha\) atom pairs in the two structures

  • can be sensitive to outliers (flexible loops and tails), but often outliers are excluded from the calculation (eg ChimeraX ‘matchmaker’ reports RMSD values for ‘pruned’ and ‘all’ pairs of atoms)

  • depends non-linearly on protein length (longer proteins will tend to have higher RMSD values)

Note: RMSD (C\(\alpha\)) Interpretation

Range: 0.0 Å → no upper bound

Lower means the two structures are more similar
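
For reference, a C\(\alpha\)-RMSD after optimal superposition can be sketched with the Kabsch algorithm in Python/NumPy. This is a minimal sketch, assuming both inputs contain the same C\(\alpha\) atoms in the same order; it uses all pairs and does no ‘pruning’ of outliers, so it will generally not match a pruned value reported by a viewer. The function name `kabsch_rmsd` is hypothetical.

```python
import numpy as np

def kabsch_rmsd(coords_a, coords_b):
    """C-alpha RMSD (Angstrom) after optimal rigid-body superposition of A onto B.

    coords_a, coords_b: (N, 3) arrays of matched C-alpha coordinates.
    """
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)

    # Remove translation by centring both coordinate sets on their centroids.
    a_c = a - a.mean(axis=0)
    b_c = b - b.mean(axis=0)

    # Kabsch: SVD of the covariance matrix gives the optimal rotation.
    h = a_c.T @ b_c
    u, _, vt = np.linalg.svd(h)

    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Rotate A onto B and take the root of the mean squared deviation.
    a_rot = a_c @ rot.T
    return float(np.sqrt(np.mean(np.sum((a_rot - b_c) ** 2, axis=1))))
```

Because every squared deviation enters the mean with equal weight, a single badly placed loop or tail can dominate the value, which is the outlier sensitivity noted above.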


Tip: Predicting affinity

To date, none of these scores (or others) have been found to strongly associate with binding affinity in the high affinity ranges (<<10 μM) that we are usually interested in.

Many do tend to associate with boolean binding/non-binding (eg predicting less than or greater than 10 μM affinity), so they are still useful for guiding and filtering designs in silico.
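
As an illustration of using these scores as a binding/non-binding filter rather than as an affinity predictor, here is a small Python sketch. Everything in it is hypothetical: the `DesignScores` record, the `passes_filter` function and the field names are invented for the example. The only default threshold, ipTM > 0.8, is taken from the interpretation note above; the pae_interaction and pLDDT cutoffs are left for the caller to choose.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DesignScores:
    """Hypothetical record of per-design scores."""
    name: str
    iptm: float             # 0.0 to 1.0, higher is more confident
    pae_interaction: float  # Angstrom, lower is more confident
    binder_plddt: float     # 0 to 100, higher is more confident

def passes_filter(
    scores: DesignScores,
    min_iptm: float = 0.8,
    max_pae_interaction: Optional[float] = None,
    min_plddt: Optional[float] = None,
) -> bool:
    """Return True if a design clears every threshold that was supplied."""
    if scores.iptm <= min_iptm:
        return False
    if max_pae_interaction is not None and scores.pae_interaction > max_pae_interaction:
        return False
    if min_plddt is not None and scores.binder_plddt < min_plddt:
        return False
    return True
```

A filter like this only sorts designs into likely binders and likely non-binders; it should not be read as ranking the survivors by expected affinity.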

Resources