Experimental Validation

So, you’ve designed some de novo binders. You’ve filtered them using in silico scores and other criteria of your choosing, and you now have a ranked set of ~96 sequences.

Gene synthesis, protein purification and experimental assays

What next? You’ll need to consider:

  • Affinity tag placement
    • Most likely you’ll need an affinity tag. Which terminus should it be on for each design?
  • Padding for gene synthesis
    • Depending on the service provider, you may need to add non-coding padding sequence to your construct (e.g. Twist might require a minimum insert size of 300 bp).
  • Expression system
    • What vector system do you need to use to produce your protein for assay?
    • Will you clone the fragment library yourself, or will the service provider do this for you?
    • What vectors are available from the service provider?
  • How will you express and purify ~96 proteins?
  • How will you assess expression level and solubility?
  • What level of purity, concentration and buffer conditions do you need for your assay?

Many of these considerations will be familiar to experimental structural biologists, but it’s worth considering the practicalities of how you will deal with scaling up from a handful of constructs to ~96 or more.
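
The tag placement and padding questions above lend themselves to a small script once you are handling ~96 constructs. The following is a minimal sketch only: the His6 tag, the GSG linker, the function names and the 300 bp minimum are illustrative assumptions, not requirements of any particular provider or vector, so check your provider's actual limits before ordering.

```python
# Illustrative assumptions (not from any provider's specification):
HIS_TAG = "HHHHHH"      # hypothetical His6 affinity tag
LINKER = "GSG"          # hypothetical short flexible linker
MIN_INSERT_BP = 300     # example minimum insert size; confirm with your provider


def add_tag(protein: str, terminus: str) -> str:
    """Return the design with a tag and linker on the chosen terminus ('N' or 'C')."""
    if terminus == "N":
        # Keep a single start Met in front of the tag.
        body = protein[1:] if protein.startswith("M") else protein
        return "M" + HIS_TAG + LINKER + body
    if terminus == "C":
        return protein + LINKER + HIS_TAG
    raise ValueError("terminus must be 'N' or 'C'")


def padding_needed(protein: str, min_bp: int = MIN_INSERT_BP) -> int:
    """Base pairs of padding needed for the coding sequence to reach min_bp."""
    coding_bp = 3 * (len(protein) + 1)  # three bases per residue, plus a stop codon
    return max(0, min_bp - coding_bp)


if __name__ == "__main__":
    design = "M" + "A" * 59  # placeholder 60-residue sequence, not a real binder
    for terminus in ("N", "C"):
        tagged = add_tag(design, terminus)
        print(terminus, len(tagged), "aa; padding needed:", padding_needed(tagged), "bp")
```

Running the same two functions over every design in the ranked set gives a quick table of which constructs fall below the minimum insert size and by how much, before any codon optimisation or vector-specific flanking sequence is added.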

Expected experimental ‘hit’ rates

How do we define successful binding? A typical definition is Kd <= 10 µM.

RFDiffusion, with published in silico scoring thresholds, has a reported success rate of between 7% and 35% (depending on the target).

BindCraft has a published success rate of between 10% and 100% (depending on the target).
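
To get a feel for what these rates mean for a plate of ~96 designs, here is a rough sketch of the arithmetic. It assumes each design succeeds independently with the same probability, and that a published rate transfers to your target. Both are simplifying assumptions, and the function names are hypothetical.

```python
def expected_hits(n_designs: int, hit_rate: float) -> float:
    """Expected number of binders if each design succeeds with probability hit_rate."""
    return n_designs * hit_rate


def prob_at_least_one_hit(n_designs: int, hit_rate: float) -> float:
    """Probability of at least one binder, assuming independent designs."""
    return 1.0 - (1.0 - hit_rate) ** n_designs


if __name__ == "__main__":
    n = 96
    # Lower and upper ends of the ranges quoted above.
    for label, rate in [("7%", 0.07), ("10%", 0.10), ("35%", 0.35)]:
        print(
            f"hit rate {label}: expect ~{expected_hits(n, rate):.1f} hits, "
            f"P(at least one) = {prob_at_least_one_hit(n, rate):.4f}"
        )
```

Because the quoted rates vary strongly with the target, treat the output as a planning aid for how many designs to order rather than a prediction.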