Hardware and software requirements

Or, “how can I run this at home?”

Compute resources

What kind of hardware do I need?

  • GPU VRAM is the limiting factor: it puts an upper bound on the size of the target+binder you can work with.

    • The larger the target+binder, the more VRAM you will need.
    • AlphaFold2, used in both the RFdiffusion protocol and BindCraft, is usually the most VRAM-intensive step.
    • For BindCraft:
      • 16 GB VRAM (T4): ~250 residues
      • 32 GB VRAM: ~550 residues
      • 80 GB VRAM (A100-80G): ~950 residues
  • Adding CPUs and RAM has minimal impact on runtime or on the ability to tackle larger complexes, since most operations are GPU- and VRAM-bound and not heavily multi-threaded on the CPU (FastRelax may be an exception).

  • Storage: as a guide, our full nf-binder-design runs (including intermediate ‘work’ files) use:

    • BindCraft, 400 trajectories: 15 GB
    • RFdiffusion, 500 trajectories: 14 GB
  • A local workstation with a single GPU can be useful for small test runs and for tuning parameters.

  • In practice, running in parallel on HPC or in the cloud (e.g. Azure/AWS/GCS Batch) may be required to finish a typical production run in a reasonable time.
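
As a rough illustration, the BindCraft VRAM figures above can be turned into a quick pre-flight check. This is a minimal sketch, not part of BindCraft or nf-binder-design: the function names are hypothetical, and it assumes the quoted limits apply to the combined target+binder length. It rounds down to the nearest quoted VRAM tier rather than interpolating, so it errs on the conservative side.

```python
# Approximate BindCraft capacity figures quoted in the list above:
# (VRAM in GB, approximate maximum target+binder residues).
BINDCRAFT_VRAM_GUIDE = [(16, 250), (32, 550), (80, 950)]


def approx_max_residues(vram_gb):
    """Return the residue limit of the largest quoted tier that fits in vram_gb.

    Returns None when the GPU has less VRAM than the smallest quoted tier.
    This is a conservative step lookup, not an interpolation.
    """
    best = None
    for tier_gb, max_residues in BINDCRAFT_VRAM_GUIDE:
        if vram_gb >= tier_gb:
            best = max_residues
    return best


def likely_fits(target_residues, binder_residues, vram_gb):
    """Rough check (assumption: the guide applies to target + binder combined)."""
    limit = approx_max_residues(vram_gb)
    return limit is not None and (target_residues + binder_residues) <= limit


if __name__ == "__main__":
    # A 24 GB card falls back to the 16 GB tier.
    print(approx_max_residues(24))
    # A 300-residue target with an 80-residue binder on a 32 GB card.
    print(likely_fits(300, 80, 32))
```

Treat the result as a first guess only: actual memory use depends on the tool versions and settings, so a small test run remains the reliable check.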

Software

Licensing

The core components of these tools are available under permissive open source licenses and can be used freely for non-commercial work. However, note that PyRosetta, a component of both RFdiffusion and BindCraft, requires a license from the University of Washington for commercial use.

Installation

Warning

For this workshop, you don’t need to install anything. We’ve set things up to run using HPC modules (using Apptainer containers under the hood).

Installing these packages typically takes hours, and troubleshooting research software installation is a topic for an entirely different workshop.

You’ll want a Linux-based operating system. You do not necessarily need admin / sudo permissions.

Methods, roughly in order of preference:

1. Engage your local HPC or research cloud admins to help create a shared installation everyone can use.

They are often experienced in troubleshooting installation and dependency issues that can be specific to the systems they maintain.

2. Use a container

Apptainer (formerly known as Singularity) simplifies deployment of software by bundling up the software and its dependencies, including a simple Linux system image, into a single container.

Docker is a popular alternative, but it usually isn’t available on HPC systems due to security constraints.

Apptainer can run Docker images, so you rarely need Docker to run research software.

All the container images used in this workshop are available here and the Dockerfile recipes used to build them are available here.
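
To make the container route concrete, here is a minimal sketch of how a tool might be launched through Apptainer from Python. It only builds the command line and does not run anything. The helper name and the image URI are hypothetical placeholders, not the workshop's actual images; `apptainer exec`, the `--nv` flag (which exposes the host's NVIDIA GPU inside the container) and `docker://` image URIs are standard Apptainer features.

```python
import shlex


def apptainer_exec_command(image, command, use_gpu=True):
    """Build (but do not run) an `apptainer exec` argument list.

    `image` may be a local .sif file or a docker:// URI, since Apptainer
    can pull and convert Docker images. `--nv` makes the host NVIDIA GPU
    and driver libraries available inside the container.
    """
    args = ["apptainer", "exec"]
    if use_gpu:
        args.append("--nv")
    args.append(image)
    args.extend(command)
    return args


if __name__ == "__main__":
    # Hypothetical image name, for illustration only.
    cmd = apptainer_exec_command(
        "docker://example.org/some-lab/bindcraft:latest",
        ["nvidia-smi"],
    )
    # Print a copy-pasteable shell command.
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where Apptainer is installed; running `nvidia-smi` inside the container is a quick way to confirm the GPU is visible before starting a long job.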

3. Use a pipeline

A Nextflow pipeline like nf-binder-design can help manage running all the steps in an RFdiffusion - ProteinMPNN - AlphaFold2 (or BindCraft) workflow.

The best pipelines typically use publicly available Apptainer containers by default, so you don’t need to think about installing each of the tools yourself.

4. Install yourself

Follow the instructions at https://github.com/RosettaCommons/RFdiffusion and https://github.com/martinpacesa/BindCraft. The installation steps are well documented, but don’t always work perfectly in every environment without some troubleshooting. Search the GitHub issues (open and closed) and forks, which often contain clues for fixing common dependency issues.

At Monash

The HPC modules on the M3 HPC cluster used in this workshop will be available for everyone shortly after the workshop.

nf-binder-design has a configuration profile for the M3 HPC cluster (use -profile slurm,m3 or -profile slurm,m3_bdi)
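
For illustration, the profile selection above could be assembled into a Nextflow command line as follows. This is a sketch under stated assumptions: the helper name is hypothetical, the pipeline location is a placeholder you must supply yourself, and any pipeline-specific parameters are omitted. Only `nextflow run` and its comma-separated `-profile` option are standard Nextflow usage.

```python
def nextflow_run_command(pipeline, profiles):
    """Build (but do not run) a `nextflow run` argument list.

    Nextflow accepts several profiles as a single comma-separated value,
    e.g. `-profile slurm,m3`.
    """
    if not profiles:
        raise ValueError("at least one profile is required")
    return ["nextflow", "run", pipeline, "-profile", ",".join(profiles)]


if __name__ == "__main__":
    # PIPELINE is a placeholder: substitute the path or repository of the
    # nf-binder-design workflow you intend to run.
    print(" ".join(nextflow_run_command("PIPELINE", ["slurm", "m3"])))
```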

At Unimelb

TODO: Spartan ?

At WEHI

TODO: Milton ?