RELION

Publisher

AMD

Built By

AMD

Multi-GPU Support

true

Description

RELION (REgularized LIkelihood OptimizatioN) implements an empirical Bayesian approach to the analysis of electron cryo-microscopy (cryo-EM) data.

Overview

RELION (REgularized LIkelihood OptimizatioN) implements an empirical Bayesian approach to the analysis of electron cryo-microscopy (cryo-EM) data. Specifically, RELION provides methods for refining single or multiple 3D reconstructions as well as 2D class averages. RELION is an important tool in the study of living cells.

RELION comprises multiple steps that cover the entire single-particle analysis workflow. Steps include beam-induced motion correction, CTF estimation, automated particle picking, particle extraction, 2D class averaging, 3D classification, and high-resolution refinement in 3D. RELION can process movies generated from direct-electron detectors, apply final map sharpening, and perform local-resolution estimation. More information can be obtained from the official documentation.

Single-Node Server Requirements

CPUs	GPUs	Operating Systems	ROCm™ Driver	Container Runtimes
x86-64 CPU(s)	AMD Instinct™ MI200 GPU(s), AMD Instinct™ MI100 GPU(s)	Ubuntu 20.04, Red Hat 8	ROCm v5.x compatibility	Docker Engine, Singularity

The RELION container assumes that the server contains the required x86-64 CPU(s) and at least one of the listed AMD GPUs. Also, the server must have one of the required operating systems and the listed ROCm driver version installed to run the Docker container. The server must also have a Docker Engine installed to run the container. Please visit the Docker Engine install web site at https://docs.docker.com/engine/install/ to install the latest Docker Engine for the operating system installed on the server. If Singularity use is planned, please visit https://sylabs.io/docs/ for the latest Singularity install documentation.

For ROCm installation procedures and validation checks, see:

Running Containers

Before launching the container, download the standard RELION benchmarks and extract them to your $HOME directory with the following commands:

cd ~
wget ftp://ftp.mrc-lmb.cam.ac.uk/pub/scheres/relion_benchmark.tar.gz
tar -xvf relion_benchmark.tar.gz

Using Docker

INTERACTIVE

Once the standard benchmarks are downloaded, launch the container interactively using the following command and substituting the image tag with the latest shown in the Pull Command:

docker run --rm -it --ipc=host --device /dev/dri --device /dev/kfd --security-opt seccomp=unconfined -v ${HOME}/relion_benchmark:/dataset:ro -e OMPI_ALLOW_RUN_AS_ROOT=1 -e OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1 10.194.116.23:5000/aasg/relion:4.0 /bin/bash

There is a convenient benchmark script included. As an example, consider the standard 2D benchmark, which can be executed with the benchmark script as follows:

./run-benchmark --class 2d -g 8 -n 33 -j 1 -p 10 --iters 25 -i /dataset

In the above example, the benchmark runs with 33 MPI ranks (-n 33), each using a single thread (-j 1), on a dual-socket server with 8 GPUs (-g 8) and 128 physical cores. Tunable parameters include:

Argument	Description
--class	Choose benchmark (2D or 3D)
-g	# of GPUs (devices)
-n	# of MPI ranks
-j	# of threads per MPI rank
-p	# of jobs per thread (default: 10)
--iters	# of iterations (default: 25)
-i	Path of input dataset

Additional details about the usage of run-benchmark can be found by adding the -h flag, which prints the arguments that can be passed to the script. For example:

./run-benchmark -h

will print the usage as:

This script is designed run RELION Plasmodium Ribosome (2D/3D) benchmarks on GPUs.
=================================
usage: ./run-benchmark --class <xy> -g <XY> -n <ZZ> -j <yy> -p <xx> -i <path-to-data> --iter <xyz> -o <path-to-output-dir>
-h | --help           Prints the usage
-d | --gpu-support    Support for GPU offloading (defaults to HIP, or select CUDA)
-g | --ngpus          Number of GPUs to be used (between 1-10, defaults to 1)
-n | --ranks          Number of MPI ranks
-j | --threads        Max. number of threads per proc
-p | --pool           Max. number of jobs per thread
-o | --output-dir     Output directory
-i | --data           Path to the dataset
--class               Select benchmark type (options 2d|3D - case insensitive)
--iter(s)             Specify the number of iterations to run for (default: 25)
--continue            Continue the benchmark from a prev. iter# (e.g. --continue 15)

A full list of tunable parameters and configurations for the benchmark commands included in the run-benchmark script can be found in the RELION documentation or on the benchmark wiki page.

NOTE: For a parallel run with multiple MPI ranks and GPUs, the leader MPI rank distributes work to the follower ranks, each of which is explicitly mapped to a GPU device to offload work. Hence, the number of MPI ranks should be chosen so that (# MPI ranks minus one) is divisible by # GPUs; e.g., if --ngpus=4, then -n=5, 9, 13, etc. In addition, the product of # MPI ranks and # threads must be less than or equal to the total # of physical cores on the server. The run-benchmark script checks these settings and should return a warning/error if either condition is violated.
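The two conditions in the note above can also be checked by hand before a long run is launched. The following is a minimal sketch in POSIX shell; check_layout is a hypothetical helper written for illustration, not part of the container or of run-benchmark, and it assumes at least one GPU.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the container): checks the two layout
# conditions from the note before run-benchmark is launched.
# Usage: check_layout <ngpus> <ranks> <threads> <physical_cores>
check_layout() {
    ngpus=$1; ranks=$2; threads=$3; cores=$4

    # One rank is the leader; the remaining ranks must spread evenly over the GPUs.
    if [ $(( (ranks - 1) % ngpus )) -ne 0 ]; then
        echo "error: (ranks - 1) = $(( ranks - 1 )) is not divisible by $ngpus GPUs"
        return 1
    fi

    # Ranks x threads must not exceed the number of physical cores.
    if [ $(( ranks * threads )) -gt "$cores" ]; then
        echo "error: $ranks ranks x $threads threads exceeds $cores cores"
        return 1
    fi

    echo "ok: $(( (ranks - 1) / ngpus )) follower rank(s) per GPU, $(( ranks * threads )) cores used"
}

check_layout 8 33 1 128           # 8 GPUs, 33 ranks, 1 thread, 128 cores
check_layout 4 17 6 128           # 4 GPUs, 17 ranks, 6 threads, 128 cores
check_layout 4 6 1 128 || true    # rejected: 5 followers cannot be split over 4 GPUs
```

The 128-core figure is taken from the example server described earlier; substitute the physical core count of your own machine.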

NON-INTERACTIVE

Where applicable, you may also run the container non-interactively from the host. For example, you may run the 2D class benchmark non-interactively:

Using 4 GPUs:

docker run --rm --ipc=host --device /dev/kfd --device /dev/dri --security-opt seccomp=unconfined -v ${HOME}/relion_benchmark:/dataset:ro -e OMPI_ALLOW_RUN_AS_ROOT=1 -e OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1 10.194.116.23:5000/aasg/relion:4.0 run-benchmark --class 2d -g 4 -n 33 -j 3 -p 10 --iters 25 -i /dataset

Using 8 GPUs:

docker run --rm --ipc=host --device /dev/kfd --device /dev/dri --security-opt seccomp=unconfined -v ${HOME}/relion_benchmark:/dataset:ro -e OMPI_ALLOW_RUN_AS_ROOT=1 -e OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1 10.194.116.23:5000/aasg/relion:4.0 run-benchmark --class 2d -g 8 -n 33 -j 3 -p 10 --iters 25 -i /dataset

Or, consider the 3D benchmark:

Using 4 GPUs:

docker run --rm --ipc=host --device /dev/kfd --device /dev/dri --security-opt seccomp=unconfined -v ${HOME}/relion_benchmark:/dataset:ro -e OMPI_ALLOW_RUN_AS_ROOT=1 -e OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1 10.194.116.23:5000/aasg/relion:4.0 run-benchmark --class 3d -g 4 -n 17 -j 6 -p 10 --iters 25 -i /dataset

Using 8 GPUs:

docker run --rm --ipc=host --device /dev/kfd --device /dev/dri --security-opt seccomp=unconfined -v ${HOME}/relion_benchmark:/dataset:ro -e OMPI_ALLOW_RUN_AS_ROOT=1 -e OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1 10.194.116.23:5000/aasg/relion:4.0 run-benchmark --class 3d -g 8 -n 17 -j 6 -p 10 --iters 25 -i /dataset
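The four commands above differ only in the benchmark class, GPU count, rank count, and thread count. As a sketch, they could be generated from variables; benchmark_cmd below is a hypothetical wrapper that only prints each command (a dry run) so it can be inspected before being executed. The image tag is the one used in this document and should be replaced with the latest tag shown in the Pull Command.

```shell
#!/bin/sh
# Hypothetical wrapper: prints (does not execute) the docker command for one
# non-interactive benchmark run.
IMAGE="10.194.116.23:5000/aasg/relion:4.0"   # image tag as used in this document

# Usage: benchmark_cmd <2d|3d> <ngpus> <ranks> <threads>
benchmark_cmd() {
    class=$1; ngpus=$2; ranks=$3; threads=$4
    echo "docker run --rm --ipc=host --device /dev/kfd --device /dev/dri" \
         "--security-opt seccomp=unconfined" \
         "-v \${HOME}/relion_benchmark:/dataset:ro" \
         "-e OMPI_ALLOW_RUN_AS_ROOT=1 -e OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1" \
         "$IMAGE run-benchmark --class $class -g $ngpus -n $ranks -j $threads" \
         "-p 10 --iters 25 -i /dataset"
}

# Same rank/thread pairs as the examples above: 2D uses 33 x 3, 3D uses 17 x 6.
for ngpus in 4 8; do
    benchmark_cmd 2d "$ngpus" 33 3
    benchmark_cmd 3d "$ngpus" 17 6
done
```

Printing rather than executing keeps the sketch safe to run on a machine without GPUs; piping a printed line to sh would launch the corresponding run.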

Using Singularity

This section assumes that an up-to-date version of Singularity is installed and properly configured on your system. Please consult your system administrator or the official Singularity documentation.

Pull the Docker image and convert it to the Singularity Image Format using the following command, substituting the image tag with the latest shown in the Pull Command:

singularity pull relion.sif docker://10.194.116.23:5000/aasg/relion:4.0

You can then use the examples from the preceding section with this image. For example, to run the benchmark problems, launch a container in interactive mode:

singularity run --bind ${HOME}/relion_benchmark:/dataset:ro --pwd /benchmark relion.sif /bin/bash

Then run the benchmarks as in the previous section.

Alternatively, run the Singularity image in non-interactive mode. For example, to run the 2D benchmark:

Using 4 GPUs:

singularity run --bind ${HOME}/relion_benchmark:/dataset:ro --pwd /benchmark relion.sif run-benchmark --class 2d -g 4 -n 33 -j 3 -p 10 --iters 25 -i /dataset

Using 8 GPUs:

singularity run --bind ${HOME}/relion_benchmark:/dataset:ro --pwd /benchmark relion.sif run-benchmark --class 2d -g 8 -n 33 -j 3 -p 10 --iters 25 -i /dataset

Or, to run the 3D benchmark:

Using 4 GPUs:

singularity run --bind ${HOME}/relion_benchmark:/dataset:ro --pwd /benchmark relion.sif run-benchmark --class 3d -g 4 -n 17 -j 6 -p 10 --iters 25 -i /dataset

Using 8 GPUs:

singularity run --bind ${HOME}/relion_benchmark:/dataset:ro --pwd /benchmark relion.sif run-benchmark --class 3d -g 8 -n 17 -j 6 -p 10 --iters 25 -i /dataset

Performance considerations

Figure of Merit (FoM): Time elapsed in seconds to complete 25 iterations
Is a bigger FoM better (y/N)?: No; a shorter elapsed time is better.
The 2D benchmark on 1 GPU can take 6+ hours; dividing the workload between multiple GPUs and MPI ranks provides a speedup. However, one must pay careful attention to the combination of MPI ranks, threads per rank, and pool of jobs per thread used for the benchmark run. To identify the best configuration, perform a parameter sweep that varies MPI ranks and threads per rank.
A slightly older reference provides information on MPI task distribution and GPUs with RELION.
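As a starting point for such a sweep, the candidate configurations can be enumerated from the two layout rules given in the note earlier in this document: (# MPI ranks minus one) divisible by # GPUs, and ranks times threads no larger than the physical core count. The sketch below is a hypothetical generator, not part of the container; the thread counts it tries (1, 2, 3, 4, 6, 8) are an arbitrary assumption.

```shell
#!/bin/sh
# Hypothetical sweep generator: lists every (ranks, threads) pair that satisfies
# the layout rules for a given GPU count and physical core count.
# Usage: sweep_candidates <ngpus> <physical_cores> <max_followers_per_gpu>
sweep_candidates() {
    ngpus=$1; cores=$2; max_per_gpu=$3
    k=1
    while [ "$k" -le "$max_per_gpu" ]; do
        ranks=$(( k * ngpus + 1 ))      # k follower ranks per GPU plus one leader
        for threads in 1 2 3 4 6 8; do  # assumed thread counts to try
            if [ $(( ranks * threads )) -le "$cores" ]; then
                echo "-g $ngpus -n $ranks -j $threads"
            fi
        done
        k=$(( k + 1 ))
    done
}

# Example: 8 GPUs, 128 physical cores, up to 4 follower ranks per GPU.
sweep_candidates 8 128 4
```

Each printed line is a set of arguments that could be passed to run-benchmark along with --class, -p, --iters, and -i; comparing the elapsed time of each run then identifies the best configuration for the server at hand.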

Known Issues / Errata

RELION typically needs a large local scratch disk, ideally SSD or RamFS. The example presented needs at least 100 GB of scratch space for the benchmark data.
If you see a "memory allocator issue" error, add the following argument to your RELION run command: --free_gpu_memory 30000.
In some cases, when the problem size is too big or the available GPU memory is insufficient, the amount of free GPU memory requested above is ignored and a WARNING message is displayed, as shown below. In such situations, a safer lower limit of 30% of total GPU memory is kept free by default, and the run continues using the remaining 70%. You can safely ignore these warnings, or change the amount of free memory requested through --free_gpu_memory to get rid of them.
=============================
Oversampling= 1 NrHiddenVariableSamplingPoints= 8601600
OrientationalSampling= 2.8125 NrOrientations= 512
TranslationalSampling= 1.34 NrTranslations= 84
=============================
WARNING: Ignoring required free GPU memory amount of 30600 MB, due to space insufficiency.
WARNING: Ignoring required free GPU memory amount of 30600 MB, due to space insufficiency.
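To make the 30%/70% fallback concrete, the split can be worked out for a given card. The sketch below is plain integer arithmetic in shell; fallback_split is a hypothetical helper, and the 32768 MB total is an assumed card size chosen only for illustration.

```shell
#!/bin/sh
# Arithmetic sketch of the fallback described above: when the requested free
# amount cannot be honored, 30% of total GPU memory is kept free and the
# remaining 70% is used.
# Usage: fallback_split <total_gpu_memory_mb>
fallback_split() {
    total_mb=$1
    free_mb=$(( total_mb * 30 / 100 ))   # integer division, rounds down
    used_mb=$(( total_mb - free_mb ))
    echo "free=${free_mb}MB usable=${used_mb}MB"
}

fallback_split 32768   # assumed 32 GB card, for illustration only
```

On a card of that assumed size, a request to keep 30600 MB free would leave only about 2 GB for the computation, which illustrates the kind of situation in which the requested amount cannot be honored.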

Licensing Information

This Docker Container image is provided by Advanced Micro Devices, Inc. for your convenience and is made available subject to the AMD Container Image License (RELION - CONTAINER LICENSE AGREEMENT). Do not pull, install, copy, or use the image unless you agree to all the terms and conditions of the Container Image License.

Disclaimer

The information contained herein is for informational purposes only, and is subject to change without notice. In addition, any stated support is planned and is also subject to change. While every precaution has been taken in the preparation of this document, it may contain technical inaccuracies, omissions and typographical errors, and AMD is under no obligation to update or otherwise correct this information. Advanced Micro Devices, Inc. makes no representations or warranties with respect to the accuracy or completeness of the contents of this document, and assumes no liability of any kind, including the implied warranties of noninfringement, merchantability or fitness for particular purposes, with respect to the operation or use of AMD hardware, software or other products described herein. No license, including implied or arising by estoppel, to any intellectual property rights is granted by this document. Terms and limitations applicable to the purchase or use of AMD’s products are as set forth in a signed agreement between the parties or in AMD's Standard Terms and Conditions of Sale.

Notices and Attribution

© 2023 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, Instinct, Radeon Instinct, ROCm, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Docker and the Docker logo are trademarks or registered trademarks of Docker, Inc. in the United States and/or other countries. Docker, Inc. and other parties may also have trademark rights in other terms used herein. Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries.

All other trademarks and copyrights are property of their respective owners and are only mentioned for informative purposes.
