Publisher

MIMD Lattice Computation (MILC) collaboration

Built By

AMD

Multi-GPU Support

true

Description

The MILC Code is a set of research codes for doing simulations of four-dimensional SU(3) lattice gauge theory on MIMD parallel machines.

The MILC Code is a set of research codes developed by the MIMD Lattice Computation (MILC) collaboration for doing simulations of four-dimensional SU(3) lattice gauge theory on MIMD parallel machines, scaling from single-processor workstations to HPC systems. The MILC Code is publicly available for research purposes. Publications of work done using this code or derivatives of this code should acknowledge this use.

MILC is written in C and can be run efficiently in parallel using a combination of multi-threading, MPI, and HIP/CUDA.

The most up-to-date information and access to the MILC Code can be found at:

For more information about MILC, visit

For more information on the AMD ROCm™ open software platform and access to an active community discussion on installing, configuring, and using ROCm, please visit the ROCm web pages at www.AMD.com/ROCm and ROCm Community Forum.

 

Single-Node Server Requirements


CPUs: X86_64 CPU(s)

GPUs: AMD Instinct™ MI200 GPU(s), AMD Instinct™ MI100 GPU(s)

Operating Systems: Ubuntu 20.04, RHEL 8.2

ROCm™ Driver: ROCm v5.x compatibility

Container Runtimes: Docker Engine, Singularity 3.5

 

Note: The MILC application container assumes that the server contains the required x86-64 CPU(s) and at least one of the listed AMD GPUs. Also, the server must have one of the required operating systems and the listed ROCm driver version installed to run the Docker container. The server must also have a Docker Engine installed to run the container. Please visit the Docker Engine install web site at https://docs.docker.com/engine/install/ to install the latest Docker Engine for the operating system installed on the server. If Singularity use is planned, please visit https://sylabs.io/docs/ for the latest Singularity install documentation.

Please visit https://rocmdocs.amd.com/en/latest/Installation_Guide/Installation-Guide.html for ROCm installation procedures and validation checks.

Known Issues

In multi-GPU runs on MI100 GPUs with the ROCm 5.x GPU driver, MILC benchmarks may freeze and fail to return an exit code when the benchmark completes.

 

Running Containers


For running MILC containers, please enable transparent huge pages (THP) on your system. On common systems you may enable THP temporarily by running:

sudo sh -c 'echo always > /sys/kernel/mm/transparent_hugepage/enabled'

Please consult your operating system documentation for details.
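To confirm the setting took effect, you can read the same sysfs file back; the kernel marks the active mode with square brackets. The following is a minimal sketch, not part of the MILC container: the helper name thp_mode is hypothetical, and the default path assumes the standard Linux sysfs location.

```shell
# Print the active THP mode (the bracketed word) from a sysfs-style file.
# The default path is the standard Linux location (assumed); pass another
# path as the first argument to inspect a different file.
thp_mode() {
    thp_file="${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
    # File content looks like: "always [madvise] never"
    sed -n 's/.*\[\([a-z]*\)].*/\1/p' "$thp_file"
}

# Example: warn if THP is not set to "always".
# [ "$(thp_mode)" = "always" ] || echo "THP is not set to always" >&2
```

Note that a setting written this way does not persist across reboots.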

Launch the container using the following command, substituting the image name from the Pull Command section above if newer:

docker run --rm -it --device /dev/dri --device /dev/kfd --security-opt seccomp=unconfined amdih/milc:c30ed15e1 /bin/bash
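The launch command passes /dev/kfd and /dev/dri into the container, so both device nodes must exist on the host. A small pre-flight check can catch a missing ROCm driver before the container starts. This is only a sketch; the helper name missing_devices is hypothetical and not part of Docker or ROCm.

```shell
# Print each path from the argument list that does not exist on this host.
# Prints nothing when every path is present.
missing_devices() {
    for dev in "$@"; do
        [ -e "$dev" ] || echo "$dev"
    done
}

# Example: check the two device nodes used by the docker run command.
# missing=$(missing_devices /dev/kfd /dev/dri)
# [ -z "$missing" ] || echo "missing GPU device nodes: $missing" >&2
```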

 

Running Application


The container contains an example benchmark problem, which can be executed as follows:

cd /benchmark
run-benchmark -o milc-benchmark.out

The benchmark runs on a single GPU. The file milc-benchmark.out will contain the benchmark results, including the total execution time. Depending on the hardware, the single-GPU benchmark may take over an hour to complete.

Note that the first run includes a tuning step. Rerunning the above command reuses the generated tune files, so you can expect improved performance in the second run.

You can run the same benchmark using 2, 4, or 8 GPUs. For example, to run using 2 GPUs:

cd /benchmark
run-benchmark --ngpus 2 -o milc-benchmark_gpu2.out
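To cover several GPU counts in one go, the same command can be generated in a loop. The sketch below only prints the commands, so they can be reviewed before running; the helper name benchmark_cmds is hypothetical, and the output file names simply follow the pattern of the example above.

```shell
# Print one run-benchmark command line per GPU count given as an argument.
# Uses only the --ngpus and -o options shown in this section.
benchmark_cmds() {
    for n in "$@"; do
        echo "run-benchmark --ngpus $n -o milc-benchmark_gpu$n.out"
    done
}

# Example, inside the container: print the commands, then execute them.
# cd /benchmark
# benchmark_cmds 2 4 8        # review
# benchmark_cmds 2 4 8 | sh   # run
```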

RUN USING SINGULARITY

This section assumes that an up-to-date version of Singularity is installed and properly configured on your system. Please consult your system administrator or the official Singularity documentation.

Pull the Docker image and convert it to the Singularity image format, substituting the image name from the Pull Command section above if newer:

singularity pull milc_c30ed15e1.sif docker://amdih/milc:c30ed15e1

You can then use the image with the examples from the preceding section. For example, to run the benchmark problem, first launch a container:

singularity run --pwd /benchmark --writable-tmpfs milc_c30ed15e1.sif /bin/bash

Then run the benchmarks as in the previous section. For example, using a single GPU:

run-benchmark --ngpus 1 -o bench-gpu1.txt

Or using 2 GPUs:

run-benchmark --ngpus 2 -o bench-gpu2.txt

KNOWN ISSUES

In multi-GPU runs on MI100 and MI200 GPUs with the ROCm 5.x GPU driver, MILC benchmarks may freeze and fail to return an exit code when the benchmark completes.
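One way to keep a frozen run from blocking a script indefinitely is to wrap the benchmark in the coreutils timeout command, which exits with status 124 when the time limit is reached. This is a workaround sketch, not a fix from the MILC or ROCm teams; the helper name run_with_limit and the 3h limit in the example are assumptions and should be chosen to exceed the expected run time on your hardware.

```shell
# Run a command under a time limit using coreutils `timeout`.
# Arguments: a duration understood by timeout (for example 90m or 3h),
# followed by the command to run. Returns the command's exit status,
# or 124 if the limit was reached and the command was terminated.
run_with_limit() {
    limit="$1"
    shift
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "command timed out after $limit" >&2
    fi
    return "$status"
}

# Example (the 3h limit is an arbitrary assumption):
# run_with_limit 3h run-benchmark --ngpus 2 -o milc-benchmark_gpu2.out
```

Because the freeze is reported to occur upon completion of the benchmark, the output file may already contain results when the limit is hit; check it before rerunning.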

 

Licensing Information


Your use of this application is subject to the terms of the applicable component-level license identified below. To the extent any subcomponent in this container requires an offer for corresponding source code, AMD hereby makes such an offer for corresponding source code form, which will be made available upon request. By accessing and using this application, you are agreeing to fully comply with the terms of this license. If you do not agree to the terms of this license, do not access or use this application.

The application is provided in a container image format that includes the following separate and independent components: Ubuntu (License: Creative Commons CC-BY-SA version 3.0 UK licence), MILC (License: BSD-3), CMAKE (License: BSD-3 Clause), OpenMPI (License: BSD 3-Clause), OpenUCX (License: BSD-3 Clause), QIO (License: Custom), QMP (License: Custom), QUDA (License: Custom), ROCm (License: Custom/MIT/Apache V2.0/UIUC OSL). Additional third-party content in this container may be subject to additional licenses and restrictions. The components are licensed to you directly by the party that owns the content pursuant to the license terms included with such content and is not licensed to you by AMD. ALL THIRD-PARTY CONTENT IS MADE AVAILABLE BY AMD “AS IS” WITHOUT A WARRANTY OF ANY KIND. USE OF THE CONTAINER IS DONE AT YOUR SOLE DISCRETION AND UNDER NO CIRCUMSTANCES WILL AMD BE LIABLE TO YOU FOR ANY THIRD-PARTY CONTENT. YOU ASSUME ALL RISK AND ARE SOLELY RESPONSIBLE FOR ANY DAMAGES THAT MAY ARISE FROM YOUR USE OF THE CONTAINER.

 

Disclaimer


The information contained herein is for informational purposes only, and is subject to change without notice. In addition, any stated support is planned and is also subject to change. While every precaution has been taken in the preparation of this document, it may contain technical inaccuracies, omissions and typographical errors, and AMD is under no obligation to update or otherwise correct this information. Advanced Micro Devices, Inc. makes no representations or warranties with respect to the accuracy or completeness of the contents of this document, and assumes no liability of any kind, including the implied warranties of noninfringement, merchantability or fitness for particular purposes, with respect to the operation or use of AMD hardware, software or other products described herein. No license, including implied or arising by estoppel, to any intellectual property rights is granted by this document. Terms and limitations applicable to the purchase or use of AMD’s products are as set forth in a signed agreement between the parties or in AMD's Standard Terms and Conditions of Sale.

 

Notices and Attribution


© 2023 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, Instinct, Radeon Instinct, ROCm, and combinations thereof are trademarks of Advanced Micro Devices, Inc.

Docker and the Docker logo are trademarks or registered trademarks of Docker, Inc. in the United States and/or other countries. Docker, Inc. and other parties may also have trademark rights in other terms used herein. Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries.

All other trademarks and copyrights are property of their respective owners and are only mentioned for informative purposes.