Live Webinar

AMD EPYC™ Technology Transforming Enterprise AI Challenges

Learn how the winning combination of AMD EPYC™ processors and industry-leading GPU accelerators provides the muscle needed to tackle the most demanding Enterprise AI challenges.


AMD EPYC™ Processors Help Maximize the Value of Large GPU Investments

GPU accelerators have become the workhorse for modern AI, excelling in training large, complex models and supporting efficient real-time inference at scale. However, maximizing the potential of your GPU investment requires a powerful CPU partner.

Why GPUs for AI Workloads?

GPUs are the right tool for many AI workloads.

  • AI Training: GPUs accelerate the training of large and medium-sized models with their parallel processing capabilities.
  • Dedicated AI Deployments: GPUs offer the speed and scalability needed for real-time inference in large-scale deployments.

The CPU Advantage:

Combining the power of GPUs with the right CPU can significantly enhance AI efficiency for certain workloads. Look for these key CPU features:

  • High Frequency and Core Count: Handles extensive data preparation and post-processing tasks quickly and efficiently.
  • Large Cache Size: Facilitates fast access to massive datasets.
  • High Memory Bandwidth and High Performance I/O: Enables fast, seamless data exchange between CPU and GPU (see the sketch after this list).
  • Energy-Efficient Cores: Frees up power for GPU usage and can help reduce overall energy consumption.
  • Compatibility with GPU and Software Ecosystem: Enables optimized performance, efficiency, and smooth operation.
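
The bandwidth and I/O point is easiest to see in code. Below is a minimal, generic PyTorch sketch (an illustration under assumed sizes and settings, not AMD benchmark code) of how CPU-side workers and pinned host memory keep a GPU fed during training:

```python
# Generic PyTorch sketch: CPU cores prepare batches while pinned host memory
# enables fast, asynchronous host-to-device copies. Illustrative only; the
# dataset, model, and sizes below are placeholders, not AMD's test workload.
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(4096, 1024), torch.randint(0, 10, (4096,)))

# num_workers uses CPU cores for data preparation; pin_memory stages batches
# in page-locked RAM so the PCIe copy to the GPU can run asynchronously.
loader = DataLoader(dataset, batch_size=64, num_workers=8, pin_memory=True)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(1024, 10).to(device)

for x, y in loader:
    # non_blocking=True lets the copy overlap with GPU work already in flight.
    x = x.to(device, non_blocking=True)
    y = y.to(device, non_blocking=True)
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
```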

AMD EPYC processors

The ideal choice for unlocking the true potential of your large AI workloads, AMD EPYC processors help maximize GPU accelerator performance and overall AI workload efficiency. Plus, with advanced security features and a long, consistent commitment to open standards, AMD EPYC processors enable businesses to confidently deploy the next phase of their AI journey.

Applications and Industries

GPU accelerator-based solutions fueled by AMD EPYC CPUs power many of the world's fastest supercomputers and cloud instances, offering enterprises a proven platform for optimizing data-driven workloads and achieving groundbreaking results in AI.

AMD EPYC CPUs: The Right Choice to Maximize the Value of Large GPU Investments

CPUs play a crucial role in orchestrating and synchronizing data transfers between GPUs, handling kernel launch overheads, and managing data preparation. This "conductor" function ensures that GPUs operate at peak efficiency.
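
As a rough illustration of that conductor role, the sketch below (generic PyTorch code assuming a multi-GPU CUDA system, not the configuration tested elsewhere on this page) shows a single host thread paying the kernel-launch overhead for each device and then synchronizing the results:

```python
# Generic sketch of the CPU as "conductor": one host thread enqueues work on
# every available GPU, paying only the launch overhead per call, then waits
# for all devices to finish. Illustrative only; assumes a CUDA multi-GPU box.
import torch

if torch.cuda.is_available():
    n_gpus = torch.cuda.device_count()
    streams = [torch.cuda.Stream(device=f"cuda:{i}") for i in range(n_gpus)]
    inputs = [torch.randn(2048, 2048, device=f"cuda:{i}") for i in range(n_gpus)]
    outputs = []

    for i in range(n_gpus):
        with torch.cuda.stream(streams[i]):
            # The matmul is queued and returns almost immediately; the CPU
            # moves on to orchestrate the next GPU while this one computes.
            outputs.append(inputs[i] @ inputs[i])

    for i in range(n_gpus):
        torch.cuda.synchronize(f"cuda:{i}")  # CPU waits for each device here
```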

Optimize GPU Investment Value With High Performance CPUs

For some workloads, higher CPU clock speeds enhance GPU efficiency by streamlining data processing, data transfer, and concurrent execution.

To demonstrate that higher CPU frequencies boost Llama2-7B workload throughput, we ran a proof-of-concept test using custom AMD EPYC 9554 CPUs in a 2P server equipped with 8x NVIDIA H100 GPUs.1

Llama2-7B Fine Tuning (relative performance)
  • 2.0 GHz: 1.0x
  • 2.5 GHz: 1.12x
  • 3.0 GHz: 1.28x

Llama2-7B Training, 1K sequence length (relative performance)
  • 2.0 GHz: 1.0x
  • 2.5 GHz: 1.16x
  • 3.0 GHz: 1.2x

Llama2-7B Training, 2K sequence length (relative performance)
  • 2.0 GHz: 1.0x
  • 2.5 GHz: 1.1x
  • 3.0 GHz: 1.14x
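
These relative figures line up with the raw measurements in footnote 1. A small sketch of the arithmetic (treating each results row in the footnote as a per-run time, which is how the quoted percentage increases are derived) reproduces the footnote's throughput-increase values:

```python
# Reproduces the "% Throughput Increase" values in footnote 1 from the raw
# measurements, assuming each row is a per-run time and throughput scales as
# its inverse: gain = baseline_time / time - 1.
measurements = {
    "Fine Tuning":              {2000: 649.38, 2500: 584.24, 3000: 507.10},
    "Training (1K seq length)": {2000: 276.08, 2500: 238.81, 3000: 230.82},
    "Training (2K seq length)": {2000: 883.85, 2500: 807.94, 3000: 778.72},
}

for workload, by_freq in measurements.items():
    baseline = by_freq[2000]
    gains = {mhz: 100 * (baseline / value - 1) for mhz, value in by_freq.items()}
    print(workload, {mhz: f"{g:.2f}%" for mhz, g in gains.items()})
    # e.g. Fine Tuning -> 0.00%, 11.15%, 28.06%
```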

Deploy Enterprise AI Efficiently

Processors that combine high performance, low power consumption, efficient data handling, and effective power management capabilities enable your AI infrastructure to operate at peak performance while optimizing energy consumption and cost.

AMD EPYC processors power the world’s most energy-efficient servers, delivering exceptional performance and helping reduce energy costs.2 Deploy them with confidence to create energy-efficient solutions and help optimize your AI journey.

In AMD EPYC 9004 Series processors, AMD Infinity Power Management offers excellent default performance and allows fine-tuning for workload-specific behavior.
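
AMD Infinity Power Management itself is configured through platform firmware and BIOS options. Purely as a generic, hypothetical OS-level illustration (not the AMD feature itself), a script like the one below reads the Linux cpufreq settings that reflect how core frequencies are currently being governed:

```python
# Generic Linux sketch (hypothetical illustration, not AMD Infinity Power
# Management): read each core's cpufreq governor and current frequency from
# sysfs to see how the platform is balancing performance and power.
from pathlib import Path

for cpu in sorted(Path("/sys/devices/system/cpu").glob("cpu[0-9]*")):
    governor = cpu / "cpufreq" / "scaling_governor"
    cur_freq = cpu / "cpufreq" / "scaling_cur_freq"
    if governor.exists() and cur_freq.exists():
        mhz = int(cur_freq.read_text()) / 1000  # sysfs reports kHz
        print(f"{cpu.name}: governor={governor.read_text().strip()} freq={mhz:.0f} MHz")
```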


Peace of Mind: Adopt AI With Trusted Solutions

Choose from several certified or validated GPU-accelerated solutions with AMD EPYC CPUs as the host processor to supercharge your AI workloads.

Prefer AMD Instinct accelerator-powered solutions?

Using other GPUs? Ask for AMD EPYC CPU-powered solutions available from leading platform solution providers including Asus, Dell, Gigabyte, HPE, Lenovo, and Supermicro.

Growing Ecosystem of AMD EPYC CPU + GPU Cloud AI/ML Instance Options

Ask for instances combining AMD EPYC CPUs with GPUs for AI/ML workloads from major cloud providers including AWS, Azure, Google, IBM Cloud, and OCI.


Resources

AMD Instinct Accelerators

Uniquely well-suited to advance your most demanding AI workloads.

AMD EPYC Enterprise AI Briefs

Find AMD and partner documentation describing AI and machine learning innovation using CPUs and GPUs.

Podcasts

Listen to leading technologists from AMD and the industry discussing the latest trending topics regarding servers, cloud computing, AI, HPC and more.

Footnotes
  1. SP5-292: Llama2-7B fine-tuning and training throughput results based on AMD internal proof-of-concept testing as of 6/15/2024.

    Server configurations: 2P EPYC 9554 (CPU with customized frequencies, 64C/128T, 16 cores active), 1.5TB Memory (24x 64GB DDR5-5600 running at 4800 MT/s), 3.2 TB SSD, Ubuntu® 22.04.4 LTS, with 8x NVIDIA H100 80GB HBM3, HuggingFace Transformers v 4.31.0, NVIDIA PyTorch 23.12, PEFT 0.4.0, Python 3.10.12, CUDA 12.3.2.001, TensorRT-LLM v 0.9.0.dev2024, CUDNN 8.9.7.29+cuda12.2, NVIDIA-SMI Driver version 550.54.15, TRT v8.6.1.6+cuda12.0.1.011, Transformer Engine v1.1

    Llama2-7B Fine Tuning: BS per device=4, seqln=128, avg over 4 runs, 10 epochs per run, FP16

    Llama2-7B Training (1K): BS=56 (7x8 GPUs), seqln=1k, Gradients on GPU

    Llama2-7B Training (2K): BS=24 (3x8 GPUs), seqln=2k, Gradients on GPU

    Results:

    CPU Frequency                               2000 MHz    2500 MHz    3000 MHz
    Fine Tuning Avg Train Run Time (seconds)      649.38      584.24       507.1
    % Throughput Increase                          0.00%      11.15%      28.06%
    Training Throughput, 1K Sequence Length       276.08      238.81      230.82
    % Throughput Increase                          0.00%      15.61%      19.61%
    Training Throughput, 2K Sequence Length       883.85      807.94      778.72
    % Throughput Increase                          0.00%       9.40%      13.50%

    Results may vary due to factors including system configurations, software versions, and BIOS settings. NOTE: This performance data is from a proof of concept, collected on a 2P custom AMD EPYC™ 9554 host processor at various frequencies with 8x NVIDIA H100 80GB accelerators. 4th Gen EPYC processors do not allow end users to adjust frequencies.

  2. EPYC-028D: SPECpower_ssj® 2008, SPECrate®2017_int_energy_base, and SPECrate®2017_fp_energy_base based on results published on SPEC's website as of 2/21/24. VMmark® server power-performance / server and storage power-performance (PPKW) based on results published at https://www.vmware.com/products/vmmark/results3x.1.html?sort=score. The first 105 ranked SPECpower_ssj®2008 publications with the highest overall efficiency (overall ssj_ops/W) results were all powered by AMD EPYC processors. For SPECrate®2017 Integer (Energy Base), AMD EPYC CPUs power the first 8 top SPECrate®2017_int_energy_base performance/system W scores. For SPECrate®2017 Floating Point (Energy Base), AMD EPYC CPUs power the first 12 SPECrate®2017_fp_energy_base performance/system W scores. For VMmark® server power-performance (PPKW), AMD EPYC CPUs have the top 5 results for 2- and 4-socket matched pair results, outperforming all other socket results, and for VMmark® server and storage power-performance (PPKW), they have the top overall score. See https://www.amd.com/en/claims/epyc4#faq-EPYC-028D for the full list. For additional information on AMD sustainability goals see: https://www.amd.com/en/corporate/corporate-responsibility/data-center-sustainability.html. More information about SPEC® is available at http://www.spec.org. SPEC, SPECrate, and SPECpower are registered trademarks of the Standard Performance Evaluation Corporation. VMmark is a registered trademark of VMware in the US or other countries.