What’s Next?
With more than 400 world records, you might be wondering how AMD can improve upon AMD EPYC™ processors.1 How do you go forward from some of the most powerful server processors ever created?1 You do it with ‘Zen 5’ architecture and all the benefits it delivers to customers.
Introducing 5th Generation AMD EPYC™ Processors
Designed for the world of AI and critical business workloads, 5th Generation AMD EPYC™ processors deliver the next generation of server CPUs in a family that has already set hundreds of world records for performance and efficiency.1 Building on that rich history, AMD EPYC™ 9005 Series processors enable breakthrough performance, thanks to their ‘Zen 5’ architecture.
With up to 192 cores, 384 threads, and Max Boost frequencies of up to 5GHz, these new processors are built to support virtually any business need customers can imagine.2 Not only do they offer more cores and even higher frequencies than previous-generation EPYC products, but they also support faster DRAM, providing enhanced capability for memory-sensitive workloads.
Readily accessible and easily deployable, servers built on 5th Generation AMD EPYC processors offer leadership performance, density, and efficiency, supporting everything from corporate AI-enablement initiatives and business-critical applications to large-scale cloud-based infrastructure.
It’s a lineup unified by familiar x86 software compatibility, letting customers deploy what they need, where they need it, with a common ISA that supports day-to-day business without requiring major software modifications.
Systems based on AMD EPYC 9005 processors will support initiatives ranging from data center consolidation and modernization to increasingly demanding enterprise applications. That’s all thanks to the highly efficient ‘Zen 5’ architecture, a platform purpose-built not only to accommodate expanding AI needs within the enterprise, but also to help businesses improve energy efficiency and rein in data center sprawl.
It’s a server CPU that holds nothing back – in performance, in efficiency, or in outcome.
| Model # | Cores | Max Threads | L3 Cache (MB) | Default TDP (W) | DDR Channels / Max Memory Capacity (2DPC) | Max DDR5 Freq (MHz, 1DPC) | PCIe® Gen 5 Lanes | Socket Count |
|---|---|---|---|---|---|---|---|---|
| 9965 | 192 | 384 | 384 | 500 | 12 / 9TB | 6000 | 160 | 2 |
| 9845 | 160 | 320 | 320 | 400 | 12 / 9TB | 6000 | 160 | 2 |
| 9825 | 144 | 288 | 384 | 400 | 12 / 9TB | 6000 | 160 | 2 |
| 9755 | 128 | 256 | 512 | 500 | 12 / 9TB | 6000 | 160 | 2 |
| 9745 | 128 | 256 | 256 | 400 | 12 / 9TB | 6000 | 160 | 2 |
| 9655 | 96 | 192 | 384 | 400 | 12 / 9TB | 6000 | 160 | 2 |
| 9645 | 96 | 192 | 256 | 320 | 12 / 9TB | 6000 | 160 | 2 |
| 9655P | 96 | 192 | 384 | 320 | 12 / 9TB | 6000 | 128 | 1 |
| 9565 | 72 | 144 | 384 | 400 | 12 / 9TB | 6000 | 160 | 2 |
| 9575F | 64 | 128 | 256 | 400 | 12 / 9TB | 6000 | 160 | 2 |
| 9555 | 64 | 128 | 256 | 360 | 12 / 9TB | 6000 | 160 | 2 |
| 9555P | 64 | 128 | 256 | 320 | 12 / 9TB | 6000 | 128 | 1 |
| 9535 | 64 | 128 | 256 | 300 | 12 / 9TB | 6000 | 160 | 2 |
| 9475F | 48 | 96 | 256 | 360 | 12 / 9TB | 6000 | 160 | 2 |
| 9455 | 48 | 96 | 256 | 300 | 12 / 9TB | 6000 | 160 | 2 |
| 9455P | 48 | 96 | 192 | 300 | 12 / 9TB | 6000 | 128 | 1 |
| 9365 | 36 | 72 | 192 | 300 | 12 / 9TB | 6000 | 160 | 2 |
| 9375F | 32 | 64 | 256 | 320 | 12 / 9TB | 6000 | 160 | 2 |
| 9355 | 32 | 64 | 256 | 280 | 12 / 9TB | 6000 | 160 | 2 |
| 9355P | 32 | 64 | 256 | 280 | 12 / 9TB | 6000 | 128 | 1 |
| 9335 | 32 | 64 | 192 | 210 | 12 / 9TB | 6000 | 160 | 2 |
| 9275F | 24 | 48 | 256 | 320 | 12 / 9TB | 6000 | 160 | 2 |
| 9255 | 24 | 48 | 128 | 200 | 12 / 9TB | 6000 | 160 | 2 |
| 9175F | 16 | 32 | 256 | 320 | 12 / 9TB | 6000 | 160 | 2 |
| 9135 | 16 | 32 | 128 | 200 | 12 / 9TB | 6000 | 160 | 2 |
| 9125 | 8 | 16 | 256 | 165 | 12 / 9TB | 6000 | 160 | 2 |
| 9015 | 8 | 16 | 64 | 155 | 12 / 9TB | 6000 | 160 | 2 |
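For readers comparing SKUs programmatically, the lineup above can be captured as plain data. The sketch below is purely illustrative: `EpycSku` is a hypothetical helper, the figures are transcribed from a few rows of the table, and the selection logic simply picks the highest-core-count part within a power budget.

```python
# Illustrative sketch only: a few SKUs from the table above as structured
# data, for quick filtering. Verify figures against AMD's product pages.
from dataclasses import dataclass

@dataclass
class EpycSku:            # hypothetical helper type, not an AMD tool
    model: str
    cores: int
    threads: int
    l3_mb: int
    tdp_w: int
    max_sockets: int      # "P"-suffix models are single-socket only

SKUS = [
    EpycSku("9965", 192, 384, 384, 500, 2),
    EpycSku("9755", 128, 256, 512, 500, 2),
    EpycSku("9575F", 64, 128, 256, 400, 2),
    EpycSku("9355P", 32, 64, 256, 280, 1),
    EpycSku("9015", 8, 16, 64, 155, 2),
]

# Example: most cores within a 400 W-per-socket power budget.
fits = [s for s in SKUS if s.tdp_w <= 400]
best = max(fits, key=lambda s: s.cores)
```

Every part in the lineup exposes two threads per core via SMT, which the data above reflects.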
Performance: Painting a Clear Picture for Customers
As AI plays an increasingly important role in business, customers need to know they can rely on their server infrastructure to get it done – in addition to their existing workloads.
New AMD EPYC processors, like the AMD EPYC™ 9575F, deliver double-digit gains in instructions-per-clock (IPC) compared to the prior generation, and the latest ‘Zen 5’ core in 5th Gen AMD EPYC processors is designed to deliver significant uplifts in ML, HPC, and enterprise workloads.3
When compared to competitor products, these new processors help enterprises achieve incredible results, such as groundbreaking end-to-end AI throughput performance on a wide variety of use cases. For example, on the TPCx-AI benchmark, 2P servers with 192C AMD EPYC™ 9965 processors deliver up to ~3.8x more AI test cases per minute vs. 2P servers with 64C Intel Xeon Platinum 8592+ in AMD testing.4
When hosting GPU accelerators, two AMD EPYC 9575F CPUs achieve up to 20% more inference requests and 15% faster training time compared to two Intel® Xeon® 8592+ CPUs running Llama3.1.5,6
While performance shines, that doesn’t mean efficiency is off the table; AMD EPYC 9005 Series processors offer energy-efficient server solutions. In fact, 2P servers using AMD EPYC 9965 CPUs deliver 1.8x more estimated integer performance per CPU watt than those with Intel® Xeon® 8592+ CPUs.7
A new generation of cutting-edge AMD EPYC processors is here to transform the way work gets done. In the age of AI, your customers can’t afford to fall behind. Reach out to your AMD representative or visit AMD.com to learn more.
Footnotes
For a complete list of performance world records held by AMD EPYC processors visit amd.com/worldrecords.
Max boost for AMD EPYC processors is the maximum frequency achievable by any single core on the processor under normal operating conditions for server systems. EPYC-018
9xx5-001: Based on AMD internal testing as of 9/10/2024, geomean performance improvement (IPC) at fixed-frequency.
- 5th Gen EPYC CPU Enterprise and Cloud Server Workloads generational IPC uplift of 1.170x (geomean), based on a select set of 36 workloads; the uplift is the geomean of estimated scores for total and all subsets of SPECrate®2017_int_base (geomean), estimated scores for total and all subsets of SPECrate®2017_fp_base (geomean), scores for Server Side Java multi-instance max ops/sec, representative Cloud Server workloads (geomean), and representative Enterprise Server workloads (geomean).
“Genoa” Config (all NPS1): EPYC 9654 BIOS TQZ1005D 12c12t (1c1t/CCD in 12+1), FF 3GHz, 12x DDR5-4800 (2Rx4 64GB), 32Gbps xGMI;
“Turin” config (all NPS1): EPYC 9V45 BIOS RVOT1000F 12c12t (1c1t/CCD in 12+1), FF 3GHz, 12x DDR5-6000 (2Rx4 64GB), 32Gbps xGMI
Utilizing Performance Determinism and the Performance governor on Ubuntu® 22.04 w/ 6.8.0-40-generic kernel OS for all workloads.
- 5th Gen EPYC generational ML/HPC Server Workloads IPC Uplift of 1.369x (geomean) using a select set of 24 workloads and is the geomean of representative ML Server Workloads (geomean), and representative HPC Server Workloads (geomean).
“Genoa” config (all NPS1): EPYC 9654 BIOS TQZ1005D 12c12t (1c1t/CCD in 12+1), FF 3GHz, 12x DDR5-4800 (2Rx4 64GB), 32Gbps xGMI;
“Turin” config (all NPS1): EPYC 9V45 BIOS RVOT1000F 12c12t (1c1t/CCD in 12+1), FF 3GHz, 12x DDR5-6000 (2Rx4 64GB), 32Gbps xGMI
Utilizing Performance Determinism and the Performance governor on Ubuntu 22.04 w/ 6.8.0-40-generic kernel OS for all workloads except LAMMPS, HPCG, NAMD, OpenFOAM, Gromacs which utilize 24.04 w/ 6.8.0-40-generic kernel.
SPEC® and SPECrate® are registered trademarks of the Standard Performance Evaluation Corporation. Learn more at spec.org.
9xx5-012: TPCxAI @SF30 Multi-Instance 32C Instance Size throughput results based on AMD internal testing as of 09/05/2024 running multiple VM instances. The aggregate end-to-end AI throughput test is derived from the TPCx-AI benchmark and as such is not comparable to published TPCx-AI results, as the end-to-end AI throughput test results do not comply with the TPCx-AI Specification.
2P AMD EPYC 9965 (384 Total Cores), 12 32C instances, NPS1, 1.5TB 24x64GB DDR5-6400 (at 6000 MT/s), 1DPC, 1.0 Gbps NetXtreme BCM5720 Gigabit Ethernet PCIe, 3.5 TB Samsung MZWLO3T8HCLS-00A07 NVMe®, Ubuntu® 22.04.4 LTS, 6.8.0-40-generic (tuned-adm profile throughput-performance, ulimit -l 198096812, ulimit -n 1024, ulimit -s 8192), BIOS RVOT1000C (SMT=off, Determinism=Power, Turbo Boost=Enabled)
2P AMD EPYC 9755 (256 Total Cores), 8 32C instances, NPS1, 1.5TB 24x64GB DDR5-6400 (at 6000 MT/s), 1DPC, 1.0 Gbps NetXtreme BCM5720 Gigabit Ethernet PCIe, 3.5 TB Samsung MZWLO3T8HCLS-00A07 NVMe®, Ubuntu 22.04.4 LTS, 6.8.0-40-generic (tuned-adm profile throughput-performance, ulimit -l 198096812, ulimit -n 1024, ulimit -s 8192), BIOS RVOT0090F (SMT=off, Determinism=Power, Turbo Boost=Enabled)
2P AMD EPYC 9654 (192 Total cores) 6 32C instances, NPS1, 1.5TB 24x64GB DDR5-4800, 1DPC, 2 x 1.92 TB Samsung MZQL21T9HCJR-00A07 NVMe, Ubuntu 22.04.3 LTS, BIOS 1006C (SMT=off, Determinism=Power)
Versus 2P Xeon Platinum 8592+ (128 Total Cores), 4 32C instances, AMX On, 1TB 16x64GB DDR5-5600, 1DPC, 1.0 Gbps NetXtreme BCM5719 Gigabit Ethernet PCIe, 3.84 TB KIOXIA KCMYXRUG3T84 NVMe, Ubuntu 22.04.4 LTS, 6.5.0-35-generic (tuned-adm profile throughput-performance, ulimit -l 132065548, ulimit -n 1024, ulimit -s 8192), BIOS ESE122V (SMT=off, Determinism=Power, Turbo Boost=Enabled)
Results:
| CPU | Median | Relative | Generational |
|---|---|---|---|
| Turin 192C, 12 Inst | 6067.531 | 3.775 | 2.278 |
| Turin 128C, 8 Inst | 4091.85 | 2.546 | 1.536 |
| Genoa 96C, 6 Inst | 2663.14 | 1.657 | 1 |
| EMR 64C, 4 Inst | 1607.417 | 1 | NA |
Results may vary due to factors including system configurations, software versions and BIOS settings. TPC, TPC Benchmark and TPC-C are trademarks of the Transaction Processing Performance Council.
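As a quick arithmetic check, the Relative and Generational columns in the results above follow directly from the median throughput figures (an illustrative sketch, not AMD tooling):

```python
# Sketch: deriving the Relative and Generational columns of the TPCx-AI
# results table above from the median throughput figures.
medians = {
    "Turin 192C": 6067.531,
    "Turin 128C": 4091.85,
    "Genoa 96C": 2663.14,
    "EMR 64C": 1607.417,
}
# Relative: each median normalized to the Intel (EMR) baseline.
relative = {k: round(v / medians["EMR 64C"], 3) for k, v in medians.items()}
# Generational: top Turin part vs. the prior-generation Genoa result.
generational = round(medians["Turin 192C"] / medians["Genoa 96C"], 3)
```

The Turin 192C ratio of 3.775 is the "~3.8x more AI test cases per minute" figure cited in the body text.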
9xx5-014: Llama3.1-70B inference throughput results based on AMD internal testing as of 09/01/2024.
Llama3.1-70B configurations: TensorRT-LLM 0.9.0, nvidia/cuda 12.5.0-devel-ubuntu22.04 , FP8, Input/Output token configurations (use cases): [BS=1024 I/O=128/128, BS=1024 I/O=128/2048, BS=96 I/O=2048/128, BS=64 I/O=2048/2048]. Results in tokens/second.
2P AMD EPYC 9575F (128 Total Cores) with 8x NVIDIA H100 80GB HBM3, 1.5TB 24x64GB DDR5-6000, 1.0 Gbps 3TB Micron_9300_MTFDHAL3T8TDP NVMe®, BIOS T20240805173113 (Determinism=Power, SR-IOV=On), Ubuntu 22.04.3 LTS, kernel=5.15.0-117-generic (mitigations=off, cpupower frequency-set -g performance, cpupower idle-set -d 2, echo 3 > /proc/sys/vm/drop_caches)
2P Intel Xeon Platinum 8592+ (128 Total Cores) with 8x NVIDIA H100 80GB HBM3, 1TB 16x64GB DDR5-5600, 3.2TB Dell Ent NVMe® PM1735a MU, Ubuntu 22.04.3 LTS, kernel-5.15.0-118-generic, (processor.max_cstate=1, intel_idle.max_cstate=0 mitigations=off, cpupower frequency-set -g performance ), BIOS 2.1, (Maximum performance, SR-IOV=On),
| I/O Tokens | Batch Size | EMR | Turin | Relative |
|---|---|---|---|---|
| 128/128 | 1024 | 814.678 | 1101.966 | 1.353 |
| 128/2048 | 1024 | 2120.664 | 2331.776 | 1.100 |
| 2048/128 | 96 | 114.954 | 146.187 | 1.272 |
| 2048/2048 | 64 | 333.325 | 354.208 | 1.063 |
For an average throughput increase of 1.197x.
Results may vary due to factors including system configurations, software versions and BIOS settings.
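The quoted 1.197x figure appears to be the arithmetic mean of the four per-use-case Turin/EMR throughput ratios listed above; the sketch below reproduces it (illustrative arithmetic only):

```python
# Sketch: recomputing the per-use-case "Relative" ratios and the 1.197x
# average from the tokens/second figures in the table above.
emr = [814.678, 2120.664, 114.954, 333.325]      # tokens/s, Xeon 8592+ host
turin = [1101.966, 2331.776, 146.187, 354.208]   # tokens/s, EPYC 9575F host
relative = [t / e for t, e in zip(turin, emr)]
average = sum(relative) / len(relative)          # arithmetic mean of ratios
```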
9xx5-015: Llama3.1-8B (BF16, max sequence length 1024) training testing results based on AMD internal testing as of 09/05/2024.
Llama3.1-8B configurations: Max Sequence length 1024, BF16, Docker: huggingface/transformers-pytorch-gpu:latest
2P AMD EPYC 9575F (128 Total Cores) with 8x NVIDIA H100 80GB HBM3, 1.5TB 24x64GB DDR5-6000, 1.0 Gbps 3TB Micron_9300_MTFDHAL3T8TDP NVMe®, BIOS T20240805173113 (Determinism=Power, SR-IOV=On), Ubuntu 22.04.3 LTS, kernel=5.15.0-117-generic (mitigations=off, cpupower frequency-set -g performance, cpupower idle-set -d 2, echo 3 > /proc/sys/vm/drop_caches)
For 31.79 Train Samples/Second
2P Intel Xeon Platinum 8592+ (128 Total Cores) with 8x NVIDIA H100 80GB HBM3, 1TB 16x64GB DDR5-5600, 3.2TB Dell Ent NVMe® PM1735a MU, Ubuntu 22.04.3 LTS, kernel-5.15.0-118-generic, (processor.max_cstate=1, intel_idle.max_cstate=0 mitigations=off, cpupower frequency-set -g performance ), BIOS 2.1, (Maximum performance, SR-IOV=On),
For 27.74 Train Samples/Second
For an average throughput increase of 1.146x.
Results may vary due to factors including system configurations, software versions and BIOS settings.
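The quoted 1.146x training uplift is simply the ratio of the two train-samples/second figures above (illustrative arithmetic only):

```python
# Sketch: the training uplift as the ratio of the two throughput figures.
turin_sps = 31.79   # train samples/s, EPYC 9575F host
emr_sps = 27.74     # train samples/s, Xeon 8592+ host
uplift = turin_sps / emr_sps
```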
9xx5-002a: SPECrate®2017_int_base comparison based on internal estimated AMD reference platform measurements and published scores from www.spec.org as of 09/5/2024.
Comparison of 2P AMD EPYC 9965 (2870 estimated SPECrate®2017_int_base, 384 Total Cores, 500W TDP, 5.740 est SPECrate®2017_int_base/CPU W), 1.5TB 24x64GB 2Rx4 PC5-6400B-R running at 6000 MT/s, 3.84TB NVMe, Ubuntu® 24.04 LTS Kernel 6.8.30-41-generic, AOCC v5.0.0
versus 2P Intel Xeon Platinum 8592+ (1130 SPECrate®2017_int_base, 128 Total Cores, 350W TDP, 3.229 SPECrate®2017_int_base/CPU W), http://spec.org/cpu2017/results/res2023q4/cpu2017-20231127-40064.html
EPYC 9965 vs 8592+
- estimated 2.540x the performance
- 1.778x the est performance/CPU W
Published 2P AMD EPYC 9754 (1950 SPECrate®2017_int_base, 256 Total Cores, 360W TDP) 5.417 SPECrate®2017_int_base/CPU W, http://spec.org/cpu2017/results/res2023q2/cpu2017-20230522-36617.html)
EPYC 9754 vs 8592+
- 1.725x the performance
- 1.678x performance/CPU W
Generational (EPYC 9965 vs EPYC 9754)
- is 1.472x the performance
- at 1.060x the performance/CPU W
SPEC®, SPEC CPU®, and SPECrate® are registered trademarks of the Standard Performance Evaluation Corporation. See www.spec.org for more information. Intel CPU TDP at https://ark.intel.com/.
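The performance and performance-per-CPU-watt ratios quoted in this footnote follow from the scores and per-CPU TDPs above; the sketch below reproduces them (illustrative arithmetic only, not AMD tooling):

```python
# Sketch: recomputing the SPECrate ratios from the scores and TDPs above.
scores = {
    "EPYC 9965": (2870, 500),    # (est. SPECrate2017_int_base, TDP W)
    "EPYC 9754": (1950, 360),
    "Xeon 8592+": (1130, 350),
}
perf_per_w = {k: s / w for k, (s, w) in scores.items()}

ratio_9965 = scores["EPYC 9965"][0] / scores["Xeon 8592+"][0]   # ~2.540x perf
eff_9965 = perf_per_w["EPYC 9965"] / perf_per_w["Xeon 8592+"]   # ~1.778x perf/W
ratio_gen = scores["EPYC 9965"][0] / scores["EPYC 9754"][0]     # ~1.472x generational
```

The ~1.778x efficiency ratio is the "1.8x more estimated integer performance per CPU watt" claim cited in the body text.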