
Overview

AMD Ryzen™ AI software includes the tools and runtime libraries for optimizing and deploying AI inference on AMD Ryzen AI powered PCs¹. Ryzen AI software enables applications to run on the neural processing unit (NPU) built on the AMD XDNA™ architecture, the first dedicated AI processing silicon on a Windows x86 processor², and also supports the integrated GPU (iGPU).

Ryzen developer flow diagram

Development Flow

Developing AI applications for Ryzen AI can be summarized in three steps:

Start with a Pre-trained Model
Use a pre-trained model in PyTorch or TensorFlow as your starting point. Then convert your model to the ONNX format, which is compatible with the Ryzen AI workflow.

Quantization
Quantize your model by converting its parameters from floating-point to lower precision representations, like 16-bit or 8-bit integers. The Vitis™ AI Quantizer for ONNX provides an easy-to-use Post Training Quantization (PTQ) flow for this purpose.
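To make the parameter conversion concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization arithmetic in plain Python. This illustrates only the underlying idea of mapping floats onto an integer grid with a shared scale; it is not the Vitis AI Quantizer API, which handles this (plus calibration) internally:

```python
def quantize_int8(values):
    """Map floats to int8 codes using a single symmetric scale factor."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs else 1.0
    # Round to the nearest integer code and clamp to the int8 range.
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the int8 codes."""
    return [x * scale for x in q]

# Illustrative weight values, not from any real model.
weights = [0.5, -1.27, 0.01, 1.27]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
```

Each weight is now an 8-bit code plus one shared float scale, which is what shrinks model size and lets the NPU's integer datapaths do the work; the rounding error per value is at most half a quantization step.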

Deploy the Model
After quantization, your model is ready to be deployed on the hardware. Use ONNX Runtime with C++ or Python APIs to deploy the AI model. The Vitis AI Execution Provider included in ONNX Runtime optimizes workloads, ensuring optimal performance and lower power consumption.

Start Using Ryzen AI Software

Get Started

Ryzen AI Videos

Discover video tutorials focusing on how Ryzen AI 300 series PCs use both NPUs and integrated GPUs to accelerate large language model workloads.

What’s New 

1.4 Release Highlights

  • New Features
    • Unified Installer with both LLM and General Model Flow (INT8, BF16): a seamless experience with support for all model types in a single release package.
    • First release of Windows support for BF16 model compilation and quantization for CNN and NLP workloads.
    • Support for LLM OGA Flow, making it easier to deploy LLMs efficiently.
  • New LLM Model Support
    • DeepSeek-R1 Distill Series: Llama-8B, Qwen-1.5B, Qwen-7B 
    • Qwen2 Series: Qwen2-1.5B, Qwen2-7B 
    • Gemma2-2B 
    • AMD-OLMO-1B-SFT-DPO  
    • Codellama-7B, Mistral-7B and more

1.3 Release Highlights

  • New Features 
    • Early support for AMD unified quantizer "Quark"
    • Support for mixed precision data types and Copilot+ apps
    • Updated CNN profiling tool 
  • New Model Support for ONNX-GenAI (OGA) Flow 
    • Llama2-7B-Chat / Meta-Llama-3.1-8B
    • Phi-3-Mini-4K-Instruct / Phi-3.5-Mini-Instruct
    • Mistral-7B-Instruct-v0.3

1.2 Release Highlights

  • New Architectural Support 
    • Support for Strix (STX): AMD Ryzen™ AI 9 HX 370 & Ryzen™ AI 9 365 NPUs
    • Unified support for integrated GPU (iGPU) and NPU through Ryzen AI software
  • New Early Access tools, models and features 
    • New Model analysis, profiling, and visualization tool for models running on the NPU (AI Analyzer)
    • New platform/NPU inspection and management tool (xrt-smi)
    • LLM flow support for multiple models in both the PyTorch and ONNX flows

1.1 Release Highlights

  • New model support:
    • Llama 2 7B with w4abf16 (3-bit and 4-bit) quantization (Beta)
    • Whisper base (Early access)
  • New EoU tools & features:
    • CNN benchmarking tool in the RyzenAI-SW repo
    • Platform/NPU inspection and management tool

1.0 Release Highlights

  • Model Support
    • 1,000+ validated CNN models
    • OPT-1.3B on NPU using PyTorch and ONNX flow
  • EoU tools & features
    • Supports ONNX PTQ (Post Training Quantization), PyTorch PTQ and QAT (Quantization Aware Training)
    • Supports ONNX Runtime Vitis AI Execution Provider with both C++ and Python APIs
    • Automatic scheduling of up to 8 simultaneous inference sessions on NPU

Open-Source Projects

Explore open-source tools from AMD that empower developers to analyze, optimize, and deploy AI models efficiently across diverse hardware.

Digest AI

Digest is a powerful model analysis tool designed to help you extract valuable insights from your machine learning models, enabling optimization and direct modification. 

GAIA

GAIA is a generative AI application that demonstrates a multi-agent RAG pipeline running private and local LLMs on CPU, GPU and NPU hardware.

TurnkeyML & Lemonade

TurnkeyML simplifies the use of tools within the ONNX ecosystem by offering no-code CLIs and low-code APIs. With Turnkey, you can export and optimize ONNX models for CNNs and Transformers. With Lemonade, you can serve and benchmark LLMs on CPU, GPU, and NPU.

Sign Up for Ryzen AI News

Keep up-to-date on the latest product releases, news, and tips.

Footnotes
  1. Ryzen™ AI technology is compatible with all AMD Ryzen™ 7040 series processors except the Ryzen™ 5 7540U and Ryzen™ 3 7440U. OEM enablement is required. Please check with your system manufacturer for feature availability prior to purchase. GD-220.
  2. As of May 2023, AMD has the first available dedicated AI engine on an x86 Windows processor, where 'dedicated AI engine' is defined as an AI engine that has no function other than to process AI inference models and is part of the x86 processor die. For detailed information, please check: https://www.amd.com/en/technologies/xdna.html. PHX-3a
  3. Based on testing by AMD as of 6/5/2023. Battery life results evaluated by operation of a simulated nine-participant Microsoft Teams video conference using a Ryzen™ 7940HS processor with Ryzen™ AI and integrated Radeon graphics with Windows Studio Effects vs. NVIDIA Broadcast for AI-enhanced background blur and eye gaze correction features with NVIDIA GeForce RTX 4070 discrete graphics. AMD/NVIDIA systems run from power level 100% to > 5% at 150 nits brightness and power mode set to "power efficiency." System configurations: Razer Blade 14” laptop, AMD Ryzen™ 9 7940HS processor with Ryzen™ AI, integrated AMD Radeon Graphics (22.40.03.24 driver), 16GB (8GBx2) LPDDR5, NVMe SSD storage, Windows 11 Home 22H2, NVIDIA GeForce RTX 4070 graphics (528.92 driver) with NVIDIA Broadcast. System manufacturers may vary configurations, yielding different results. Results may vary. PHX-51