Introduction

As neural networks become larger and more complex, the demand for higher compute density with lower power continues to grow. Discover how AMD XDNA™ NPU architecture addresses this imperative.

AMD XDNA - AI Engine

AMD XDNA is a spatial dataflow NPU architecture consisting of a tiled array of AI Engine processors. Each AI Engine tile includes a vector processor, a scalar processor, and local data and program memories. Unlike traditional architectures, which must repeatedly fetch data from caches at a cost in energy, the AI Engine uses on-chip memories and a custom dataflow to enable efficient, low-power computing for AI and signal processing.
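
As a rough mental model of the dataflow idea, the plain C++ sketch below (not AMD's actual API; all names are illustrative) keeps each stage's coefficients in tile-local storage and streams results tile to tile, so operands never round-trip through a shared cache hierarchy:

```cpp
// Conceptual sketch of a spatial dataflow pipeline: each "tile" holds its
// working set in local memory and streams results to its neighbor.
#include <array>
#include <cstddef>

constexpr std::size_t kLocalWords = 256; // stand-in for a tile's local data memory

struct Tile {
    std::array<float, kLocalWords> local; // coefficients/weights live on-tile
    // Each tile applies its locally stored coefficient and forwards the result.
    float process(float in, std::size_t i) const { return in * local[i % kLocalWords]; }
};

// Data flows spatially through the array: tile N's output is tile N+1's input,
// so no stage needs to fetch operands from a shared cache.
float run_pipeline(const std::array<Tile, 4>& tiles, float sample, std::size_t i) {
    float v = sample;
    for (const Tile& t : tiles) v = t.process(v, i);
    return v;
}
```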

Inside the Tile

At the heart of each AI Engine tile is a VLIW (Very Long Instruction Word), SIMD (Single Instruction, Multiple Data) vector processor optimized for machine learning and advanced signal processing applications. The AI Engine processor can run at over 1.3 GHz, enabling efficient, high-throughput, low-latency functions. Each tile also contains program memory and local data memory for storing weights, activations, and coefficients; a RISC scalar processor; and multiple modes of interconnect to handle different types of data communication.
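
To make the data-level parallelism concrete, here is a minimal multiply-accumulate loop of the kind the vector unit is built for. On the AI Engine, a single vector instruction performs many of these MACs per cycle; this scalar C++ version simply spells out the independent lanes:

```cpp
// Minimal sketch of SIMD-friendly work: a multiply-accumulate over a block of
// activations and weights. Each iteration is independent, so a vector processor
// can execute many lanes of this loop in one instruction.
#include <cstddef>

float mac_block(const float* activations, const float* weights, std::size_t n) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        acc += activations[i] * weights[i]; // independent lanes -> vectorizable
    }
    return acc;
}
```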

AMD XDNA 2

The next-generation AMD XDNA 2 architecture is built for Generative AI experiences in PCs, delivering exceptional compute performance, bandwidth, and energy efficiency.

Benefits

Software Programmable

The AMD NPU is software programmable, and designs compile in minutes. It also leverages a library-based design flow that simplifies the workflow for ML framework developers.
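
As a hedged illustration of what a library-based flow can look like (the types and names below are hypothetical, not AMD's toolchain API), kernels are expressed as ordinary C++ functions and composed into a graph, leaving placement and scheduling across tiles to the compiler:

```cpp
// Hypothetical library-based workflow: kernels are plain C++ callables, and a
// small graph object chains them; a real toolchain would map each stage to tiles.
#include <functional>
#include <vector>

using Kernel = std::function<std::vector<float>(const std::vector<float>&)>;

struct Graph {
    std::vector<Kernel> stages; // each stage would map to one or more tiles
    std::vector<float> run(std::vector<float> data) const {
        for (const Kernel& k : stages) data = k(data);
        return data;
    }
};

// Usage sketch: Graph g{{scale_kernel, relu_kernel}}; auto out = g.run(input);
```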

Deterministic

The AMD NPU includes dedicated instruction and data memories, along with dedicated interconnect paired with DMA engines, so data movement between AI Engine tiles is explicitly scheduled rather than left to a cache hierarchy.
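
One way to picture this determinism is a statically scheduled ping-pong buffer: while the core computes on one buffer, the DMA fills the other, and with no cache misses each iteration's timing is fixed. The C++ below is a conceptual model with illustrative stand-ins, not AMD's DMA interface:

```cpp
// Conceptual ping-pong (double-buffer) schedule: transfers and compute overlap
// in a pattern that is fully known at compile time.
#include <array>
#include <cstddef>

constexpr std::size_t kBuf = 128;

// Stand-in for a DMA transfer that the toolchain would schedule ahead of time.
void dma_fill(std::array<float, kBuf>& buf, std::size_t iter) {
    buf.fill(static_cast<float>(iter));
}

// Stand-in for the tile's kernel work on one buffer.
float compute(const std::array<float, kBuf>& buf) {
    float s = 0.0f;
    for (float v : buf) s += v;
    return s;
}

float run(std::size_t iters) {
    std::array<float, kBuf> ping{}, pong{};
    float acc = 0.0f;
    if (iters > 0) dma_fill(ping, 0); // prologue: first scheduled transfer
    for (std::size_t i = 0; i < iters; ++i) {
        auto& cur  = (i % 2 == 0) ? ping : pong;
        auto& next = (i % 2 == 0) ? pong : ping;
        if (i + 1 < iters) dma_fill(next, i + 1); // next transfer overlaps compute
        acc += compute(cur);
    }
    return acc;
}
```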

Efficient

The AMD NPU delivers greater compute density than traditional architectures while maintaining exceptional power efficiency.

Scalable

The AMD NPU is architected as a 2D array of AI Engine tiles, enabling designs to scale from tens to hundreds of tiles in a single device and servicing the compute needs of a broad range of applications.
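
A minimal sketch of the scaling idea (the dimensions and types are illustrative): the same tile definition is replicated into a 2D grid, so a design can grow from tens to hundreds of tiles without changing the per-tile code:

```cpp
// Replicating one tile definition across a 2D grid; only the grid dimensions
// change between a small device and a large one.
#include <cstddef>
#include <vector>

struct AieTile { int row, col; }; // placeholder for a full tile definition

std::vector<AieTile> make_array(int rows, int cols) {
    std::vector<AieTile> grid;
    grid.reserve(static_cast<std::size_t>(rows) * cols);
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            grid.push_back({r, c});
    return grid;
}

// e.g. make_array(4, 8) for tens of tiles; make_array(10, 40) for hundreds
```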

Related Products

AMD Versal™ Adaptive SoCs

AMD Versal adaptive SoCs deliver outstanding performance for a wide variety of embedded applications, from automotive and communications to data center, industrial, and beyond.