[How-To] Automatic1111 Stable Diffusion WebUI with DirectML Extension on AMD GPUs
Nov 30, 2023

Prepared by Hisham Chowdhury (AMD), Sonbol Yazdanbakhsh (AMD), Justin Stoecker (Microsoft), and Anirban Roy (Microsoft)
Microsoft and AMD continue to collaborate on enabling and accelerating AI workloads across AMD GPUs on Windows platforms. We published an earlier article about accelerating Stable Diffusion on AMD GPUs using the Automatic1111 DirectML fork.
Now we are happy to share that with the ‘Automatic1111 DirectML extension’ preview from Microsoft, you can run Stable Diffusion 1.5 on the base Automatic1111 WebUI with similar performance gains across the AMD GPUs mentioned in our previous post.
Fig 1: Up to 12x faster inference on AMD Radeon™ RX 7900 XTX GPUs compared to the default (non-ONNX Runtime) Automatic1111 path
Microsoft and AMD engineering teams worked closely to optimize Stable Diffusion to run on AMD GPUs, accelerated via the Microsoft DirectML platform API and AMD device drivers. The ML acceleration layers resident in the AMD device driver utilize AMD Matrix Processing Cores via WaveMMA intrinsics to accelerate DirectML-based ML workloads, including Stable Diffusion, Llama 2, and others.
Fig 2: OnnxRuntime-DirectML on AMD GPUs
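If you want to confirm that the ONNX Runtime DirectML execution provider shown in Fig 2 is visible on your system, you can run a quick check from Python. This is a minimal sketch; it assumes the onnxruntime-directml package is installed in the active environment:

```python
# Minimal check that ONNX Runtime can see the DirectML execution provider.
# Assumes the onnxruntime-directml package is installed:
#   pip install onnxruntime-directml
import onnxruntime as ort

providers = ort.get_available_providers()
print(providers)  # expect something like ['DmlExecutionProvider', 'CPUExecutionProvider']

assert "DmlExecutionProvider" in providers, "DirectML execution provider not found"
```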
Prerequisites:
- Git installed (Git for Windows)
- Anaconda/Miniconda installed (Miniconda for Windows)
- Ensure the Anaconda/Miniconda directory is added to PATH
- A system with an AMD Graphics Processing Unit (GPU)
- Driver: AMD Software: Adrenalin Edition™ 23.11.1 or newer (https://www.amd.com/en/support)
Olive is a Python tool that can be used to convert, optimize, quantize, and auto-tune models for optimal inference performance with ONNX Runtime execution providers like DirectML. Olive greatly simplifies model processing by providing a single toolchain to compose optimization techniques, which is especially important with more complex models like Stable Diffusion that are sensitive to the ordering of optimization techniques. The DirectML sample for Stable Diffusion applies the following techniques:
- Model conversion: translates the base models from PyTorch to ONNX.
- Transformer graph optimization: fuses subgraphs into multi-head attention operators and eliminates inefficiencies introduced by the conversion.
- Quantization: converts most layers from FP32 to FP16 to reduce the model's GPU memory footprint and improve performance.
Combined, the above optimizations enable DirectML to leverage AMD GPUs for greatly improved performance when performing inference with transformer models like Stable Diffusion.
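To illustrate how these passes compose into a single workflow, here is a minimal sketch of driving Olive from Python. The config file name is a placeholder: the actual workflow configs (which chain the conversion, graph optimization, and FP16 passes described above) ship with the Stable Diffusion example in the Microsoft Olive repository:

```python
# Sketch of running an Olive optimization workflow from Python.
# "config_unet.json" is a placeholder name; the real workflow configs ship
# with the Stable Diffusion example in the Microsoft Olive repository and
# chain conversion -> transformer graph optimization -> FP16 passes.
from olive.workflows import run as olive_run

# Executes every pass listed in the workflow config and writes the
# optimized ONNX model to the output directory named in that config.
olive_run("config_unet.json")
```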
Follow these steps to enable the DirectML extension on the Automatic1111 WebUUI and run it with Olive-optimized models on your AMD GPUs:
**Only Stable Diffusion 1.5 is currently supported with this extension
**Generate Olive-optimized models by following our previous post or the Microsoft Olive instructions before using the DirectML extension
**This setup has not been tested with multiple extensions enabled at the same time
Open an Anaconda terminal and run the following commands:
- conda create --name automatic_dmlplugin python=3.10.6
- conda activate automatic_dmlplugin
- git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
- cd stable-diffusion-webui
- webui.bat --lowvram --precision full --no-half --skip-torch-cuda-test
Open the Extensions tab
- Go to ‘Install from URL’ and paste in this URL: https://github.com/microsoft/Stable-Diffusion-WebUI-DirectML
- Click ‘Install’
Copy the UNet model optimized by Olive to the models\Unet-dml folder (a helper script is sketched after this step)
- Example: copy \models\optimized\runwayml\stable-diffusion-v1-5\unet\model.onnx to stable-diffusion-webui\models\Unet-dml\model.onnx.
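If you prefer to script this step, the sketch below copies the optimized UNet into place and then confirms that ONNX Runtime can load it with the DirectML execution provider. The source path is the example path from above; adjust it to wherever your Olive output was actually written:

```python
# Copy the Olive-optimized UNet into the WebUI's Unet-dml folder and verify
# that it loads with the DirectML execution provider. Adjust "src" to match
# where your Olive output was written.
import shutil
from pathlib import Path

import onnxruntime as ort

src = Path(r"models\optimized\runwayml\stable-diffusion-v1-5\unet\model.onnx")
dst = Path(r"stable-diffusion-webui\models\Unet-dml\model.onnx")

dst.parent.mkdir(parents=True, exist_ok=True)
shutil.copy(src, dst)

# Creating a session is enough to validate the model file and the provider.
session = ort.InferenceSession(str(dst), providers=["DmlExecutionProvider"])
print([inp.name for inp in session.get_inputs()])
```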
Return to the Settings menu in the WebUI interface
- Under Settings → User Interface → Quick Settings List, add sd_unet
- Click ‘Apply settings’, then ‘Reload UI’
Navigate to the "txt2img" tab of the WebUI interface
- Select the DML Unet model from the sd_unet dropdown
Run your inference!
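If you would rather script generation than click through the UI, the WebUI also exposes an HTTP API when launched with the additional --api flag. The sketch below sends a txt2img request to a default local instance; the payload shows only the most common fields, and the prompt is just an example:

```python
# Minimal txt2img request against a local Automatic1111 WebUI that was
# started with the --api flag. Saves the first returned image to disk.
import base64
import json
import urllib.request

payload = {
    "prompt": "a photo of an astronaut riding a horse",  # example prompt
    "steps": 20,
    "width": 512,
    "height": 512,
}

req = urllib.request.Request(
    "http://127.0.0.1:7860/sdapi/v1/txt2img",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

# Returned images are base64-encoded.
with open("output.png", "wb") as f:
    f.write(base64.b64decode(result["images"][0]))
```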
