Domain-Specific AI at Scale: Open Models, Post-Training, and AI Infrastructure
Abstract
Learn how domain-specific AI moves beyond generic models using post-training, domain evals, and scalable open infrastructure. Using Open Telco Models as a case study, this session covers curated data, reward loops, unified training and serving, and AMD Instinct/ROCm-based stacks for building specialized AI systems at enterprise scale.
Related Sessions
-
ElMerFold: Exascale AI for Protein Structure Prediction with El Capitan
Protein structure prediction is foundational to modern biology, enabling breakthroughs in drug discovery, enzyme engineering, and AI-driven science. We present ElMerFold, a production-scale synthetic data generation workflow running on the El Capitan system at 11,000 nodes and 44,000 APUs. ElMerFold processes ~41 million proteins at 2,378 structures/s, achieving a 16.3× improvement over prior approaches and reaching 969 PFLOP/s FP32 inference performance.
July 23, 2026
-
Inside AMD IT: Our Enterprise AI Journey
How is AMD using AI to transform its own enterprise IT? Go inside AMD IT's AI journey – from early experimentation to scaled deployment. Learn about AMD IT’s multi-year AI strategy and its strategic investment in a data platform that enables AI to automate mega process workflows, accelerate delivery, and drive measurable business outcomes. Whether you are starting your AI journey or scaling existing initiatives, this insider view offers practical lessons on strategy, adoption, and change.
July 23, 2026
-
Build an MRI Analysis Agent with AMD Blueprints
Build and deploy an AI-powered MRI analysis agent in minutes using the AMD mri-doc Solution Blueprint. Run a Gradio-based pipeline that accepts DICOM, NIfTI, and standard image formats, applies tissue segmentation and anomaly detection, and generates LLM-drafted clinical reports on AMD Instinct GPUs. Then customize: swap the LLM AIM, reuse an existing model endpoint, or extend the pipeline for your specific clinical workflow.
July 23, 2026
-
How to Right-size Your Memory
Memory has rarely been in such short supply, and the shortage is impeding customers' data center refresh plans. In this interactive conversation, we’ll discuss tips and tools for right-sizing memory configurations to help move your data center efficiency initiatives forward and preserve ROI. Bring your questions and our experts will provide answers!
July 23, 2026
-
Training at Scale with AMD Primus
Primus makes large-scale training on Instinct reliable, debuggable, and highly performant. It supports the latest OSS training frameworks and models, and is expanding support to new, cutting-edge model architectures, training techniques, and datatypes. Primus’ SOTA pre- and post-training performance, proven at scales of thousands of GPUs, positions Instinct as a competitive solution for model development at frontier labs, enterprises, and AI startups.
July 23, 2026
-
Creating CPU Inference and Agentic Performance Transparency
Your finance team doesn't care about tokens per second. They care about predictable costs, compliance risk, and vendor lock-in. With agentic AI, the metrics for tracking success are even more complex. But benchmarks don't answer the question that actually matters: Should you undertake this effort and is it viable for your business? In this interactive technical discussion, we’ll break down the tradeoffs, work through the math, and pressure-test the strategy together.
July 23, 2026
-
Benchmarking AI Systems: from Model Metrics to Real-World Performance
The agentic AI stack has evolved toward fast multi-model orchestration, tool-augmented reasoning, and long-running inference chains. The hardware conversation hasn't kept up, and many teams default to one GPU vendor without evaluating alternatives. This interactive session is for builders who want to learn what they're missing. We'll review head-to-head benchmark data from third-party testing, discuss production-ready serving stacks on ROCm, and break down TCO for teams running multi-step agents at scale.
July 23, 2026
-
Agentic Kernel Performance Tuning with AMD ROCm
This session introduces an agentic kernel development workflow for optimizing AI and HPC workloads on AMD ROCm. Learn how a self-directing optimization loop can profile, analyze, optimize, validate, and generate production-ready kernel improvements with minimal manual tuning. The talk highlights how AMD is accelerating kernel engineering by condensing weeks of performance optimization effort into an automated, scalable workflow for developers and performance engineers.
July 23, 2026