CPU Profiler – The AMD uProf CPU profiler collects profile data by statistical sampling to identify performance bottlenecks in an application. Data collection can be triggered by the OS timer, core PMC (Performance Monitoring Counter) events, or IBS (Instruction-Based Sampling). AMD uProf offers an intuitive UI to view and analyze the profile data, which helps optimize a wide variety of applications, drivers, game engines, and so on.
User mode Sampling and Tracing:
Provides the Hotspots profile type to identify the functions that consume the most inclusive and exclusive time, supporting C, C++, Java, and Python applications on Linux. It performs callstack stitching for OpenMP applications to provide accurate total inclusive time for OpenMP compute functions. It also provides mixed-mode callstacks for Python applications.
Threading Analysis profile types help visualize thread state timelines.
The Overview profile type helps visualize the runtime behavior of heterogeneous applications (on MI300A).
OS Tracing – The OS and runtime libraries can be traced alongside CPU sampling-based profiles to provide timeline views of what is happening in the system while an application is running. The following events can be collected and analyzed: OS scheduling events, system calls, POSIX threads (pthread) library thread synchronization APIs, block I/O calls, page faults, and memtrace.
OpenMP, MPI Tracing – Trace analysis can be used to analyze compute and load imbalance among the MPI ranks and OpenMP worker threads of HPC workloads.
GPU Profiler – Provides information such as performance statistics on GPU hardware components, kernels, and dispatches.
GPU Tracing – Traces the HIP and HSA APIs and GPU activities while a HIP-based application is executing.
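To make the inclusive and exclusive time distinction used by the Hotspots profile type concrete, here is a minimal Python sketch of how sampled callstacks can be attributed to functions. It is an illustration of the general idea only, not AMD uProf's implementation or API: the function name `attribute_samples`, the sample data, and the callstack representation (outermost caller first, leaf last) are all assumptions made for this example.

```python
from collections import Counter


def attribute_samples(callstacks):
    """Count inclusive and exclusive samples per function.

    `callstacks` is a hypothetical input format: a list of callstacks,
    each ordered from the outermost caller to the leaf function that
    was executing when the sample was taken.
    """
    inclusive = Counter()
    exclusive = Counter()
    for stack in callstacks:
        if not stack:
            continue
        # Exclusive: only the leaf function was actually running.
        exclusive[stack[-1]] += 1
        # Inclusive: every function on the stack was "in progress".
        # set() ensures a recursive function is counted once per sample.
        for func in set(stack):
            inclusive[func] += 1
    return inclusive, exclusive


# Made-up samples for illustration.
samples = [
    ["main", "solve", "dot"],
    ["main", "solve", "dot"],
    ["main", "solve"],
    ["main", "io_write"],
]
inclusive, exclusive = attribute_samples(samples)
for func in sorted(inclusive, key=inclusive.get, reverse=True):
    print(f"{func:10s} inclusive={inclusive[func]} exclusive={exclusive[func]}")
```

With periodic sampling, each count is proportional to time spent, so a function with a high inclusive but low exclusive count (such as `main` here) spends its time in callees rather than in its own code.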
AMD uProf’s ANALYZE page: View hotspots at function level
AMD uProf’s Top-Down analysis page
AMD uProf’s Overview Analysis
AMD uProf’s timeline across all threads
AMD uProf’s MPI Communication Matrix: Analyze message passing load imbalance
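As a rough illustration of what load imbalance among ranks means, the following Python sketch computes one common imbalance measure from per-rank busy times. The metric chosen here, (max - mean) / max, is an assumption for this example and is not necessarily the one AMD uProf reports; the function name `load_imbalance` and the timing values are likewise hypothetical.

```python
def load_imbalance(busy_times):
    """Return (max - mean) / max for a list of per-rank busy times.

    0.0 means perfectly balanced work; values approaching 1.0 mean most
    ranks sit idle waiting for the slowest one. This metric is an
    assumption for illustration, not AMD uProf's definition.
    """
    if not busy_times:
        raise ValueError("busy_times must not be empty")
    peak = max(busy_times)
    if peak == 0:
        return 0.0
    mean = sum(busy_times) / len(busy_times)
    return (peak - mean) / peak


# Made-up compute times in seconds for four MPI ranks.
rank_busy_seconds = [8.0, 8.0, 8.0, 16.0]
imbalance = load_imbalance(rank_busy_seconds)
print(f"load imbalance: {imbalance:.3f}")
```

In this made-up case, one rank does twice the work of the others, so the other three ranks are idle for a large share of the run. The same calculation could be applied to OpenMP worker threads within a rank.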