Profiling Heterogeneous Computing Performance with Intel® VTune™ Profiler

Programming heterogeneous platforms requires a deep understanding of the system architecture at all levels, which helps application designers choose the best data and work decomposition between the CPU and accelerator hardware such as GPUs. In many cases, however, applications are being converted from a conventional CPU programming language such as C++, or from accelerator-friendly but still low-level languages such as OpenCL, and the main problem is to determine which parts of the application would benefit from being offloaded to the GPU. Another problem is to estimate how much performance one might gain from acceleration on a particular GPGPU device. Each platform has unique limitations that affect the performance of offloaded computing tasks, e.g. data transfer tax, task initialization overhead, and memory latency and bandwidth limitations. To take those constraints into account, software developers need tooling that collects the right information and produces recommendations for making the best design and optimization decisions.

In this presentation we introduce two new GPU performance analysis types in Intel® VTune™ Profiler and a methodology for profiling the performance of heterogeneous applications that these analyses support. VTune Profiler is a well-known tool for performance characterization on CPUs; it now includes GPU Offload Analysis and GPU Hotspots Analysis for applications written with most offloading models, including OpenCL, SYCL/Data Parallel C++, and OpenMP Offload.

Vladimir Tsymbal, Senior Technical Consulting Engineer, Intel Corporation

Vladimir Tsymbal is a senior technical consulting engineer who specializes in teaching customers how to use a variety of Intel® Software Tools to develop, tune, and optimize their parallel applications on Intel® Architecture. In particular, he focuses on the Intel® Parallel Studio XE product suite and the analysis tools it contains, including Intel® VTune™ Profiler (which he helped develop), Intel® Advisor, and Intel® Inspector.

Prior to joining Intel in 2005, Vladimir worked as a research assistant, and developed hardware graphics accelerators and software and hardware systems for medical diagnostics. He holds a PhD in Mathematics and Computer Science from Taganrog State University of Radio Engineering, Russia.

Performance varies by use, configuration, and other factors. Learn more at