Developers often struggle to understand the trade-offs between compute utilization and memory bottlenecks. A good way to unravel the knot is the Roofline model, an analysis that offers an intuitive picture of how an application's performance relates to the limits of the hardware it runs on.
But interpreting Roofline results can require deep knowledge of the application to pinpoint the source of memory bottlenecks.
In this second installment of our 2-part Roofline series (watch Part 1 here), Technical Consulting Engineer Cedric Andreolli discusses this issue in a deeper exploration of Intel® Advisor’s Integrated Roofline Analysis feature, which uses a cache simulator to understand the behavior of each cache level.
Watch now to:
- Get an overview of the Integrated Roofline Analysis feature, including how it provides in-depth information about the memory hierarchy (L1, L2, L3, or DRAM), identifying the specific location of bottlenecks
- Identify how much memory bandwidth each hierarchy level is achieving compared with the maximum potential at that level
- Explore a real-world example of the optimization process
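
For readers who want the underlying arithmetic, the idea of a per-level roofline can be sketched in a few lines of Python. This is a minimal illustration of the general Roofline formula (attainable performance is the lesser of the compute peak and arithmetic intensity times bandwidth), applied once per memory level. It is not Intel® Advisor's implementation, and every number and name below (`PEAK_GFLOPS`, `BANDWIDTH_GBPS`, the traffic figures) is a hypothetical value chosen for the example.

```python
# Hypothetical machine limits, chosen only for illustration (not measured values).
PEAK_GFLOPS = 1000.0
BANDWIDTH_GBPS = {"L1": 4000.0, "L2": 1500.0, "L3": 600.0, "DRAM": 100.0}


def attainable_gflops(arithmetic_intensity, bandwidth_gbps, peak_gflops=PEAK_GFLOPS):
    """Roofline bound: min(compute peak, arithmetic intensity * bandwidth)."""
    return min(peak_gflops, arithmetic_intensity * bandwidth_gbps)


def roofline_by_level(flops, bytes_by_level):
    """Compute the roofline bound at each memory level.

    Arithmetic intensity (FLOP/byte) is computed separately per level,
    using the bytes moved through that level.
    """
    bounds = {}
    for level, bytes_moved in bytes_by_level.items():
        intensity = flops / bytes_moved
        bounds[level] = attainable_gflops(intensity, BANDWIDTH_GBPS[level])
    return bounds


def limiting_level(bounds):
    """Return the memory level with the lowest attainable performance."""
    return min(bounds, key=bounds.get)


# Hypothetical loop: 2e9 floating-point operations, with made-up traffic per level.
traffic = {"L1": 16e9, "L2": 8e9, "L3": 4e9, "DRAM": 2e9}
bounds = roofline_by_level(2e9, traffic)

for level, gflops in bounds.items():
    print(f"{level}: at most {gflops:.0f} GFLOP/s")
print("Most restrictive level:", limiting_level(bounds))
```

With these made-up inputs the DRAM roof is the lowest, so DRAM traffic would be the first place to look. In a real analysis, the traffic and bandwidth figures would come from measurement or simulation rather than hand-picked constants.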
Get the software
- Download Intel® Advisor as part of the Intel® oneAPI Base Toolkit, a core set of tools and libraries for creating data-centric, cross-architecture applications.
- Sign up for an Intel® DevCloud for oneAPI account—a free development sandbox with access to the latest Intel® hardware and oneAPI software.