The trend for today’s CPUs is core count … and lots of it. (Cases in point: 2nd Gen Intel® Xeon® Scalable processors scale up to 48 cores per CPU. And Intel® Xeon Phi™ processors have as many as 72!) In this environment, vectorizing your code is critical to delivering optimal application performance on core-rich nodes.
So how do you write vectorization-friendly code?
You start by identifying and removing barriers to vectorization, such as inefficient memory access patterns and poor cache usage, and by balancing multi-process programming (MPI) with multi-threaded programming (OpenMP).
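As a rough illustration of what such a barrier looks like, here is a minimal sketch in C. It is not taken from the webinar, and the function names `saxpy` and `prefix_sum` are ours, chosen only for this example. The first loop walks memory with unit stride and has no dependence between iterations. The second carries a dependence from one iteration to the next.

```c
#include <stddef.h>

/* Unit-stride access, no dependence between iterations: a good
   candidate for vectorization. `restrict` promises the compiler
   that x and y do not overlap. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    /* Hint to an OpenMP-aware compiler; ignored if OpenMP SIMD
       support is not enabled. */
    #pragma omp simd
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Loop-carried dependence: iteration i reads the value written by
   iteration i - 1, so a compiler will typically not vectorize the
   loop in this form. */
void prefix_sum(size_t n, float *v)
{
    for (size_t i = 1; i < n; ++i)
        v[i] += v[i - 1];
}
```

To see which loops were actually vectorized, ask the compiler for an optimization report, for example with GCC's `-fopt-info-vec` or the Intel compiler's `-qopt-report`. The output depends on the compiler version and target.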
Watch Ian Wang, HPC specialist from the University of Texas, discuss these concepts, including:
- The basics of vector-aware programming, dependency analysis, and optimization reports
- Guidance in using vector units, the proper placement of tasks/threads, the efficient use of memory bandwidth, and the impact of frequency scaling
- Software tools of the trade, including Intel® Math Kernel Library and Intel® C++ Compilers
- Code samples and step-by-step instructions