Intel oneAPI News Updates
Release beta08 of Intel® oneAPI Products Now Live
Highlights include the introduction of Intel® Distribution of Modin and OmniSci for distributed (and accelerated) data analytics preprocessing, up to 4x improved rendering speed and particle volume support in the Intel® oneAPI Rendering Toolkit, introduction of Performance Snapshot profiling in Intel® VTune™ Profiler for quick initial analysis, memory-level roofline analysis in Intel® Advisor, H.265 and AV1 CPU software codecs in Intel® oneAPI Video Processing Library, and NUMA optimization capabilities in Intel® oneAPI Threading Building Blocks.
- Major Intel® oneAPI Video Processing Library (oneVPL) update, including H.265 & AV1 CPU software codecs and upward compatibility with Intel® Media SDK.
- Major Intel® oneAPI Threading Building Blocks (oneTBB) update including detailed NUMA affinity management capabilities and alignment with modern C++.
- Improved Intel® oneAPI DPC++ Compiler code performance for CPU architectures.
- Intel VTune Profiler continues to refine analysis for GPU accelerators with the addition of OpenMP offload pragma-aware metrics. It also adds a Performance Snapshot as a first profiling step that suggests which detailed analyses (memory, threading, etc.) offer the most optimization opportunity.
- Intel® Advisor adds memory-level Roofline analysis that helps pinpoint exact memory hierarchy bottlenecks (L1, L2, L3 or DRAM).
- Initial OpenMP 5.0 GPU offload support in the Intel® C++ Compiler.
- Intel® AI Analytics Toolkit adds significant enhancements to data analytics workflows by introducing Intel® Distribution of Modin, released through the Anaconda channel. Seamlessly scale data preprocessing across multiple nodes using this intelligent, distributed dataframe library with an identical API to Pandas. In the backend, it is supported by OmniSci, a performant framework for end-to-end analytics that has been optimized to harness the computing power of existing and emerging Intel® hardware.
- Intel AI Analytics Toolkit also upgrades to PyTorch 1.5, which includes support for Bfloat16 data type and the latest 3rd Gen Intel® Xeon® Scalable Processors (codenamed Cooper Lake).
- The Intel® Distribution for Python introduces GPU support for Python/Numba code on Linux and the Python Data Parallel Processing Library (PyDPPL), a lightweight Python wrapper for DPC++ and SYCL that provides a data parallel interface and abstractions to efficiently tap into device management features of Intel® CPUs and GPUs.
- Intel® OSPRay and Intel® Open Volume Kernel both add support for particle volumes, while Intel OSPRay also adds support for Stereo 3D mode for panoramic camera and scalability of light sources.
- Performance improvements in Intel Open Volume Kernel and Intel® Open Image Denoise improved rendering speeds by up to 4x and improved image quality.
- Photon mapping support added to Intel® Embree.
- Intel Open Volume Kernel also adds support for configurable filter/reconstruction methods, stream-wide sampling and gradient API, Iterator allocation API, and strided data arrays.
- Intel Open Image Denoise improved image quality by adding additional Feature Buffers. It also includes new XTraining Code features and improvements.
- macOS CPU support introduced for the Intel® oneAPI Rendering Toolkit as well as C++ and Fortran compilers, most of the libraries in the Intel® oneAPI Base Toolkit, and analysis tool results viewers.
- New support for Singularity containers.
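The memory-level Roofline item above rests on a simple bound: attainable performance is the smaller of the compute peak and the memory bandwidth multiplied by arithmetic intensity, evaluated once per memory level. The following is a minimal sketch of that arithmetic only; it is not Intel® Advisor's implementation, and every number and name in it (`PEAK_GFLOPS`, `LEVELS`, `INTENSITY`) is a hypothetical value chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryLevel:
    name: str
    bandwidth_gbs: float  # sustained bandwidth in GB/s (hypothetical value)


def attainable_gflops(peak_gflops: float, bandwidth_gbs: float, intensity: float) -> float:
    """Classic roofline bound: min(compute peak, bandwidth * FLOPs-per-byte)."""
    return min(peak_gflops, bandwidth_gbs * intensity)


def bounding_level(peak_gflops, levels, intensities):
    """Return (level name, bound in GFLOP/s) for the lowest roof."""
    bounds = {
        lvl.name: attainable_gflops(peak_gflops, lvl.bandwidth_gbs, intensities[lvl.name])
        for lvl in levels
    }
    name = min(bounds, key=bounds.get)
    return name, bounds[name]


# Hypothetical machine: compute peak and per-level bandwidths.
PEAK_GFLOPS = 1000.0
LEVELS = [
    MemoryLevel("L1", 4000.0),
    MemoryLevel("L2", 1500.0),
    MemoryLevel("L3", 600.0),
    MemoryLevel("DRAM", 100.0),
]

# Hypothetical loop: FLOPs per byte of traffic observed at each level.
INTENSITY = {"L1": 0.25, "L2": 0.5, "L3": 1.0, "DRAM": 4.0}

if __name__ == "__main__":
    level, bound = bounding_level(PEAK_GFLOPS, LEVELS, INTENSITY)
    print(f"Lowest roof: {level} at {bound:.0f} GFLOP/s")
```

With these made-up inputs the DRAM roof is the lowest, which is the kind of signal a memory-level Roofline chart is meant to surface.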
Meet Huddl.ai, the Future of Remote Collaboration
Slack meets Zoom meets Google Drive … in this new video collaboration platform, backed by prominent investors including former San Francisco 49ers player Ronnie Lott, and powered by Intel hardware and software technologies. Poised to transform remote collaboration, this solution automatically manages all meeting content via a real-time collaborative notes application, an automatic speech recognition function that turns speech into searchable text, and a recommendation engine that suggests meeting agendas based on participants.
“The future of remote collaboration is more than just audio-video. It’s about solutions that leverage deep learning and advanced media capabilities to help people be more productive at solving problems regardless of their location. Huddl.ai achieves high scalability and an innovative AI-based meeting experience through deployment of its cloud data platform on Intel® Xeon® processor-based servers, and integration of Intel’s Open WebRTC Toolkit for video conferencing and Intel® Distribution of OpenVINO™ toolkit, powered by oneAPI, for face and text detection.” — Jeff McVeigh, Vice President, Datacenter XPU Products and Solutions, Intel
“Huddl.ai is, simply put, the virtual meeting platform of the 21st century. While using Huddl.ai at Nutanix, we’ve experienced better meetings overall, but more specifically we’ve seen greater follow-up on things discussed in meetings and more efficient use of our scheduled meeting times.” — Wendy M. Pfeiffer, CIO, Nutanix
Collaboration between Bentley and Intel Leads the Future for Digital Craftsmanship
As the new Bentley Bentayga joins the portfolio of luxury cars on Bentley’s online car configurator, this configurator establishes a bar for the future of digital craftsmanship. It uses real-time, highly accurate visualization made possible through the advanced ray tracing capabilities of Intel® OSPRay, a component of the Intel® oneAPI Rendering Toolkit, to render over 1.7 million images, delivering seemingly limitless options to customers for the full Bentley model range.
The configurator now renders these images faster than ever before, with a 33% improvement in finding errors despite a 600% increase of content, powered by Intel® architecture, integration of artificial intelligence (AI) into OSPRay, and data and feedback from Bentley. The Bentley configurator aims to inspire customers in their auto purchase experiences.
Celebrating Women Innovators: Two Trailblazers Who Are Advancing Technology
Technology is designed for diverse individuals with unique needs, so it only makes sense that it is best built by diverse communities, whether that diversity is in gender, race, culture, religion, sexual orientation, thought, or more. So many women have advanced science, technology and other fields of innovation, yet when Forbes released its 2020 list of 100 Most Innovative People in Business, there was only one woman on that list. Here, we celebrate the stories of women who are using the DevCloud for oneAPI to further innovation. One project is funded by the Spanish government and published by Springer, and the other is in its infancy; these are only a start. Both hold the promise and spirit of innovation that captivates and uplifts us to reach for more.
Intel Contributes Advanced oneAPI DPC++ Capabilities to the SYCL 2020 Provisional Spec
Today, The Khronos Group, an open consortium of industry-leading companies creating graphics and compute interoperability standards, announced its SYCL 2020 Provisional Specification, for which Intel has made significant contributions through new programming abstractions. These new capabilities accelerate heterogeneous parallel programming for HPC, machine learning, and compute-intensive applications.
“The SYCL 2020 Provisional Specification marks a significant milestone helping improve time-to-performance in programming heterogeneous computing systems through more productive and familiar C++ programming constructs,” said Jeff McVeigh, Intel vice president of Datacenter XPU Products and Solutions. “Through active collaboration with The Khronos Group, the new specification includes significant features pioneered in oneAPI’s Data Parallel C++, such as unified shared memory, group algorithms and sub-groups that were upstreamed to SYCL 2020. Moving forward, Intel’s oneAPI toolkits, which include the SYCL-based Intel® oneAPI DPC++ Compiler, will deliver productivity and performance for open, cross-architecture programming.”
See these references for more details.
- The Khronos Group: SYCL 2020 Provisional Specification press release
- Intel PR Partner article: Intel Contributes Advanced oneAPI DPC++ Capabilities to the SYCL 2020 Provisional Spec
- InsideHPC: New, Open DPC++ Extensions Complement SYCL and C++, which notes how Intel is advancing industry standards
- oneAPI Code Together Podcast: Collaborating to Build a Heterogeneous Future – An interview with Ronan Keryell of Xilinx and Jeff Hammond at Intel that explains the value of open languages and programming models, diving into ISO C++, what excites them most about the SYCL 2020 Provisional Specification, and more [20 min]
TensorFlow on oneAPI Industry Innovation
With the growth of AI, machine learning, and data-centric applications, the industry needs a programming model that allows developers to take advantage of rapid innovation in processor architectures. TensorFlow supports the oneAPI industry initiative and its standards-based open specification. oneAPI complements TensorFlow’s modular design and provides increased choice of hardware vendor and processor architecture, and faster support of next-generation accelerators. TensorFlow uses oneAPI today on Xeon processors and we look forward to using oneAPI to run on future Intel architectures.
SeRC & Intel form first oneAPI Academic Center-of-Excellence
The Swedish e-Science Research Center (SeRC) announced that it has extended its support of the oneAPI initiative as Intel’s first oneAPI academic center of excellence (COE). Hosted at Stockholm University and the KTH Royal Institute of Technology, the center will use oneAPI’s unified and heterogeneous programming model to accelerate research conducted with GROMACS, a widely used free and open-source application designed for molecular dynamics simulation.
Release Beta07 of Intel® oneAPI Products Now Live
Highlights include significant new capabilities in the Intel® AI Analytics Toolkit (Model Zoo, GPU support for DBSCAN and SVM algorithms, CPU optimizations for scikit-learn algorithms); improved DPC++ compiler performance, language definition and Intel DPC++ Compatibility Tool; enhancements to Intel® VTune™ Profiler and Intel® MPI Library; and new Intel® System Debugger capabilities.
- New Model Zoo in the Intel® AI Analytics Toolkit, including pretrained models and sample scripts for many popular open source deep learning topologies optimized for Intel architectures.
- Incorporates GPU support for DBSCAN and SVM algorithms in Intel® AI Analytics Toolkit, along with many CPU optimizations for scikit-learn algorithms. Includes scikit-ipp 1.0.0, a drop-in replacement for the scikit-image package that accelerates image processing functions, as well as the XGBoost 1.1 release with the latest histogram tree grow method optimized for Intel CPUs for faster training.
- Improved DPC++ compiler performance for CPU platforms.
- Simplified, modernized DPC++ language definition through use of newer standard C++ language features.
- Intel® VTune™ Profiler now supports the latest Intel GPUs—Gen9 and Gen11 integrated graphics, and pre-released Gen12 integrated and discrete GPUs—and incorporates an improved GPU Memory Hierarchy diagram annotated with GPU hardware performance metrics.
- Intel® MPI Library introduces initial GPU pinning support for Intel Xe architecture devices and expanded support for Mellanox ConnectX.
- Improved migration of CUDA math, texture, and parallelism library calls in the Intel DPC++ Compatibility Tool.
- Intel® System Debugger now provides a new auto-detection mechanism in the target connection assistant that helps quickly establish a system debug connection to a target platform. Enhanced system TraceCLI configuration support also allows developers to easily set up this interface in both interactive and scripting modes, and a system debug sample enables developers to easily explore and learn to use the system debug capabilities.
Try oneAPI today. Beta07 is available now in the oneAPI DevCloud and via web download, containers, and repositories.
Software Innovators Contribute to the COVID-19 Response
The coronavirus (COVID-19) global pandemic has united our communities, even as we adhere to shelter-in-place stipulations. Never have science and technology been more important in helping us navigate these extraordinarily challenging times to emerge stronger. Here, we highlight a few of the Intel software innovators who are pursuing worthwhile medical research projects to contribute to this critical effort. Their projects span objectives—from speeding detection and uncovering new treatments to slowing the spread of the virus—and we are honored to support them in their endeavors. We hope all of our colleagues and communities around the world are keeping healthy and staying safe.
New Study Finds oneAPI Programming Model Saves Time and Money
A new research report from J.Gold Associates, “oneAPI: Software Abstraction for a Heterogeneous Computing World”, details the enterprise and developer benefits of transitioning to oneAPI.
Key Takeaway: Moving to a cross-architecture model for application development can save an organization significant time and money—over 5 months and $300,000 each time a performance-sensitive application is moved to a new computing platform. Read the report now.
Release beta06 of Intel® oneAPI products is now live.
Highlights include support for the Intel® Stratix® 10 FPGA family, extensive new data science capabilities (Intel® Scalable Dataframe Compiler for high-performance Pandas), major deep-learning framework improvements (bfloat16 datatype support in TensorFlow and the addition of torchvision to PyTorch for higher performance), and new rendering capabilities (support of VDB volumes, new geometries, new light sources, and the option to use pre-trained models and retrain filter models for denoising).
- Select Intel® oneAPI tools now support the Intel Stratix 10 FPGA family via the Intel® FPGA Programmable Acceleration Card D5005. (Note, this is in addition to current support of the Intel® Arria® 10 family.) Supported tools are:
  - Intel® oneAPI DPC++ Compiler
  - Intel® oneAPI DPC++ Library
  - Intel® Advisor
  - Intel® VTune™ Profiler
- New DPC++ CPU and GPU function support in Intel® oneAPI Math Kernel Library (oneMKL) for BLAS, LAPACK, RNG, and FFT functions.
- New DPC++ code samples added and others improved, including a new Mandelbrot visualization sample.
- Many improvements to the Intel® DPC++ Compatibility Tool, including broader CUDA code migration coverage for memory management and the USM-enabled cuRAND API, and more concise DPC++ output code.
- New data science capabilities including:
  - Intel Scalable Dataframe Compiler in the Intel® Distribution for Python for high-performance Pandas on CPUs
  - uint8 support in XGBoost for reduced memory footprint
  - optimized implementations of random forest, AdaBoost, and gradient boosting classifiers in Scikit-learn for high-performance ensemble learning
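- Deep learning framework improvements including:
  - bfloat16 datatype support in TensorFlow
  - addition of torchvision to PyTorch for higher performance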
- New rendering capabilities of Intel® oneAPI Rendering Toolkit, including:
  - Intel® Open Volume Kernel Library now supports VDB volumes and volume observers
  - Intel® OSPRay now enables easy rendering of clipping geometries, plane geometries, and new light sources for creating natural sun light and photometric indoor lighting
  - Intel® Embree now includes round, linear curves featuring a new curve primitive for rendering hair quickly
  - Intel® Open Image Denoise adds the option to use pre-trained models and retrain filter models with user-defined datasets to improve image quality for specific renderers and content
- Intel® System Debugger now supports Python 3 to run modern debug scripts. It also provides a new intuitive system debug interface for Intel® Processor Trace. The Intel® Debug Extensions for WinDbg now support Windows Core OS and efficient ACPI Machine Language debug.
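The bfloat16 datatype mentioned in the highlights for this release keeps the sign bit and all 8 exponent bits of a 32-bit float but only the top 7 mantissa bits, so it covers the same range at lower precision. The sketch below shows that bit layout using only the Python standard library. It is an illustration of the format, not how TensorFlow or Intel hardware implements the conversion, and the helper names are hypothetical.

```python
import math
import struct


def float32_bits(x: float) -> int:
    """Return the IEEE 754 binary32 bit pattern of x as an unsigned integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def to_bfloat16_bits(x: float) -> int:
    """Convert x to a 16-bit bfloat16 pattern using round-to-nearest-even."""
    if math.isnan(x):
        return 0x7FC0  # a quiet NaN; the payload is not preserved in this sketch
    bits = float32_bits(x)
    # Add 0x7FFF plus the lowest kept bit so that ties round to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def from_bfloat16_bits(b: int) -> float:
    """Expand a bfloat16 pattern back to a Python float (zero-fill the low 16 bits)."""
    return struct.unpack(">f", struct.pack(">I", (b & 0xFFFF) << 16))[0]


if __name__ == "__main__":
    for value in (1.0, math.pi, 1e38):
        b = to_bfloat16_bits(value)
        print(f"{value!r:>22} -> 0x{b:04X} -> {from_bfloat16_bits(b)!r}")
```

Running the loop shows that 1.0 survives exactly, pi loses precision after the second decimal place, and a value near the top of the float32 range stays finite because the exponent width is unchanged.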
Try oneAPI today. Beta06 is available now in the oneAPI DevCloud and via web download, containers, and repositories.
New Podcast Series Explores the Cross-Architecture Journey
The emergence of machine learning, artificial intelligence, computer vision, and other compute-intensive workloads—and subsequent race to simplify cross-architecture development—has driven an exciting evolution in technologies across our software ecosystem.
A new podcast series, Code Together, will explore the challenges and possibilities of cross-architecture development through bi-weekly discussions with those at the forefront who are charting a course through this increasingly diverse, data-centric world. The series explores various aspects of this software ecosystem, from languages, compilers and libraries to other software development tools.
Read the blog to find out more.
Codeplay Brings NVIDIA GPU Support to Industry-Standard Math Library
April 20, 2020 | oneMKL
Codeplay has made another significant contribution to enabling an open standard, cross-architecture interface for developers as part of the oneAPI industry initiative. The latest contribution implements the commonly used cuBLAS library for NVIDIA GPUs, using the open standard SYCL and DPC++ implementation. This implementation forms part of the oneAPI Math Kernel Library (oneMKL) and is optimized to bring native performance to developers who use NVIDIA GPUs. Codeplay Developer Relations Manager Rod Burns shares the details.
Intel Open Sources the oneAPI Math Kernel Library (oneMKL) Interface
To address the lack of an industry-standard interface for math libraries and provide a single, cross-architecture API for CPUs and accelerators, Intel released the oneAPI Math Kernel Library (oneMKL) open source interface. The oneMKL specification lets developers efficiently code portable, math-intensive applications that run across multiple vendors’ architectures. The oneMKL APIs can be combined with math libraries that target a range of CPU hardware and other hardware accelerator architectures, providing a path to support for NVIDIA and AMD libraries in addition to Intel CPUs, GPUs and other accelerators. Get the details.
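The idea of one front-end API that can be combined with different vendor math libraries can be pictured as a small dispatch table. The sketch below is only an analogy for that pattern: the real oneMKL interface is a C++/DPC++ API, and every name here (`register_backend`, `gemm`, the `"cpu"` and `"gpu"` keys) is hypothetical. A pure Python matrix multiply stands in for the vendor libraries.

```python
from typing import Callable, Dict, List

Matrix = List[List[float]]
GemmFn = Callable[[Matrix, Matrix], Matrix]

# Hypothetical registry mapping a device name to a backend implementation.
_BACKENDS: Dict[str, GemmFn] = {}


def register_backend(device: str) -> Callable[[GemmFn], GemmFn]:
    """Decorator that registers a matrix-multiply backend for a device name."""
    def decorator(fn: GemmFn) -> GemmFn:
        _BACKENDS[device] = fn
        return fn
    return decorator


@register_backend("cpu")
def _reference_gemm(a: Matrix, b: Matrix) -> Matrix:
    """Pure Python stand-in for a CPU vendor library."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


@register_backend("gpu")
def _pretend_gpu_gemm(a: Matrix, b: Matrix) -> Matrix:
    """Stand-in for an accelerator library; reuses the reference code here."""
    return _reference_gemm(a, b)


def gemm(a: Matrix, b: Matrix, device: str = "cpu") -> Matrix:
    """Single entry point: callers never name a vendor library directly."""
    try:
        backend = _BACKENDS[device]
    except KeyError:
        raise ValueError(f"no backend registered for device {device!r}") from None
    return backend(a, b)


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    print(gemm(a, b, device="cpu"))
    print(gemm(a, b, device="gpu"))
```

The caller's code stays the same when a new backend is registered, which is the portability property the paragraph above describes.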
oneAPI v0.7 Specification Released
The oneAPI specification v0.7 has been released, which includes several enhancements to DPC++ including 10 new language extensions and updates to many of its libraries. Read the details from Sanjiv Shah, Developer Software Engineering Manager.
Developers Innovate with oneAPI
Several innovative developer projects are using the oneAPI cross-architecture programming framework along with Intel® oneAPI Toolkits (Beta), with work ranging from scalable molecular dynamics and predicting corn, wheat, and soybean yields to denoising graphics, high-fidelity rendering, and more. See them on Intel® DevMesh.
Release 05 of Intel® oneAPI beta products is now live. Here are the highlights:
- Extensive compiler improvements for mixed-language development, including DPC++/OpenMP* composability, additional OpenMP 5.0 and Fortran constructs, and increased runtime performance
- New support for Microsoft Visual Studio Code* (VS Code) includes code samples browser and profiling tool plugins to speed code development
- New and enhanced functions for Intel® oneAPI libraries, including matrix multiplication, machine learning, and codecs for CPU and GPU platforms
- Additional GPU performance metrics and easier workflow for FPGA performance analysis in Intel® VTune™ Profiler
- New and enhanced code samples, including 2D finite difference wave equation solution, Mandelbrot, and matrix multiplication
- New CentOS* container distribution
We encourage you to try oneAPI today. Beta05 is available now in the oneAPI DevCloud and via web download, containers, and repositories.
Intel® DevCloud Now Supporting JupyterLab*
Intel® DevCloud for oneAPI now supports the web-based JupyterLab* development environment to deliver a “modern” experience for Python*, AI, and other developers. They can use this platform to put instructions into code and do data analysis, data visualization, and interactive exploratory computing. Read more.
oneAPI Specification: Intel Compute Runtime Adds oneAPI Level Zero Support
March 9, 2020 | oneAPI Specification
“Meet Level Zero API and NEO driver running on Intel Gen9, Gen11, and Gen12 hardware first on Linux*, but it will not stop there” — Gregory Stoner, GPU Computing Solutions, Intel Corporation
The open source implementation of Level Zero, the low-level API specification for oneAPI, was just released, making it easier for accelerator vendors to leverage oneAPI for their devices. Looking forward to community feedback here.
The Intel® oneAPI 2021.1 Beta04 release is now available. Updates include:
- Enhancements to improve developer productivity for Intel CPU-GPU systems including improved DPC++ language conciseness for easier code comprehension and maintainability, broader support for Unified Shared Memory programming, additional GPU function support in oneAPI libraries, and improved analysis capabilities in Intel® Advisor and Intel® VTune™ Profiler.
- Improved performance, functionality, and stability across the Intel® oneAPI toolkits.
- Intel® oneAPI Rendering Toolkit introduces Intel® Open Volume Kernel Library for greatly enhanced volume sampling and rendering features.
We encourage you to try oneAPI. To get started, go to https://software.intel.com/en-us/oneapi. Beta04 is available in the oneAPI DevCloud and via web download, containers, and repositories.
Codeplay Brings SYCL*, Data Parallel C++ to Nvidia GPUs
“I promised at SC19 that we would open source SYCL for Nvidia GPUs using Intel’s DPC++ SYCL compiler … and here it is. It’s a work-in-progress but being actively developed.” — Andrew Richards, CEO, Codeplay Software
Codeplay Software has announced the first release of its DPC++ compiler for Nvidia GPUs. This announcement marks a major milestone on the road to a single, cross-architecture, cross-vendor accelerator programming model. Read more.