Intel® Rendering Framework using Software-Defined Visualization
Why Intel® Xeon® Processors Excel at Visualization
Software-defined visualization (SDVis) is akin to software-defined networking, software-defined infrastructure, and other initiatives Intel is taking to maximize the benefits, and the inherent performance, of modern Intel® Xeon® processors with software that takes advantage of high thread counts and data parallelism. The performance is there, and the advantages over dedicated devices with limited available memory are manifold, including:
- Ever-improving advanced algorithms that exploit the larger memory capacity of the processor
- Flexibility and easy upgradability of software versus hardware replacement
- Overall cost savings during procurement and improved total cost of ownership over the lifespan of the hardware
Performance and Scalability that have Redefined Visualization
Jim Jeffers, senior director and senior principal engineer for Intel’s visualization solutions, notes, “With the Intel® Rendering Framework, all the work is being done on the CPU, while users are getting the same―or better―experience than with today’s dedicated graphics hardware.” The Intel Rendering Framework provides both scalable and interactive ray tracing and OpenGL* visualization via the Intel® Embree, Intel® OSPRay, and Intel® OpenSWR libraries. Plus, the Intel Rendering Framework now includes the new Intel® Open Image Denoise library.
Not surprisingly, modern high-throughput processor cores packaged in multi- and many-core processors can execute many tasks interactively, with performance unequaled by earlier generations of processors. Jeffers points out that “benchmarks show a 100x increase in rendering performance compared to what was available in 2016 when rendering OpenGL triangle-based images with Mesa*.”
This level of performance has redefined scientific visualization and is making significant inroads into the cinematic and professional visualization market segments (Figure 1). Because it can exploit the available CPU memory (commonly 192 GB or more for a processor versus 16 GB for a high-end GPU), Jeffers notes, the Intel Rendering Framework can deliver the same or better performance with fidelity that a GPU can’t match. That, coupled with the ability to run and visualize anywhere, regardless of the scale of the visualization task and without requiring specialized hardware for interactive response, is the reason high-performance computing (HPC) centers no longer need to procure GPUs for visualization clusters.
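The memory argument above can be made concrete with back-of-the-envelope arithmetic. The sketch below is illustrative only. The 192 GB and 16 GB capacities come from the text, while the 36 bytes per triangle (three unindexed float32 vertex positions, with no normals, colors, or acceleration structures), the 100 GB example scene, and the function names are assumptions chosen for simplicity. Real renderers need considerably more memory per triangle.

```python
GB = 10**9  # decimal gigabytes, for simplicity

# Assumption: 3 vertices x 3 float32 coordinates x 4 bytes, positions only.
BYTES_PER_TRIANGLE = 3 * 3 * 4


def max_triangles(memory_gb: int, bytes_per_triangle: int = BYTES_PER_TRIANGLE) -> int:
    """Upper bound on triangles whose raw positions fit in memory_gb."""
    return (memory_gb * GB) // bytes_per_triangle


def fits_in_memory(dataset_gb: float, memory_gb: float) -> bool:
    """True if a data set of dataset_gb fits entirely in memory_gb."""
    return dataset_gb <= memory_gb


if __name__ == "__main__":
    cpu_gb, gpu_gb = 192, 16  # capacities quoted in the article
    print("CPU node upper bound:", max_triangles(cpu_gb), "triangles")
    print("GPU upper bound:     ", max_triangles(gpu_gb), "triangles")
    print("Capacity ratio:      ", cpu_gb // gpu_gb)
    # Hypothetical 100 GB scene: resident on the CPU node, not on the GPU.
    print("100 GB scene fits on CPU node:", fits_in_memory(100, cpu_gb))
    print("100 GB scene fits on GPU:     ", fits_in_memory(100, gpu_gb))
```

Under these assumptions the processor can hold roughly twelve times as much geometry in memory. The ratio depends only on the two capacities, not on the bytes-per-triangle figure.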
The Primary HPC Visual Analysis Approach for Many
Five years ago, you never would have heard an HPC user say, “I prefer rendering my images on CPUs.” However, that mindset changed as CPU-based interactive and photorealistic rendering supplanted GPUs in many HPC centers. Paul Navrátil, director of visualization at the Texas Advanced Computing Center (TACC), highlights TACC’s commitment by pointing out that “CPU-based SDVis will be our primary visual analysis mode on Frontera*, leveraging the Intel Rendering Framework stack.” Frontera is expected to be the fastest academic supercomputer in the U.S. when it becomes operational in 2019.
In a word, the scalability is “outstanding,” as demonstrated by a 1.1 trillion triangle OpenGL hero benchmark by Kitware[1] on the Trinity* supercomputer at Los Alamos National Laboratory. However, it doesn’t take a supercomputer to run SDVis. The integration of Intel Rendering Framework components such as OSPRay into ParaView* makes exploring the benefits of ray tracing easy on most hardware platforms. David DeMarle, principal engineer at Kitware, notes that with the Intel Rendering Framework, “A one-line change is all that is required for VTK* and ParaView* users to switch between OSPRay ray tracing and OpenGL rendering.”
Traditional Batch and New In Situ and In-Transit Visualization Workflows
The software-defined nature of the Intel Rendering Framework means that scientists can now perform in situ rendering, where visualization occurs using the same nodes as the computation. In situ visualization has been identified as a key technology to enable science at the exascale.[2] Jeffers points out, “As we move to exascale, we have to manage exabytes of data. While the data can be computed, the I/O systems aren’t getting there to move the data. Hence, in situ. Otherwise, it can take days, weeks, or months to visualize.” He likes to summarize this by stating, “A picture is worth an exabyte.”
A Path to Exascale Visualization
As part of a U.S. Department of Energy (DOE) multi-institutional effort, and in collaboration with private companies and other national labs, Argonne National Laboratory is working to leverage the SENSEI* framework to help people prepare for the arrival of Aurora*, a new Intel-Cray system. Aurora will be capable of delivering more than an exaflop of floating-point performance. SENSEI is one example of a portable framework that enables in situ, in-transit, and traditional batch visualization workflows for analysis and scalable interactive rendering of the huge data volumes generated when using an exascale supercomputer.
Depending on the application, researchers may prefer to dedicate more supercomputer nodes to a computationally expensive simulation while using a smaller number of nodes for rendering. This asymmetric load balancing is called in-transit visualization. Unlike in situ visualization, which renders data in place on the node, in-transit visualization incurs some overhead because data must be moved across the communications fabric between nodes. The payoff is the additional compute power that can be dedicated to the simulation. Both in-transit and in situ workflows keep the data in memory and avoid writing to storage. Joseph Insley, Visualization and Analysis Team lead at the Argonne Leadership Computing Facility, points out, “With SENSEI, users can utilize in situ and in-transit techniques to address the widening gap between flop/s and I/O capacity, which is making full-resolution, I/O-intensive post hoc analysis prohibitively expensive, if not impossible.”
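The trade-off between the three workflows can be sketched with a toy cost model. Everything in it is an assumption made for illustration: the `Workload` type, the function names, the perfectly parallel scaling, the pipelined treatment of in-transit stages, and the example numbers. It is not SENSEI code and not a measured result. It only shows where each workflow pays its cost: in situ shares nodes, in-transit pays for fabric transfer, and post hoc pays for a storage write and a later read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical per-time-step workload (all values are assumptions)."""
    sim_node_seconds: float     # simulation work, in node-seconds
    render_node_seconds: float  # rendering work, in node-seconds
    data_bytes: float           # data produced by the simulation


def in_situ_step_time(w: Workload, nodes: int) -> float:
    """Simulation and rendering share the same nodes; no data movement."""
    return (w.sim_node_seconds + w.render_node_seconds) / nodes


def in_transit_step_time(w: Workload, sim_nodes: int, render_nodes: int,
                         fabric_bytes_per_s: float) -> float:
    """Separate node groups run as a pipeline; the slower stage sets the pace."""
    sim_stage = w.sim_node_seconds / sim_nodes
    vis_stage = (w.data_bytes / fabric_bytes_per_s
                 + w.render_node_seconds / render_nodes)
    return max(sim_stage, vis_stage)


def post_hoc_step_time(w: Workload, nodes: int,
                       storage_bytes_per_s: float) -> float:
    """Data is written to storage, then read back later for rendering."""
    storage_io = 2 * w.data_bytes / storage_bytes_per_s  # one write, one read
    return (w.sim_node_seconds + w.render_node_seconds) / nodes + storage_io


if __name__ == "__main__":
    w = Workload(sim_node_seconds=900.0, render_node_seconds=100.0,
                 data_bytes=1e12)
    print("in situ   :", in_situ_step_time(w, nodes=100))
    print("in-transit:", in_transit_step_time(w, sim_nodes=90, render_nodes=10,
                                              fabric_bytes_per_s=1e12))
    print("post hoc  :", post_hoc_step_time(w, nodes=100,
                                            storage_bytes_per_s=1e11))
```

With these made-up numbers, storage I/O dominates the post hoc case, which mirrors the concern Insley raises. Changing the node split or the bandwidths shifts the balance between the in situ and in-transit cases.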
Visualization for All, No Special Hardware Required
A big advantage of CPU-based rendering is that no special hardware is required, which means it can be used by nearly everyone on most computational hardware, from laptops and workstations to organizational clusters and leadership-class supercomputers, and even in the cloud.
Interactive photorealistic ray tracing can occur on as few as eight Intel® Xeon® Scalable 8180 processors or scale to big data, high-quality rendering using in situ nodes.[3,4,5,6] Jeffers notes that the interactive performance delivered by the Intel Rendering Framework, and photorealistic rendering with the freely available OSPRay library and viewer, “address the need and create the want.” Eliminating the requirement for specialized display hardware means even exabyte simulation data can be “visualized anywhere.” Users appreciate how they can view results on their laptops and switch to display walls or a fully immersive cave.
The ability to run and visualize anywhere using CPUs, regardless of the scale of the visualization task and without requiring specialized hardware for interactive response, is the reason HPC users are now using CPUs for visualization tasks. The integration of the Intel Rendering Framework SDVis capabilities into the popular VisIt*[7] and ParaView* viewers, along with frameworks like SENSEI*, gives everyone the ability to perform analysis and to render with either OpenGL or ray tracing, up to photorealistic image quality.
From HPC to Professional Rendering Applications
Jeffers observes that one of the key factors driving SDVis adoption is the visual fidelity of ray tracing. Users can get results up to full photorealism because the software models the physics of light using both serial and parallel processing on the CPU, and it does so with scalable, interactive performance.
The cross-market appeal of the Intel Rendering Framework with SDVis is clear. As Jeffers observes, “There is a real pull from submarkets like CAD and automotive. Photorealism is extremely important in improving ‘virtual’ vehicle design and manufacturing from commercial airplanes to military vehicles. Essentially, decisions can be made about what vehicle to build without ever having to build it. Meanwhile, there is increasing pull from adjunct markets that include offline and interactive rendering for animation and photoreal visual effects.”
It’s All About Separation of Costs
From a software perspective, the Intel Rendering Framework provides the tuned and optimized low-level operations. This is why, according to Jeffers, application developers get great performance simply by calling the rendering APIs. The scalability to run in distributed environments is also there, which has enabled the big advance in professional rendering to “interactive”[8] rendering and ray tracing with full visual effects on huge, complicated data sets. This is why movie studios run on render farms containing thousands of Intel® CPUs.
Jeffers likes to point out the differences between the animation used in the three-time Academy Award* nominated 1989 film The Little Mermaid and the recent Moana image shown below to highlight the improvements enabled by ray tracing using the Intel Rendering Framework. Previously, an overnight rendering workflow would yield a few seconds of video. The 160-billion-object Moana island scene, shown in Figure 3 (recently made publicly available courtesy of Walt Disney Animation Studios to enable research and best industry practices), was rendered live using Intel OSPRay and Intel Embree ray tracing libraries along with the new Intel Open Image Denoise library. System memory capacity was important, since the rendering process consumed more than 100 GB.
Looking to the Future
Jeffers is also excited about the convergence of artificial intelligence (AI) and the ray tracing capabilities of Intel OSPRay and Intel Embree. For example, AI was used to define the believable movement of the robots that were rendered using these libraries in the movie Pacific Rim (Figure 4). Intel Xeon Scalable processors give the Ziva* AI software the performance needed to generate the real-time characters that can progressively learn body movements, while also easily applying features and behaviors from one character to another.[9]
When asked if photorealistic animation will replace actors, Jeffers replied that he thinks humans are necessary to provide the emotional impact a movie demands. However, the technology may advance to the point where voice-overs and actor overlays will become more important as the visual fidelity of state-of-the-art rendering technology continues to improve.
As mentioned, Intel has initiatives aside from the Intel Rendering Framework that exploit the serial and parallel performance of modern many- and multi-core Intel Xeon processors to replace dedicated hardware devices. However, the spectacular images created by the Intel Rendering Framework clearly demonstrate the appeal of CPU-based visualization. The software libraries are open source and available for download.
Users who simply wish to experience SDVis without doing any development can download the ParaView* or VisIt* applications or the recently announced OSPRay Studio viewer. Meanwhile, HPC developers can use a framework like SENSEI* to exploit in situ and in-transit visualization to run at scale.
Organizations looking to experience the benefit of SDVis can look to Intel Select Solutions for Professional Visualization for verified hardware and software solutions that combine the latest Intel Xeon Scalable processors with other technologies such as Intel® Omni-Path Architecture, Intel® SSDs, and the OpenHPC cluster software stack.
While not the point of this article, interested readers can look to other Intel initiatives such as Intel Software Defined Networking and Intel Software Defined Infrastructure to see how Intel Xeon Scalable processors are being used to replace other dedicated pieces of hardware.
Rob Farber is a global technology consultant and author with an extensive background in HPC and in developing machine learning technology that he applies at national labs and commercial organizations.
Rob can be reached at firstname.lastname@example.org.
This article is from The Parallel Universe, Intel’s quarterly magazine that helps you take your software development into the future with the latest tools, tips, and training to expand your expertise.
[1] The benchmark only used 1/19th of the machine to render 1.1 trillion triangles. Kitware believes they could have rendered 10-20 trillion triangles per second on the full machine. (http://www.techenablement.com/third-party-use-cases-illustrate-the-success-of-cpu-based-visualization/)
[8] Interactive does not imply fluid real-time frame rates.