Using Low-Precision Optimizations for High-Performance DL Inference Applications

With advances in hardware acceleration and support for low precision, Deep Learning inference delivers higher throughput and lower latency. However, data scientists and AI developers often need to make a trade-off between accuracy and performance. Deployment is also challenging because of the high computational complexity of quantizing models for inference. This webinar covers techniques and strategies for overcoming these challenges, such as automatic accuracy-driven tuning for post-training quantization and quantization-aware training.

Join us to learn about Intel’s new low precision optimization tool and how it helped CERN openlab reduce inference time while maintaining the same level of accuracy on convolutional Generative Adversarial Networks (GANs). The webinar will give insight into how to handle the strict precision constraints that are inevitable when applying low precision computing to generative models.

Sofia Vallecorsa, AI and Quantum Researcher at CERN openlab

Sofia Vallecorsa is an accomplished physicist who specializes in scientific computing, with commanding expertise in ML/DL architectures, frameworks, and methods for distributed training and hyperparameter optimization. She joined CERN in 2015 and is responsible for several projects in ML/DL, quantum computing, and quantum machine learning. She also supervises master's and doctoral thesis students in these same fields. Sofia holds a PhD in High Energy Physics from the University of Geneva.

Feng Tian, Machine Learning Engineer, Intel Corporation

Feng is a senior deep learning engineer on the machine learning performance team in the IAGS (Intel Architecture, Graphics and Software) group. He leads the development of the Intel Low Precision Optimization Tool and contributes to Intel-optimized deep learning frameworks such as TensorFlow and PyTorch. He has 14 years of experience in software optimization and low-level driver development on IA platforms.

Performance varies by use, configuration, and other factors. Learn more at