WTH is Pandas?
Well, they’re super cute tuxedo-colored bears from China who have six toes and eat bamboo.
Also … Pandas is a dataframe library that lets you perform data manipulation in Python. It’s great for heterogeneous data (think data science) because it provides easy-to-use APIs to manipulate and process dataframes.
(You knew we’d get there.)
But there’s an issue: when you work with very large datasets or need high performance, single-core Pandas becomes a bottleneck in a data practitioner’s workflow. As a result, you often need a distributed solution that is compatible with existing Pandas workflows.
This webinar discusses precisely that: Intel® Distribution of Modin*.
Join software engineer Areg Melik-Adamyan for a tour of this Distribution, including:
- An overview of Modin, including its OmniSci (accelerated analytics) backend
- How to get the best performance and scaling through Intel Distribution of Modin
- How to efficiently run end-to-end machine learning workloads without any code changes
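To make the "compatible with Pandas workflows" idea concrete, here is a minimal sketch of how Modin is typically adopted: the import line is swapped and the rest of the dataframe code stays the same. The dataset, the column names, and the `summarize` helper are made up for illustration, and the fallback to stock Pandas is an assumption added so the sketch runs even where Modin is not installed.

```python
# Assumption: Modin is installed with one of its execution engines.
# If it is not, fall back to stock pandas so the sketch still runs.
try:
    import modin.pandas as pd  # drop-in replacement for the pandas API
except ImportError:
    import pandas as pd


def summarize(frame):
    """Mean price per category; the same code works for pandas and Modin."""
    return frame.groupby("category")["price"].mean()


# Tiny made-up dataset, for illustration only.
df = pd.DataFrame({
    "category": ["a", "b", "a", "b"],
    "price": [1.0, 2.0, 3.0, 4.0],
})

result = summarize(df)
print(result)
```

On a dataset this small, Modin offers no benefit and may even be slower because of engine startup; the point is only that the dataframe code itself does not change.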
Get the software
- Download the Intel® Distribution of Modin* as part of the Intel® AI Analytics Toolkit. Powered by oneAPI, the AI Kit includes 6 dev tools for accelerating data science and AI pipelines.
- Sign up for an Intel® DevCloud for oneAPI account: a free development sandbox with access to the latest Intel® hardware and oneAPI software, including the AI Kit.
Other resources
- OmniSci and Intel Collaborate to Bring Accelerated Analytics at Scale to CPUs
- Read the latest Intel AI Analytics blogs on Medium.
- Subscribe to the POD—Code Together is an interview series that explores the challenges at the forefront of cross-architecture development. Each bi-weekly episode features industry VIPs who are blazing new trails through today’s data-centric world. Available wherever you get your podcasts.