PyCon DE & PyData 2026

Zero-Copy or Zero-Speed? The hidden overhead of PySpark, Arrow & SynapseML for inference
Dynamicum [Ground Floor]

"Zero-copy" data transfer promises free communication between Spark's JVM and Python workers, but at 6 billion rows daily, the reality is far more complex. This session explores the low-level mechanics of distributed inference, focusing on the serialization bottlenecks.

We analyze the execution plans generated by pandas_udf, mapInPandas, and SynapseML, and visualize the true cost of pickling, Arrow record batching, and JNI context switching. Join this deep dive to understand the physics of distributed inference and learn how to tune spark.sql.execution.arrow.maxRecordsPerBatch to prevent OOMs without starving the CPU.
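To give a feel for why batching matters for serialization cost, here is a minimal standard-library sketch. It is a toy stand-in, not Spark's or Arrow's actual code path: it only compares how many bytes pickle emits when every row is its own message and when rows are grouped into batches. The helper names, the row shape, and the batch size of 1,000 are assumptions chosen for illustration.

```python
import pickle

PROTOCOL = pickle.HIGHEST_PROTOCOL


def per_row_bytes(rows):
    """Total bytes when every row is pickled as its own message."""
    return sum(len(pickle.dumps(row, protocol=PROTOCOL)) for row in rows)


def batched_bytes(rows, batch_size):
    """Total bytes when rows are pickled in lists of up to batch_size rows."""
    total = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        total += len(pickle.dumps(batch, protocol=PROTOCOL))
    return total


# Hypothetical payload: (id, score) pairs, standing in for rows sent to a worker.
rows = [(i, i * 0.5) for i in range(10_000)]

row_total = per_row_bytes(rows)
batch_total = batched_bytes(rows, batch_size=1_000)

print(f"per-row messages: {len(rows)}, bytes: {row_total}")
print(f"batched messages: {len(rows) // 1_000}, bytes: {batch_total}")
```

Each pickled message carries a fixed header and terminator, so grouping rows spreads that overhead over the whole batch. Arrow's columnar batches differ in format, but the same trade-off between per-message overhead and batch size is what the batch-size setting controls.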


This talk is a technical deep dive into the "physics" of distributed machine learning inference. While high-level APIs promise seamless integration between Spark (JVM) and Python, the underlying data transfer mechanisms often become the primary bottleneck for high-throughput systems. We start by reality-checking the "Zero-Copy" promise of Apache Arrow in a PySpark context, identifying exactly where the abstraction leaks and where "Zero-Copy" isn't actually free.

The session concludes with a focus on tuning for throughput. We will explore the delicate balance of configuring spark.sql.execution.arrow.maxRecordsPerBatch, demonstrating how to find the "Goldilocks" zone that maximizes CPU saturation without causing JVM off-heap memory crashes. Attendees will gain a deep understanding of the memory hierarchy involved in distributed inference and practical strategies for profiling serialization overhead in production.
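As a rough starting point for that tuning, the following sketch derives a candidate batch size from an assumed row width and an assumed per-batch memory budget. The helper `pick_max_records_per_batch`, the 16 MiB budget, and the feature layout are hypothetical; real batches also hold validity bitmaps and variable-width data, and may exist in both JVM and Python memory at once, so treat the result as a first guess to validate by profiling.

```python
def pick_max_records_per_batch(row_bytes, batch_budget_bytes, max_records=100_000):
    """Largest record count whose estimated batch size fits the budget.

    row_bytes and batch_budget_bytes are rough, assumed estimates.
    The result is clamped to the range [1, max_records].
    """
    if row_bytes <= 0 or batch_budget_bytes <= 0:
        raise ValueError("row_bytes and batch_budget_bytes must be positive")
    fit = batch_budget_bytes // row_bytes
    return max(1, min(max_records, fit))


# Hypothetical row: one int64 id plus 128 float32 features.
row_bytes = 8 + 128 * 4            # 520 bytes per row
budget = 16 * 1024 * 1024          # assumed 16 MiB budget per Arrow batch

records = pick_max_records_per_batch(row_bytes, budget)
print(records)  # 32263

# With an active SparkSession named `spark`, the value could then be applied as:
# spark.conf.set("spark.sql.execution.arrow.maxRecordsPerBatch", str(records))
```

Spark's default for this setting is 10,000 records. A value that is too small for narrow rows leaves the CPU waiting on many tiny batches, while a value that is too large for wide rows risks the memory failures described above.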

Key Takeaways:

  • Internals knowledge: Understand exactly how data moves from JVM heap to Python worker memory.
  • Method selection: Know which of pandas_udf, mapInPandas, and SynapseML to use for your use case.
  • Tuning skills: Learn how to configure Apache Arrow batch sizes to optimize CPU saturation.

Expected audience expertise in the talk's domain: Intermediate
Expected audience expertise in Python: Advanced
See also: presentation (4.1 MB)

Currently solving the MLOps puzzle at Zalando, ensuring our pricing recommendation algorithms are as streamlined as my swimming technique. I spend my days shaping ML standards at scale and my free time training in the real world.

My life in two modes:

  1. Running pipelines & Diving deep into infrastructure.

  2. Running trails & Diving into the ocean.

Always looking to optimize the former to make more time for the latter.