PyData London 2026

Kamlesh Shah

I am a senior engineering lead/executive director at Morgan Stanley.

I design and build large-scale, enterprise-ready, high-performance financial systems used in production environments where correctness, resilience, and speed matter. My work spans system design, hands-on engineering, and long-term platform evolution in regulated domains.

I place strong emphasis on clean, maintainable architecture—clear domain boundaries, explicit data contracts, and model-driven design. I optimise for systems that remain understandable and adaptable as complexity, scale, and regulatory demands increase.

A significant part of my work focuses on data analytics, complex data modelling, and financial mathematics—including forecasting, liquidity, risk, and regulatory calculations. I enjoy translating mathematically rich problem spaces and large datasets into precise, explainable, and production-grade implementations.

I work with a prototype-to-production mindset, leveraging modern cloud platforms, data tooling, and AI techniques to move quickly while preserving architectural discipline, observability, and operational robustness.

www.linkedin.com/in/kamlesh-shah


Session

06-06
11:05
45min
Columnar Thinking - Designing for high-performance execution with Arrow and Polars
Kamlesh Shah

When building high-performance systems for analytical workloads, we often focus on the efficiency of the algorithm, such as reducing Big-O complexity or optimising numerical routines. Yet in real-world workloads, the decisive factor is often not the algorithm alone but how the data is laid out, traversed, and distributed across processes.

This talk will cover aspects of mechanical sympathy, focussing on how in-memory data structures can be designed to suit modern CPUs, which are cache-sensitive, SIMD-enabled (vector instructions), constrained by memory bandwidth, and optimised for predictable, contiguous access.
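As a rough illustration of the layout difference described above, the following minimal sketch contrasts a row-oriented list of records with a column-oriented set of contiguous buffers. It uses only the Python standard library rather than Arrow or Polars, and the sample trades and the names `rows`, `prices`, and `qtys` are hypothetical, chosen purely for illustration.

```python
from array import array

# Hypothetical sample data: three trades with a price and a quantity.
rows = [
    {"price": 101.5, "qty": 10},
    {"price": 99.0, "qty": 20},
    {"price": 100.25, "qty": 30},
]

# Row-oriented: each record is a separate dict object, so scanning one
# field means following a reference per row to scattered heap locations.
row_total = sum(r["price"] for r in rows)

# Column-oriented: each field lives in one contiguous, fixed-width buffer
# ('d' = 64-bit float, 'q' = 64-bit signed integer).
prices = array("d", (r["price"] for r in rows))
qtys = array("q", (r["qty"] for r in rows))

# Scanning a single column touches only that column's bytes.
col_total = sum(prices)

# The column is a single contiguous block: item count times item width.
prices_view = memoryview(prices)
print(row_total, col_total, prices_view.nbytes, prices_view.contiguous)
```

Both layouts give the same answer; what changes is the memory the scan has to touch. Pure Python will not show the cache or SIMD gains itself, but a contiguous, fixed-width buffer is the shape that vectorised engines are built to consume.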

We will use real-world examples to show how minimising serialisation overhead and enabling efficient cross-process and cross-language data exchange reduce the cost of moving data across systems. Beyond single-system performance, we will examine why Arrow’s standardised, zero-copy columnar format is a critical enabler of distributed execution, and how columnar formats support scalable computation across threads, processes, and distributed nodes.
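The difference between sharing a buffer and serialising it can be sketched with Python's built-in buffer protocol. This is only a stand-in for the idea, not Arrow's own API: the `memoryview` below plays the role of a zero-copy consumer, the `tobytes`/`frombytes` round trip plays the role of a serialise-and-rebuild hand-off, and the sample values are hypothetical.

```python
from array import array

# Hypothetical column of prices held in one contiguous buffer.
prices = array("d", [101.5, 99.0, 100.25, 98.75])

# Zero-copy: a memoryview exposes the same underlying buffer, and
# slicing the view shares memory as well instead of duplicating it.
view = memoryview(prices)
window = view[1:3]

# Copying path for contrast: serialise to bytes, then rebuild a new array.
copied = array("d")
copied.frombytes(prices.tobytes())

# A write to the original is visible through the shared view,
# but not through the independently copied array.
prices[1] = 0.0
shared_value = window[0]
copied_value = copied[1]
print(shared_value, copied_value)
```

The shared view costs no extra memory and no extra pass over the data, whereas the copy pays for both. A standardised columnar format aims to make the first kind of hand-off possible between processes and languages, not just within one interpreter.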

Grand Hall 1