Python Conference APAC 2024

Python in the browser: my journey towards enhancing the Scientific Python ecosystem's interoperability with Pyodide
2024-10-26, CLASS #2 - 4B
Language: English

In recent years, the Python programming language has advanced not only in speed but also in the range of platforms and architectures it supports, such as the upcoming iOS (PEP 730) and Android (PEP 738) ABIs, which will be a boon for its popularity and help bring it to a wider variety of users. Similarly, the advent of WebAssembly (through the Emscripten toolchain) in the last decade has served a purpose rarely met by any other implementation of low-level code: it provides a way to run programs written in multiple languages at near-native speed, by compiling higher-level instructions and their intermediate representations down to a near-assembly format that runs inside conventional browsers such as Google Chrome (and other Chromium-based browsers), Safari, and Firefox, across devices.

Here, 1. the web browser has access to the CPU of the host machine and therefore does not hit any major speed or connectivity constraints, and 2. the inherent security model of web browsers, together with their sophisticated sandboxing, means that security constraints are seldom a problem.

This talk will describe Python in WebAssembly through its most popular implementation (Pyodide), and how I am working with the Scientific Python and PyData ecosystems for scientific computing, artificial intelligence, data science, and more to improve interoperability with Pyodide. Lastly, through an interactive yet informational exchange, it will discuss pragmatic considerations specific to this Python distribution in terms of interactive documentation and packaging, and offer a preview of how the Scientific Python ecosystem is expected to change in the coming years with further advancements in this area.


Prerequisites and takeaways

  • This talk is intended for those reasonably familiar with the Python programming language and scientific computing libraries such as NumPy, SciPy, pandas, and Matplotlib.
  • However, to appeal to a wider audience, it will not dive into the details of how Pyodide is implemented or how it uses the Emscripten toolchain. Most of the talk will also be accessible to new learners, who should be able to learn about software packaging and web-based interactive documentation. Seasoned Python programmers may still learn a few new things about Pyodide, Emscripten, and WASM.
  • Participants should expect a mixture of technical and non-technical content, and should be familiar with using a web browser and navigating documentation-based static websites.

Outline

An outline of the content is provided below, with 25 minutes for the slides and 5 minutes for questions.

Slides 1-5 – An introduction to Python in the browser – 2-5 minutes

  1. A brief intro to Python in the browser and how it has historically been supported
  2. Notebooks such as Google Colab, Binder, and Deepnote that provide literate programming environments
  3. Interactive programming/pair programming
  4. How Pyodide was compiled in the first place (2018, Michael Droettboom at Mozilla) via the Emscripten toolchain
  5. Necessary limitations of Pyodide:
    5.1. Lack of threading and multiprocessing
    5.2. Lack of some modules from the Python stdlib
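
To make the threading limitation above concrete, here is a minimal sketch of how library code might fall back to sequential execution when it detects an Emscripten build. The helper name `run_all` and its `use_threads` parameter are hypothetical and not part of Pyodide. The only platform fact assumed is that CPython built with Emscripten reports `sys.platform == "emscripten"`.

```python
import sys
from concurrent.futures import ThreadPoolExecutor

# CPython built with Emscripten (as in Pyodide) reports this platform string.
IS_EMSCRIPTEN = sys.platform == "emscripten"


def run_all(tasks, use_threads=not IS_EMSCRIPTEN):
    """Run zero-argument callables and return their results in order.

    Hypothetical helper: uses a thread pool where threads are available,
    and a plain loop otherwise (for example, in the browser).
    """
    if use_threads:
        with ThreadPoolExecutor(max_workers=4) as pool:
            futures = [pool.submit(task) for task in tasks]
            return [future.result() for future in futures]
    # Sequential fallback: same results, no threads required.
    return [task() for task in tasks]


results = run_all([lambda: 1 + 1, lambda: "py" + "odide"])
```

Both branches return results in submission order, so callers do not need to know which one ran.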

Slides 6-10 – An introduction to Pyodide – 5 minutes

  1. Pyodide, a Python distribution that works entirely in the browser
  2. Browser limitations
  3. An interactive demo

Slides 11-15 – A shallow dive into Pyodide internals – 5 minutes

  1. How Pyodide compiles CPython
  2. The ramifications of the bitness being only 32 bits (floating point imprecisions, etc.)
  3. How Pyodide’s ABI is defined and what that means for compatibility (i.e., why does Pyodide keep to the same Emscripten version for every release?)
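
As a small illustration of the bitness point above, the following sketch reports the pointer width of whichever interpreter runs it. The function names are hypothetical. The sketch assumes only standard-library behavior: `struct.calcsize("P")` gives the size of a C pointer, and `sys.maxsize` is `2**31 - 1` on 32-bit builds.

```python
import struct
import sys


def pointer_bits():
    # "P" is the struct format code for a C void pointer.
    return struct.calcsize("P") * 8


def describe_build():
    """Summarize the platform and pointer width of the running interpreter."""
    return {
        "platform": sys.platform,
        "pointer_bits": pointer_bits(),
        # On 32-bit builds, the largest container index fits in a signed 32-bit int.
        "is_32_bit": sys.maxsize == 2**31 - 1,
    }


info = describe_build()
```

Running the same snippet on a desktop interpreter and in a Pyodide console is one way to compare the two environments directly.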

Slides 16-25 – Pyodide packaging and interactive documentation – 10-12 minutes

  1. How Python package ecosystems are going to shape up for Pyodide in the coming years as Pyodide eventually puts out a v1 release, i.e., how Pyodide is useful
  2. Interactive documentation
  3. More examples of how to use a package
    3.1. Tutorials
    3.2. Notebooks
    3.3. Interactive resources, and
    3.4. Better documentation overall
  4. How can package authors make their documentation fully interactive?
  5. JupyterLite kernels and WASM notebooks, the Sphinx documentation engine, and Sphinx extensions
  6. In-tree vs out-of-tree builds
    6.1. Out-of-tree builds enhance the robustness of the package across the public API at runtime and introduce better support, intertwined with interactive API docs
  7. A brief overview of the Pyodide build system (pyodide-build)
  8. Smaller wheels/binaries for packages because of download size constraints and how this is being prepared across build backends

Agriya Khetarpal is a software engineer at Quansight Labs, where he works on the packaging and distribution of fundamental open-source software in the PyData stack, and on approaches to WASM-powered interactive documentation that are primarily built to serve the purpose of accessibility and education for the Python programming language.

He is interested in Python packaging, scientific computing, numerical software, compilers and toolchains, and many more topics.

Agriya recently graduated from the University of Delhi with a bachelor's degree, majoring in computer science and mathematics.