Linux memory management at scale
2020-10-11, Arch Conf

Memory management is an extraordinarily complex and widely misunderstood topic. It is also one of the most fundamental concepts to understand in order to produce coherent, stable, and efficient systems and containers, especially at scale. In this talk, we will go over how to compose reliable, memory-heavy, multi-container systems that can withstand production incidents, with examples of how Facebook is achieving this in production at the cutting edge. We'll also cover the open-source technologies we're building to make this work at scale, at a density that has never been achieved before.

We'll also discuss widely misunderstood Linux memory management concepts that are important to site reliability and container management, with an engineer who works on the Linux kernel's memory subsystem. Along the way, we'll bust commonly held misconceptions about things like swap and memory constraints, and give advice on key and bleeding-edge kernel concepts such as PSI, cgroup v2, and memory protection, along with other important container-related topics.

See also: Slides (3.2 MB)

Chris Down is an engineer on Facebook's Kernel team, based in London. He works on memory management within the kernel, especially cgroups, and is also a maintainer of the systemd project. Inside Facebook, he is responsible for debugging and resolving major production issues and improving the reliability and efficiency of Facebook's systems at scale.