2024-08-16, Conference Auditorium (capacity 260)
Not only are AI/ML models resource-hungry at runtime, they also tend to take up a lot of storage space. This may not be obvious, as their raw storage footprint is an order of magnitude smaller than that of the datasets and other associated data. However, it becomes an obstacle when the same models are frequently updated, versioned, and progressively distributed over the network to edge devices for inference. Of course, data versioning, deduplication, and easy transfer are not new problems. In particular, the OCI standard solves them very well, with lessons learned from more than a decade of development behind it.
In this talk we'll explore how OCI Artifacts can be used to efficiently store, version, and distribute AI/ML models. We'll look at how to break an AI model into atomic units, store them in an OCI registry, and later reassemble them locally on the target device. And when an update comes, only the difference is distributed.
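To make the idea concrete, here is a minimal sketch of that workflow using the oras-go v2 library (https://oras.land). It is not the talk's implementation: the registry address, media types, and file names are hypothetical, chosen only for illustration. Because OCI blobs are content-addressed by digest, pushing a new version of the model re-uploads only the layers whose content actually changed.

```go
// Minimal sketch: stage model files as OCI artifact layers and push them.
// Assumptions (not from the talk): the registry at localhost:5000, the
// media types, and the file names are all hypothetical.
package main

import (
	"context"
	"fmt"

	v1 "github.com/opencontainers/image-spec/specs-go/v1"
	"oras.land/oras-go/v2"
	"oras.land/oras-go/v2/content/file"
	"oras.land/oras-go/v2/registry/remote"
)

func main() {
	ctx := context.Background()

	// A file store stages local files as content-addressed OCI blobs.
	fs, err := file.New("/path/to/model")
	if err != nil {
		panic(err)
	}
	defer fs.Close()

	// Each model file becomes one layer: an "atomic unit" of the artifact.
	var layers []v1.Descriptor
	for name, mediaType := range map[string]string{
		"weights.safetensors": "application/vnd.example.model.weights",
		"config.json":         "application/vnd.example.model.config",
	} {
		desc, err := fs.Add(ctx, name, mediaType, "")
		if err != nil {
			panic(err)
		}
		layers = append(layers, desc)
	}

	// Pack a manifest that ties the layers together under an artifact type,
	// then tag the result so it can be addressed as a version.
	manifest, err := oras.PackManifest(ctx, fs, oras.PackManifestVersion1_1,
		"application/vnd.example.model", oras.PackManifestOptions{Layers: layers})
	if err != nil {
		panic(err)
	}
	if err := fs.Tag(ctx, manifest, "v1"); err != nil {
		panic(err)
	}

	// Copy the tagged artifact to a remote registry. When a later version is
	// pushed, blobs the registry already holds are skipped, so only the
	// difference travels over the network.
	repo, err := remote.NewRepository("localhost:5000/models/demo")
	if err != nil {
		panic(err)
	}
	repo.PlainHTTP = true // hypothetical local registry without TLS
	if _, err := oras.Copy(ctx, fs, "v1", repo, "v1", oras.DefaultCopyOptions); err != nil {
		panic(err)
	}
	fmt.Println("pushed", manifest.Digest)
}
```

Pulling on the edge device is the same Copy in the other direction, from the registry into a local file store, which reassembles the files on disk; a client that already holds most blobs fetches only the missing ones.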
Tom is a Senior Principal Software Engineer currently working on AI Model Storage at Red Hat. He has previously worked on CNCF Backstage, the Operate First SRE community, ManageIQ, and many other projects ranging from QA enablement tools to a cloud-native monitoring stack. Away from computers, he enjoys all things astronomy, organising LARP and coffee festivals and events, and bonsai.