Erin Mikail Staples PyCon DE & PyData Berlin 2023

Erin Mikail Staples

Erin Mikail Staples is a very online individual passionate about facilitating better connections online and off. She’s forever thinking about how we can communicate, educate and elevate others through collaborative experiences.

Currently, Erin is a Senior Developer Community Advocate at Label Studio, where she empowers the open-source community through education and advocacy efforts. Outside of her day job, Erin is a comedian, graduate technical advisor, content creator, triathlete, avid reader, and dog parent.

Most importantly, she believes in the power of being unabashedly "into things" and works to help friends, strangers, colleagues, community builders, students, and whoever else might cross her path find their thing.

Twitter handle:

@erinmikail

GitHub:

https://github.com/erinmikailstaples

LinkedIn:

https://www.linkedin.com/in/erinmikail/

Session

04-18

10:30

30min

Improving Machine Learning from Human Feedback

Erin Mikail Staples, Nikolai

Large generative models rely on massive data sets that are collected automatically. For example, GPT-3 was trained with data from “Common Crawl” and “WebText”, among other sources. As the saying goes, bigger isn’t always better. While powerful, these data sets (and the models that they create) often come at a cost, bringing their “internet-scale biases” along with their “internet-trained models.” These models raise the question: is unsupervised learning the best future for machine learning?

ML researchers have developed new model-tuning techniques to address the known biases within existing models and improve their performance (as measured by response preference, truthfulness, toxicity, and result generalization), all at a fraction of the initial training cost. In this talk, we will explore these techniques, known as Reinforcement Learning from Human Feedback (RLHF), and how open-source machine learning tools like PyTorch and Label Studio can be used to tune off-the-shelf models using direct human feedback.

PyData: Machine Learning & Stats

B07-B08
