BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//pretalx//pretalx.com//pyconde-pydata-berlin-2023//talk//AUJYP7
BEGIN:VTIMEZONE
TZID:CET
BEGIN:STANDARD
DTSTART:20001029T030000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=10
TZNAME:CET
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20000326T020000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=3
TZNAME:CEST
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:pretalx-pyconde-pydata-berlin-2023-AUJYP7@pretalx.com
DTSTART;TZID=CET:20230418T103000
DTEND;TZID=CET:20230418T110000
DESCRIPTION:Large generative models rely upon massive data sets that are c
 ollected automatically. For example\, GPT-3 was trained with data from “C
 ommon Crawl” and “WebText”\, among other sources. As the saying goes\, big
 ger isn’t always better. While powerful\, these data sets (and the model
 s they create) often come at a cost\, bringing their “internet-scale bias
 es” along with their “internet-trained models.” This raises the question
 : is unsupervised learning the best future for machine learning?\n\nML re
 searchers have developed new model-tuning techniques that address the kno
 wn biases within existing models and improve their performance (as measur
 ed by response preference\, truthfulness\, toxicity\, and result generali
 zation)\, all at a fraction of the initial training cost. In this talk\, w
 e will explore these techniques\, known as Reinforcement Learning from Hum
 an Feedback (RLHF)\, and how open-source machine learning tools like PyTor
 ch and Label Studio can be used to tune off-the-shelf models using direc
 t human feedback.
DTSTAMP:20260309T095309Z
LOCATION:B07-B08
SUMMARY:Improving Machine Learning from Human Feedback - Erin Mikail Staple
 s\, Nikolai
URL:https://pretalx.com/pyconde-pydata-berlin-2023/talk/AUJYP7/
END:VEVENT
END:VCALENDAR
