Steven Kolawole has his technical skillset cuts across Data Science and Software Engineering, with a bias for ML Research these days. His research interests focus on resource-efficient machine learning in terms of computational resources and low-resource/limited labeled data.
He is and has been heavily involved in varieties of ML subfields including ML Engineering, Software Engineering, Data Engineering, Data Science/Analytics, and Cloud Computing.
Steven is also big on knowledge sharing via community mentorship and collective growth, open-source development, meetups facilitation, speakership, technical writing, research, and he gets kicks from helping tech muggles find their feet.
The use of transfer learning has begun a golden era in applications of ML but the development of these models “democratically” is still in the dark ages compared to best practices in SWE. I describe how methods of open-source SWE can allow models to be built by a distributed community of researchers.
Over the past few years, it has become increasingly common to use transfer learning when tackling machine learning problems (e.g. the BERT model on HuggingFace Hub has been downloaded tens of millions of times). However, pre-training often involves training a large model on a large amount of data. This incurs substantial computational (and therefore financial) costs; for example, Lambda estimates that training the GPT-3 language model would cost around $4.6 million. As a result, the most popular pre-trained models are being created by small teams within large, resource-rich corporations. This means that the majority of the research community is excluded from participating in the design and creation of these valuable resources.
Here, I elaborate on why we should develop tools that will allow us to build pre-trained models in the same way that we build open-source software. Specifically, models should be developed by a large community of stakeholders who continually update and improve them. Realizing this goal will require porting many ideas from open-source software development to building and training models, which motivates many threads of interesting research and opens up machine learning research for much larger participation.