MozFest 2022

Low-resource languages, and their open source AI/ML solutions through a radical empathy lens
Language: English (mozilla)

Since the inception of the internet 30 years ago, its open infrastructure has fueled the current growth of AI/ML. AI/ML, in turn, has been influential in diversifying the internet in terms of linguistic and other forms of digital access. The caveat, however, is that the AI/ML infrastructure is driven far more by business than by civil society. That is one of the key reasons why the majority of the minorized (indigenous, endangered and low-resource) languages are sidelined. In its current state the landscape has become a labyrinth: anyone who wants to become a first-generation digital language-activist finds it difficult to answer the question "where do I start?".

This session features a MozFest Trustworthy AI working group project.


How will you deal with varying numbers of participants in your session? What if 30 participants attend? What if there are 3?

We are open to groups of any size, small or large, and would have no issues with either. Holding the discussions in English might be a barrier for some participants, and we are conscious of that challenge.

What language would you like to host your session in?

English

What happens after MozFest? We're hoping that many efforts and discussions will continue after MozFest. Share any ideas you already have for how to continue the work from your session.

We plan to share our contact details along with a GitHub page (or another public page) so that collaborations and conversations can continue beyond MozFest.

What is the goal and/or outcome of your session?
  • Understanding the status quo of low-resource languages, and of the creation of AI/ML solutions for them, through the lens of low-resource digital language-activists.
  • Auditing tools such as Common Voice to create a benchmark for low-resource languages.
  • Engaging low-resource digital language-activists in a long-term project to create a framework for the foundational technology needed to kickstart AI/ML projects in low-resource languages.

Why did you choose that space? How does your session align with the space description?

As our primary focus is to discuss and identify the needs and challenges of low-resource languages and their communities, the larger discourse on the decolonization of tech and society is where our conversations rightly fit. We are conscious of the wide spectrum of accessibility gaps caused by linguistic and cultural barriers, in addition to those related to physical and mental disabilities.

We think this space would allow us to engage other peers—activists, artists, academics and other researchers for long-term collaborations beyond the discussion.


Documentary filmmaker, researcher, language-archivist, and 2017 National Geographic Explorer. Recent: 📚 MarginalizedAadhaar, 🔓 OpenSpeaks