Contributing to an open-source content library for NLP
04-19, 14:10–15:40 (Europe/Berlin), A03-A04

Bricks is an open-source content library for natural language processing that provides the building blocks to quickly and easily enrich, transform, or analyze text data for machine learning projects. For many Pythonistas, contributing to an open-source project seems intimidating. In this tutorial, we offer a hands-on experience in which programmers and data scientists learn how to code their own building blocks and easily share their creations with the community.


We will prepare some easy use cases so that attendees with novice-level machine learning and NLP skills can participate in the session. A basic understanding of Python is required, but everyone who wants to learn more about machine learning, NLP, or open-source contributions is welcome.

A brick is a modular piece of software that enriches, transforms, or analyzes text data for natural language processing, a sub-domain of machine learning. What sets a brick apart from a simple code snippet is its suitability for multiple execution environments. A brick module can also be executed in a demo playground, allowing users to try out different inputs to see if the brick meets their needs.

In this session, we will begin by outlining some ideas for building a brick. After fleshing out these ideas, we will make the code usable in different environments, such as the playground for testing inputs. Since spaCy is commonly used in NLP projects, we will also build a variant of the code that takes a spaCy document as input. Add some documentation, and voilà! You now have a brick.
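
To give a rough idea of what this workflow can look like, here is a minimal sketch of a brick-like module. It is only an illustration under stated assumptions: the function names, the e-mail pattern, and the file layout are hypothetical and not taken from the bricks repository. The core function works on a plain string, and the second variant relies only on the `.text` attribute that a spaCy `Doc` exposes, so a simple stand-in object is used here to keep the sketch runnable without spaCy installed.

```python
import re
from types import SimpleNamespace

# Hypothetical brick: names and structure are illustrative only,
# not the layout used in the bricks repository.
# Deliberately simple pattern; it does not cover every valid address.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def extract_emails(text: str) -> list[str]:
    """Core logic: takes a plain string, so it can run in any environment."""
    return EMAIL_PATTERN.findall(text)


def extract_emails_from_doc(doc) -> list[str]:
    """Variant for spaCy-style input: reads only the document's `.text` attribute."""
    return extract_emails(doc.text)


if __name__ == "__main__":
    sample = "Write to hello@example.com or team@example.org for details."
    print(extract_emails(sample))

    # Stand-in for a spaCy Doc (assumption: only `.text` is needed here).
    fake_doc = SimpleNamespace(text=sample)
    print(extract_emails_from_doc(fake_doc))
```

Keeping the logic in one plain-string function and wrapping it for other input types is one possible way to serve several execution environments from the same code.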


Expected audience expertise: Domain

Novice

Expected audience expertise: Python

Novice

Abstract as a tweet

Learn to build amazing open-source enrichments for natural language processing!

Public link to supporting material

https://github.com/code-kern-ai/bricks

Leonard Püttmann studied economics at the Hochschule Düsseldorf. During a specialization course there, he fell in love with all things ML, especially natural language processing. After graduating, he joined Kern AI as a data scientist and now works as a developer advocate, connecting people with topics like ML and programming.