Methods for Text Style Transfer: Text Detoxification Case PyCon DE & PyData Berlin 2023

Methods for Text Style Transfer: Text Detoxification Case

2023-04-18 14:10–14:40, A1

Global access to the Internet has enabled the spread of information throughout the world and opened up many new possibilities. However, the exponential and uncontrolled growth of user-generated content has also facilitated the spread of toxicity and hate speech. Much work has been done on offensive speech detection. There is, however, a more proactive way to fight toxic speech: suggesting a detoxified version of the message to the user. In this presentation, we will give an overview of how the text detoxification task can be solved. The proposed approaches can be reused for any text style transfer task in both monolingual and multilingual use cases.

First, we will briefly introduce the research direction of NLP for Social Good. Then, we will present the main directions of research in the field of text style transfer. This field suffers from a lack of parallel data. We will describe our approach to collecting such a parallel dataset and show that it can be applied to any language. Next, we will show how monolingual, multilingual, and cross-lingual models can be trained for text detoxification. Finally, we will discuss the ethical issues connected with this task and with tackling toxic and hate speech in general. All of the presented work is based on peer-reviewed papers from the ACL and EMNLP conferences.
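
To make the idea of a parallel detoxification corpus concrete, here is a minimal Python sketch of one way candidate (toxic, neutral) pairs could be filtered. It is an illustration under stated assumptions, not the collection pipeline presented in the talk: the word list `TOXIC_WORDS`, the helpers `is_toxic`, `content_overlap`, and `filter_pairs`, and the `min_overlap` threshold are all hypothetical, and a real setup would likely rely on a trained toxicity classifier and a semantic similarity model instead of a lexicon and token overlap.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration only);
# a real pipeline would use a trained toxicity classifier instead.
TOXIC_WORDS = {"idiot", "stupid", "dumb"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def is_toxic(text):
    """Flag a text as toxic if it contains any word from the toy lexicon."""
    return any(token in TOXIC_WORDS for token in tokenize(text))


def content_overlap(source, rewrite):
    """Jaccard overlap of non-toxic tokens, a crude proxy for meaning preservation."""
    src = set(tokenize(source)) - TOXIC_WORDS
    tgt = set(tokenize(rewrite)) - TOXIC_WORDS
    if not src and not tgt:
        return 1.0
    return len(src & tgt) / len(src | tgt)


def filter_pairs(candidates, min_overlap=0.5):
    """Keep (toxic, neutral) pairs whose rewrite is non-toxic and close in content."""
    kept = []
    for source, rewrite in candidates:
        if (
            is_toxic(source)
            and not is_toxic(rewrite)
            and content_overlap(source, rewrite) >= min_overlap
        ):
            kept.append((source, rewrite))
    return kept


# Made-up candidate pairs, e.g. rewrites proposed by annotators.
candidates = [
    ("this is a stupid idea", "this is a bad idea"),    # valid detoxification
    ("you are an idiot", "the weather is nice today"),  # content not preserved
    ("what a dumb plan", "what a dumb plan"),            # rewrite still toxic
]
parallel_corpus = filter_pairs(candidates)
```

With these toy inputs only the first pair passes both checks. The two checks mirror the requirements usually placed on a style transfer pair: the target must carry the new style, and the content must stay the same.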

Expected audience expertise: Domain:

Intermediate

Expected audience expertise: Python:

Intermediate

Abstract as a tweet:

How to detoxify texts? How to collect a parallel corpus for a text style transfer task? How to transfer the knowledge of a style between languages? We answer these questions in this talk.

Public link to supporting material:

https://github.com/dardem/text_detoxification

Daryna Dementieva

I am a postdoctoral researcher at TUM. Currently, I am involved in a project on eXplainable AI. In 2022, I obtained my PhD under the supervision of Prof. Alexander Panchenko at Skoltech. My PhD research addressed such important societal issues as Fake News Detection and Text Detoxification. More broadly, I am very interested in the NLP for Social Good research direction. Besides my academic experience, I have also been involved in several industrial projects at different companies: Visiology, Moscow, Russian Federation; Beiersdorf, Hamburg, Germany. The industrial experience I gained now helps me a lot in my research.
