MozFest 2022

Analyzing shadow-ban on TikTok: the "TikTok Observatory"
Language: English (mozilla)

The workshop will show the participants how to collect and analyze data to investigate TikTok's algorithm.

In the U.S., there is evidence that TikTok has become a hub for political disinformation. Even more concerningly, there are suspicions that the algorithm is explicitly biased to align with the political interests of the CCP (the Chinese Communist Party).

These accusations are serious, but they rely only on leaked information and anecdotal evidence, and the company denies them. The system remains entirely opaque, and civil society and regulators currently have no tools or mechanisms to collect reliable evidence and hold the platform accountable.

The TikTok Observatory is an open-source platform to monitor TikTok's recommendation algorithm behavior and personalization patterns.

It enables researchers and journalists to investigate which content is promoted or demoted on the platform, including content regarding politically sensitive issues.

With the browser extension installed, every TikTok video watched from that browser is saved to a personal page, along with the suggested videos. Later, the investigator can retrieve the evidence in CSV format through the public API to compare two different profiles.
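As a rough sketch of how the retrieved evidence might be handled, the snippet below parses a CSV export into rows ready for analysis. The column names (`videoId`, `author`, `watchedAt`) and the sample payload are hypothetical illustrations, not the Observatory's actual schema:

```python
import csv
import io

# Hypothetical CSV payload, as the public API might return it;
# the column names are illustrative, not the real schema.
SAMPLE_CSV = """videoId,author,watchedAt
101,alice,2022-01-10
102,bob,2022-01-10
103,carol,2022-01-11
"""

def parse_evidence(csv_text):
    """Parse the CSV evidence into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))

rows = parse_evidence(SAMPLE_CSV)
print(len(rows))          # → 3 (number of observed videos)
print(rows[0]["author"])  # → alice
```

In practice, the same function would be fed the text downloaded from the API rather than an inline string.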

In the tutorial, we'll collect data and then demonstrate our toolchain for data analysts (based on Gephi and Python notebooks), or a more straightforward alternative such as LibreOffice.
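A minimal version of the profile comparison, measuring how much two profiles' recommendation feeds overlap, could look like the following sketch. The video IDs are made up; a real notebook would load them from the exported CSVs:

```python
def jaccard_overlap(feed_a, feed_b):
    """Jaccard similarity between two sets of recommended video IDs:
    1.0 means identical feeds, 0.0 means no video in common."""
    a, b = set(feed_a), set(feed_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Made-up recommendation feeds for two observed profiles.
profile_1 = ["v1", "v2", "v3", "v4"]
profile_2 = ["v3", "v4", "v5", "v6"]

print(jaccard_overlap(profile_1, profile_2))  # → 0.3333333333333333
```

The same pairwise overlaps can also be exported as an edge list and visualized as a graph in Gephi.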


What is the goal and/or outcome of your session?:

The session's goal is to install the browser extension, understand precisely what data it collects, retrieve that data through the public API, and analyze it with a minimal analysis pipeline.

An introduction to the available literature on TikTok's algorithm will show why an independent analysis of shadow-banned words is needed. Replicating the data collection and the analysis pipeline will then allow participants to repeat the entire methodology autonomously, actively contributing to the TikTok Observatory.

In general, involving more people in this open-source project will advance one of our team's missions: building an active community that monitors censorship on the platform with adequate technologies.

We hope to generate a genuine, multicultural discussion about which words and which countries our follow-up research should focus on.

Why did you choose that space? How does your session align with the space description?:

Social media platforms increasingly use shadow-banning to limit the spread of what they consider borderline content, such as nudity, disinformation, or violence. It is a middle ground between deleting content and doing nothing.

We are not arguing that there should be more or less moderation, nor debating what should be considered disinformation or where the line should be drawn to define pornography.
Instead, we demand that any such form of algorithmic curation be transparent. This is an important safeguard to prevent opaque moderation pipelines from being abused by platforms, governments, or coordinated attackers to silence political speech or social movements.

With our project, we want to achieve good visibility and weigh in on the current regulatory reforms meant to make platforms more accountable to their users.

How will you deal with varying numbers of participants in your session? What if 30 participants attend? What if there are 3?:

The larger and more diverse the audience, the more interesting the research we can produce collectively.
Each participant will have the chance to go through the whole procedure, independently testing the shadow-banning of selected words, and at the end of the workshop we will discuss the overall results collectively.
A smaller group will instead allow us to address each participant's questions individually and in greater depth.
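A toy version of the per-word test each participant would run could look like the sketch below: count how often videos tagged with a given word actually surface across the observed feeds. A suspiciously low frequency for an otherwise popular word flags it for closer analysis. The hashtag sets here are made up for illustration:

```python
def word_frequency(feeds, word):
    """Fraction of observed feeds in which at least one video
    carries the given hashtag."""
    hits = sum(1 for hashtags in feeds if word in hashtags)
    return hits / len(feeds)

# Made-up hashtag sets, one per observed feed.
feeds = [
    {"dance", "music"},
    {"dance", "news"},
    {"music"},
    {"dance"},
]
print(word_frequency(feeds, "dance"))  # → 0.75
print(word_frequency(feeds, "news"))   # → 0.25
```

Comparing such frequencies between profiles, or against a word's known popularity, is the kind of result the closing collective discussion would examine.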

What happens after MozFest? We're hoping that many efforts and discussions will continue after MozFest. Share any ideas you already have for how to continue the work from your session.:

In the coming months, we will launch a data-donation campaign in which willing users will contribute anonymized data.
This will allow us to monitor volunteer users' recommendations in the TikTok mobile app and test how the algorithm and its personalization patterns behave in real conditions, which will also be a novel contribution.

We are particularly interested in exploring non-English content and countries where political censorship is more widespread. We therefore hope for a multicultural audience, aware enough of their own political contexts to point our investigations in the right direction. That said, help from speakers of English and other European languages remains precious: unfortunately, no country is immune to algorithmic censorship nowadays.

In most of these cases, there are no public investigations holding TikTok accountable for its moderation practices. Together, we can!

What language would you like to host your session in?:

English