PyCon UK 2019

Stranger things in Twitterverse
2019-09-15, Ferrier Hall

Uncovering Twitter troll armies: using Python to monitor and analyze millions of tweets and identify suspicious entities intent on skewing online conversations and spreading misinformation.


Using the Twitter Streaming API, we collected and analysed millions of tweets over a period of 3 months to identify suspicious entities such as bots and their impact on online conversations. We aim to understand the evolution, affiliation and participation of trolls who, directly or indirectly, skew public opinion and spread misinformation.

In our talk, based on the analysis, we would like to share the following:
* Identification of trending topics along with for-and-against hashtags surrounding a given topic.
* Classification of users based on the likelihood of suspicious activity
* Monitoring of above users’ accounts to understand their individual and collective impact and implication on steering of the online conversation
* Tools used to manage, collect and analyze the dataset:
    * Python3
    * Pandas
    * Postgres
    * Jupyter
    * Bokeh etc.
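
To make the idea of classifying users by likelihood of suspicious activity concrete, here is a minimal sketch using only the standard library. The features (account age, tweet rate, follower ratio, default profile image), the weights and the 0.5 threshold are all illustrative assumptions, not the criteria used in the talk.

```python
from dataclasses import dataclass


@dataclass
class UserStats:
    """Hypothetical per-user features; the field names are assumptions."""
    handle: str
    account_age_days: int
    tweets_per_day: float
    followers: int
    following: int
    has_default_profile_image: bool


def suspicion_score(user: UserStats) -> float:
    """Return a score in [0, 1]; higher means more bot-like under these toy rules."""
    score = 0.0
    if user.account_age_days < 30:        # very new account
        score += 0.3
    if user.tweets_per_day > 100:         # implausibly high tweet rate
        score += 0.3
    if user.following > 0 and user.followers / user.following < 0.1:
        score += 0.2                      # follows many, followed by few
    if user.has_default_profile_image:
        score += 0.2
    return round(score, 2)


def classify(user: UserStats, threshold: float = 0.5) -> str:
    """Label a user as 'suspicious' or 'regular' (threshold is an assumption)."""
    return "suspicious" if suspicion_score(user) >= threshold else "regular"
```

A real classifier would likely combine many more signals and be validated against labelled accounts; this only shows the shape of a rule-based first pass.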

Based on our learnings, we have created a framework on top of the Python ecosystem that monitors tweets in real time and comes with the following features:
* An interface for easily configuring the keywords/hashtags to be monitored
* Real-time classification of users and monitoring of their activity
* Identification of popular hashtags/keywords, which are fed back for monitoring their usage
* A dashboard that displays the following insights:
    * Number of unique tweets
    * Percentage of tweets from suspicious users
    * Volume of tweets from suspicious users contributing to popular hashtags
    * Flagging of compromised hashtags
    * Top Users
    * Top Hashtags
    * Top Keywords
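
As a rough illustration of how such dashboard figures could be derived, the sketch below computes a few of them from a small in-memory list of tweets. The tweet fields (`id`, `user`, `hashtags`), the function name `dashboard_metrics` and the rule that a hashtag counts as compromised when at least half of its tweets come from suspicious users are assumptions made for this example, not details of the framework.

```python
from collections import Counter


def dashboard_metrics(tweets, suspicious_users, flag_threshold=0.5, top_n=3):
    """Compute toy dashboard figures from tweets given as dicts."""
    # De-duplicate by tweet id so repeated deliveries are counted once.
    unique = {t["id"]: t for t in tweets}
    total = len(unique)

    all_tags = Counter()         # tweets per hashtag, all users
    suspicious_tags = Counter()  # tweets per hashtag, suspicious users only
    suspicious_count = 0
    for t in unique.values():
        tags = {h.lower() for h in t["hashtags"]}  # case-insensitive hashtags
        all_tags.update(tags)
        if t["user"] in suspicious_users:
            suspicious_count += 1
            suspicious_tags.update(tags)

    # Assumed rule: flag a hashtag when the share of its tweets coming
    # from suspicious users reaches flag_threshold.
    flagged = sorted(
        tag for tag, n in all_tags.items()
        if suspicious_tags[tag] / n >= flag_threshold
    )
    return {
        "unique_tweets": total,
        "pct_suspicious": round(100 * suspicious_count / total, 1) if total else 0.0,
        "top_users": Counter(t["user"] for t in unique.values()).most_common(top_n),
        "top_hashtags": all_tags.most_common(top_n),
        "flagged_hashtags": flagged,
    }


# Made-up sample data, including one duplicated tweet (id 3).
sample_tweets = [
    {"id": 1, "user": "alice", "hashtags": ["Vote"]},
    {"id": 2, "user": "bot1", "hashtags": ["vote", "Rigged"]},
    {"id": 3, "user": "bot1", "hashtags": ["rigged"]},
    {"id": 3, "user": "bot1", "hashtags": ["rigged"]},
    {"id": 4, "user": "bob", "hashtags": ["vote"]},
]
metrics = dashboard_metrics(sample_tweets, suspicious_users={"bot1"})
```

In a deployment along the lines described above, the same aggregation would more plausibly run as queries over the stored tweets rather than over a Python list.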

Takeaways for the audience
* Awareness of open Twitter data and its research possibilities.
* A quick overview of how pervasive bots are and how they impact online conversations.
* A self-service tool, built entirely in the Python ecosystem, that empowers users and researchers to verify the authenticity of Twitter conversations.


Is your proposal suitable for beginners?: yes

Konark works as a Tech Lead at Cliqz GmbH, developing privacy-focused search engine and browser technologies under the Cliqz and Ghostery brands. Helping Cliqz GmbH make privacy a mainstream topic, Konark works on projects spanning privacy by design, anonymous data collection such as Human Web, the Human-web proxy network, anti-tracking, etc.

Prior to Cliqz, Konark worked at one of the largest e-commerce websites in India (Makemytrip.com) in the data platform and security team, solving interesting challenges related to data warehousing, business intelligence and data security.

As an active member of the community, he loves contributing and getting involved on various fronts in whatever way he can, be it through organizing conferences for like-minded people or just disrupting social causes through technology.

His recent personal projects, in an endeavor to find vulnerabilities and help organizations fix them, have spanned web browsers, health trackers, government services and travel mobile apps, to name a few.

Konark has been a speaker and presenter at numerous international conferences, such as Privacy Week, MRMCD, Apache Big Data and Berlin BuzzWords.

Twitter: @konarkmodi

As a digital product enthusiast, I have worked in different roles in the product management domain, with expertise in optimizing e-commerce conversion funnels. At present, I work as a Product Analyst at FlixBus, one of Europe's largest e-commerce based mobility solutions.

Outside the office, the ever-interesting intersection of the humanities and technology continues to inspire me. I like to focus on social media and its impact on shaping political landscapes.

Twitter: @Pi_modi