PyCon UK 2019

What do travel, food & health websites have in common? Auditing websites & apps for privacy leaks
09-15, 17:00–17:30 (Europe/London), Assembly Room

Organizations with digital products that lack even the most basic data security practices are living in a utopian world where people leave their safe open and never expect a burglar to walk in.


With the advent of SaaS, companies are relying more and more on third-party services for CDNs, analytics, recommendations, loyalty, advertisements, email marketing, etc. Far less effort goes into checking what data is being shared with these third parties.

As an example:

It is a common practice to load fonts from third-party CDNs. But is it necessary for the website to share sensitive data like users' booking IDs in order to load the fonts from a CDN?

These data leaks are bad in themselves, but in the GDPR era companies could also face huge penalties for such accidental leaks.
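To make the font example concrete, here is a minimal sketch of the kind of check involved: given one captured request, look for known sensitive values in its URL and headers. The function name `find_leaks`, the hostnames and the booking ID are all hypothetical and chosen for illustration; this is not Babel's actual code.

```python
from urllib.parse import unquote


def find_leaks(request_url, headers, sensitive_values):
    """Return (location, value) pairs for each sensitive value found in a request.

    Illustrative helper only: it does plain substring matching on the
    percent-decoded URL and header values.
    """
    leaks = []
    decoded_url = unquote(request_url)
    for value in sensitive_values:
        if value in decoded_url:
            leaks.append(("url", value))
        for name, header_value in headers.items():
            if value in unquote(header_value):
                leaks.append(("header:" + name.lower(), value))
    return leaks


# Hypothetical capture: the page URL, including the booking ID, is sent to
# the font CDN in the Referer header.
font_request_url = "https://fonts.cdn.example/css?family=Roboto"
font_request_headers = {
    "Referer": "https://travel.example/booking?bookingId=ABC12345",
    "User-Agent": "demo-browser",
}

print(find_leaks(font_request_url, font_request_headers, ["ABC12345"]))
```

The font request itself looks harmless; the booking ID only shows up once the headers are inspected, which is why checking the URL alone is not enough.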

At PyCon UK I would like to showcase our work on Babel, a network analysis framework built on top of http://mitmproxy.org/. Think of it as Local-Sheriff, but outside the browser.

It cuts through the clutter of network inspection tools and gives users a search interface that exposes the ugly world of data collection and third parties, showing how sensitive data is shared with other companies when your app is used.

Everything is done locally, and no data is sent to our servers. (Actually, they don't even exist.)

Insights that Babel presents:
* Hostnames that the application connects to.
* Classification of those hostnames as first-party or third-party.
* Mapping of third-party domains to company names, using data from the open-source project WhoTracksMe.
* A local search interface to look for PII and see how it is being shared with different companies.
* URLs being shared with third parties via network headers or query parameters.
* Whether URLs that contain sensitive data are behind a login page or not.
* Values such as email IDs being shared with third parties.
* Configuration to flag a pre-defined list of values and hostnames that are not supposed to be transferred.
* Identifiers that do not look like PII but can be used to track the user across the internet, for example cookie syncing or long-term user identifiers.
* Adoption of basic security headers, checked via observatory.mozilla.org.
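
A few of the insights above (first-party versus third-party classification, mapping domains to companies, and spotting watched values in query parameters) can be sketched with the standard library alone. Everything here is an assumption made for illustration: the `TRACKER_OWNERS` table is a hypothetical stand-in for WhoTracksMe data, the "last two labels" rule is a naive substitute for a proper Public Suffix List lookup, and none of the names reflect Babel's real implementation.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical stand-in for a WhoTracksMe-style domain-to-company mapping.
TRACKER_OWNERS = {"analytics.example": "Example Analytics Inc."}


def registrable_domain(hostname):
    """Naive approximation: keep the last two labels of the hostname.

    A real tool would consult the Public Suffix List instead.
    """
    return ".".join(hostname.lower().split(".")[-2:])


def classify(url, first_party_domain):
    """Return (hostname, party, owner) for a request URL."""
    host = urlsplit(url).hostname or ""
    domain = registrable_domain(host)
    party = "first-party" if domain == first_party_domain else "third-party"
    return host, party, TRACKER_OWNERS.get(domain)


def shared_values(url, watchlist):
    """Return query parameters whose decoded value is on the watchlist."""
    params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    return [(key, value) for key, value in params if value in watchlist]


# Hypothetical captured requests for a site whose own domain is shop.example.
captured = [
    "https://www.shop.example/cart",
    "https://t.analytics.example/collect?uid=42&email=jane%40shop.example",
]
watchlist = {"jane@shop.example"}

for url in captured:
    print(classify(url, "shop.example"), shared_values(url, watchlist))
```

Note that `parse_qsl` percent-decodes values, so an email address sent as `jane%40shop.example` still matches the plain-text entry on the watchlist.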

Takeaways for the audience:
* Common pitfalls while using third-parties and how apps end up accidentally leaking sensitive data.
* How they can audit partners before implementing them in production.
* How they can audit their own apps and bring in privacy checks as part of their software life cycle.


Is your proposal suitable for beginners? – yes

Konark works as a tech lead at Cliqz GmbH, developing privacy-focused search engine and browser technologies under the Cliqz and Ghostery brands. Helping Cliqz GmbH make privacy a mainstream topic, Konark works on projects spanning privacy by design, anonymous data collection such as Human Web, the Human Web proxy network, anti-tracking, etc.

Prior to Cliqz, Konark worked at one of the largest e-commerce websites in India (Makemytrip.com) in the data platform and security team, solving interesting challenges related to data warehousing, business intelligence and data security.

As an active member of the community, he loves contributing and getting involved on various fronts in whatever way he can - be it through organizing conferences for like-minded people or just disrupting social causes through technology.

His recent personal projects, in an endeavor to find and help organizations fix vulnerabilities, have spanned web browsers, health trackers, government services and travel mobile apps, to name a few.

Konark has been a speaker and presenter at numerous international conferences, including Privacy Week, MRMCD, Apache Big Data and Berlin BuzzWords.

Twitter: @konarkmodi
