2025-09-19, Space 2
What do you do when you need to build data products, but don’t have any data? Learn how to generate complex synthetic data using Faker. We will generate typical user journeys through a website, creating a synthetic data stream that could be used for development, testing or modelling.
Synthetic data is useful when you need to start working with data but cannot access the underlying data. You may be waiting for access, or the data may be PII, confidential or regulated, so real data can't or shouldn't be used during development. In these situations, being able to generate realistic data for development, testing and even modelling can get you around blockers.
The specific use case we’ll be looking at is event data generation from website visitor tracking. In this situation we usually have a very good idea of the structure of event data we’ll be working with and even typical user journeys. We can build quite complex synthetic data that mimics website visitor traffic and even add interesting statistical patterns that mirror the variety of pathways users take through a website.
We'll primarily use Faker, a Python library designed to generate synthetic data. We will work through a project that takes multiple specifications configuring user journeys, event schemas and randomness, and generates a stream of events that match them. This stream can be used for development, testing or data modelling, and removes the need to use live data.
Outline
1. Understanding the use case
2. Faker to the rescue
3. Fake data initialisation
4. Faker internals
5. Creating the fake event stream
Intermediate
I lead the Engineering function at Tasman Analytics, a boutique data consultancy. We act as an interim/fractional data team and are passionate about helping clients leverage the power of their data.
Personally, I have a background in mechanical engineering and have worked across a range of sectors including sustainability, energy, property, construction and architecture. I am an engineer at heart and perennially look to hone the craft of engineering.