2023-05-18 –, Saphire B - PyData
More than 5 million people play Nordcurrent mobile games every month. The specificity of free-to-play games is that less than 10% of players make purchases. It is essential to retain paying players and keep them engaged as long as possible. To do that, we built a purchase prediction model.
We store data and make the most of feature engineering in Clickhouse. Apache Airflow orchestrates pipelines. Usually, we use CatBoost for Machine Learning. Pydantic and ClearML, on top of AWS S3, manage model files, training metrics, and configs. The quality in production is evaluated using dashboards in Apache Superset.
The architecture allows us to build fully reproducible ML pipelines. The learning process can be horizontally scaled to select the optimal hyperparameters. At the inference stage, you do not need to worry that the model was trained in some Jupiter Notebook, and it is unclear what to do if it suddenly breaks in a month.
Intermediate
What topics define your talk the best?:PyData
Senior Data Scientist at Nordcurrent. Before that, I was a Chief Product Officer and Head of ML at GOSU Data Lab. We built a Voice Assistant for Gamers. The company raised $5M in investments and was acquired in 2021.