RoboCon 2026

From Flaky Chaos to Clear Signals: PyCharm's UI Test Observatory
2026-03-05, RoboCon Online

The PyCharm QA team stopped chasing green builds and switched to an "observability over stability" approach. This talk shares our workflows for monitoring trends and tells the story of creating a 100% vibe-coded, stateless solution that builds real-time views from API requests, highlights similar failures, and draws attention to regressions.


Like many teams, we used to treat UI tests as something that must be green. For months after introducing them, the PyCharm QA team fought flakiness, managed mutes across environments, and tried to keep up with monorepo changes from hundreds of developers. We shifted to monitoring trends instead of day-to-day statuses and chose a bird’s-eye view of the system over inspecting single failures in a specific build or environment.

This talk shares our approach and the lightweight tool that enables it. The TestKeeper Service is a 100% vibe-coded solution with no FTE spent. Its stateless architecture builds views in real time from API requests, with no deployment or database maintenance. Instead of showing which tests failed, our service focuses on trends, highlights similar failures, and draws attention to cases where we should reproduce the failure manually.
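To make "highlights similar failures" concrete, here is a minimal sketch of one way a stateless view could group failures. It is not the TestKeeper implementation, which the abstract does not describe. The record shape (`test`, `status`, `message`) and the function names are assumptions made for illustration. The records are imagined as already fetched from a CI API on each request, and nothing is stored between calls.

```python
import re
from collections import defaultdict


def failure_signature(message: str) -> str:
    """Normalize a failure message so that similar failures share one key.

    Hypothetical heuristic: volatile details (hex addresses, numbers,
    whitespace, letter case) are masked so only the stable shape remains.
    """
    sig = re.sub(r"0x[0-9a-fA-F]+", "<hex>", message)
    sig = re.sub(r"\d+", "<n>", sig)
    return re.sub(r"\s+", " ", sig).strip().lower()


def group_similar_failures(records):
    """Group failed tests by normalized message.

    `records` is an iterable of dicts with the assumed keys
    "test", "status", and "message". The function is pure: the same
    input always yields the same view, so no database is required.
    """
    groups = defaultdict(list)
    for rec in records:
        if rec["status"] == "failed":
            groups[failure_signature(rec["message"])].append(rec["test"])
    return dict(groups)


if __name__ == "__main__":
    sample = [
        {"test": "test_open", "status": "failed",
         "message": "Timeout after 30 s waiting for dialog 12"},
        {"test": "test_close", "status": "failed",
         "message": "Timeout after 45 s  waiting for dialog 7"},
        {"test": "test_run", "status": "passed", "message": ""},
    ]
    for signature, tests in group_similar_failures(sample).items():
        print(signature, tests)
```

Because the grouping is a pure function of the fetched records, the view can be rebuilt on every request, which is the property that makes a deployment-free, database-free design plausible.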

Attendees will learn the following:

  • The workflows we developed to enable the observability approach and complement the tool: recognising typical patterns of trends, standard steps to reproduce the issue, and distinguishing problems in the product from defects in tests
  • Real cases from PyCharm: how we manage to spot and catch regressions against the background noise of flakiness
  • Guardrails that we use to balance extending the coverage and fixing defects in tests, in addition to our overall approach to developing new tests
  • How a stateless, zero-FTE, API-based service can deliver a significant impact, and how to apply a similar design in your context
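As a rough illustration of spotting a regression against background flakiness, the sketch below compares a test's recent failure rate with its earlier baseline. The window size, the threshold, and the label names are illustrative assumptions, not values or terminology from the PyCharm team.

```python
def failure_rate(statuses):
    """Fraction of runs with status "failed"; 0.0 for an empty history."""
    if not statuses:
        return 0.0
    return sum(1 for s in statuses if s == "failed") / len(statuses)


def classify_trend(statuses, recent=10, jump=0.3):
    """Classify one test's history, given oldest-first run statuses.

    Assumed heuristic: a test that fails at a steady rate is treated as
    noise, while a sharp rise in the last `recent` runs compared with the
    earlier baseline is flagged for manual reproduction.
    """
    if len(statuses) <= recent:
        return "insufficient-data"
    baseline = failure_rate(statuses[:-recent])
    current = failure_rate(statuses[-recent:])
    if current - baseline >= jump:
        return "possible-regression"
    if current > 0 or baseline > 0:
        return "flaky-noise"
    return "stable"


if __name__ == "__main__":
    flaky = ["passed", "passed", "passed", "failed"] * 5
    broken = ["passed"] * 10 + ["failed"] * 10
    print(classify_trend(flaky))
    print(classify_trend(broken))
```

A rule this simple will misclassify some cases, which is why the workflows listed above pair the trend view with standard manual reproduction steps.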

The main goal of the talk is to provide evidence that observability over stability is a valid direction for developing a testing framework, especially for UI tests and complex systems. I want to show colleagues a better alternative to spending man-hours on fixing flaky tests, and how a vibe-coded internal tool became a game changer in the quality assurance infrastructure of PyCharm.


Categorize / Tags:

ui test automation, observability, trends, flaky tests, regressions, internal tools, ide testing, continuous integration

Is this suitable for ..?: Beginner RF User, Intermediate RF User

Describe your intended audience:

Test Automation Engineers, QA Engineers, QA Team Leads

Denis Mashutin is a Software Test Automation Engineer at JetBrains, responsible for PyCharm’s UI tests. A serial career switcher, he transitioned from Arabic technical translation in the Middle East into software development and has since held several roles in IT: technical writer, documentation lead and DocOps engineer, QA engineer, and test automation engineer. Along the way, he introduced docs‑as‑code practices, built automation frameworks, and developed tools to accelerate development workflows and ensure quality. He is now focused on building a robust, informative UI test infrastructure for PyCharm, with an emphasis on improving release quality and developer experience.