Titanium [2nd Floor]
A data scientist builds a Streamlit or Dash prototype, the business wants to validate it, and the hard parts begin: getting access to live data, making the app available company-wide, and ensuring every user only sees what they are allowed to see. Following "best practices" turns a simple demo into weeks of platform work, leaving data scientists frustrated and blocking them from shipping apps to end users.
In this talk we will live-demo Merck's self-service app platform, which we have developed and hardened over multiple years. It lets teams deploy Streamlit (and friends) in 3 minutes while meeting best practices like SSO, CI/CD, and governed data access control. The platform has become essential for Merck to ship data apps at scale: in 2025 it powered 750+ active apps reaching 8,000+ unique end users.
Under the hood, we show how a use-case-based access model enables scoped resource permissions, so apps can safely access data on behalf of the user. We also show starter templates that generate a deployable Git repo with example pages (e.g. Snowflake access or an internal LLM chatbot). Finally, we cover the guardrails needed to operate this safely.
What you will learn: a cost-effective reference architecture based on AWS that you can adapt to your hyperscaler or platform, practical patterns for balancing the trade-off between central control and decentralized freedom, and how templates and CI/CD help teams iterate quickly without compromising security or reliability.
This session is for anyone who has built a Streamlit (or Dash, R Shiny, FastAPI, React) prototype and then hit the wall when it needed to be shared with real users: access to live data, SSO, permissioning, deployment, and operational guardrails.
We will present the workflow and the architecture from both sides: as a data scientist shipping an app, and as a platform admin operating the service safely at scale.
What we will demo
We will demo the end-to-end workflow from zero to a running app using our internal app service. The platform includes a web console for self-service provisioning and configuration, and a deployment runtime that manages the state of the application.
- Using the web console to create and configure a new app from a framework template (Streamlit, Dash, R Shiny, FastAPI, React).
- How a Git repository is created and the first version is deployed behind the scenes, including a working starter app with example pages.
Key design decisions (the parts that are usually hard)
- Identity propagation: the app receives the signed-in user identity from SSO and uses it for downstream authorization.
- Authorization at the data layer: dataset permissions are scoped to the use-case resource, so a token cannot be used beyond its intended scope.
- Safe multi-tenancy: per-app isolation plus resource limits to prevent noisy-neighbor problems.
- Repeatable delivery: templates plus CI/CD conventions so a new app starts from a working, deployable baseline.
- Day-2 operations: guardrails like quotas, rate limiting, and idle shutdown to keep the platform reliable and cheap.
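To make the first two design decisions concrete, here is a minimal sketch of identity propagation combined with use-case-scoped authorization. It assumes an SSO-aware reverse proxy injects the verified user identity as a request header; the header name `X-Forwarded-User`, the permission structure, and all user/dataset names are illustrative, not the platform's actual implementation.

```python
# Hypothetical use-case-scoped permissions: each use case grants its
# members access to a fixed set of datasets. Tokens minted for an app
# are bound to the use case, so they can only reach these datasets.
USE_CASE_PERMISSIONS = {
    "sales-dashboard": {
        "members": {"alice@example.com", "bob@example.com"},
        "datasets": {"SALES_DB.PUBLIC.ORDERS"},
    },
}

def resolve_user(headers: dict) -> str:
    """Read the signed-in identity propagated by the SSO proxy."""
    user = headers.get("X-Forwarded-User")  # illustrative header name
    if not user:
        raise PermissionError("no authenticated identity in request")
    return user

def authorize(use_case: str, user: str, dataset: str) -> bool:
    """Allow access only if the user belongs to the use case AND the
    dataset lies inside the use case's scoped permission set."""
    grant = USE_CASE_PERMISSIONS.get(use_case)
    if grant is None:
        return False
    return user in grant["members"] and dataset in grant["datasets"]

user = resolve_user({"X-Forwarded-User": "alice@example.com"})
print(authorize("sales-dashboard", user, "SALES_DB.PUBLIC.ORDERS"))   # True
print(authorize("sales-dashboard", user, "HR_DB.PUBLIC.SALARIES"))    # False
```

The key property is that the authorization decision takes both the user and the use case as inputs, so even a leaked app credential cannot reach data outside its scope.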
Running at scale
- Production usage: 750+ active apps and 8k+ unique end users (2025).
- Infrastructure run rate under 10k USD per month (excluding engineering time).
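One guardrail that keeps the run rate low is idle shutdown: stop an app's runtime once it has gone unused for a while. A minimal sketch of the check, with an assumed 30-minute threshold (the actual timeout and mechanism are not specified in this talk description):

```python
import time

# Assumed idle threshold; the real platform's value may differ.
IDLE_TIMEOUT_SECONDS = 30 * 60

def should_shut_down(last_request_at: float, now: float,
                     timeout: float = IDLE_TIMEOUT_SECONDS) -> bool:
    """Return True once the app has been idle longer than the timeout."""
    return (now - last_request_at) > timeout

now = time.time()
print(should_shut_down(now - 3600, now))  # idle for an hour -> True
print(should_shut_down(now - 60, now))    # active a minute ago -> False
```

A scheduler would run this check periodically per app and scale the runtime to zero when it returns True, restarting it on the next incoming request.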
Who should attend
- Data scientists and analysts who want to ship apps beyond a demo.
- Data platform and DevOps engineers building self-service tooling for governed environments.
- Teams standardizing how internal data & AI products are delivered to business users.
Takeaways
- For data scientists: what a good internal app hosting platform should provide, and which requirements you should ask your platform team for (governed on-behalf-of data access, templates, CI/CD, guardrails).
- For platform teams: a blueprint you can adapt beyond AWS, including the architecture and tradeoffs necessary to operate fine-grained authorization and a multi-tenant runtime at scale.
If you do not have such an app platform in your company yet, use this talk as a checklist to start the conversation with your IT or platform teams. :-)
Bernhard is a Senior Data Scientist at Merck with a PhD in deep learning and over 7 years of experience in applying data science and data engineering within different industries. For more information you can connect with him on LinkedIn. 🙂
As the Global Head of Platform Products Portfolio, Nicolas leads high-performing teams that design, implement, and maintain Merck's global data, analytics, and AI ecosystem UPTIMIZE.