Goals
ClickHouse Team's Mission at PostHog
ClickHouse Team's mission is to provide a storage and query engine that meets these requirements:
- Meet the needs of the product today and in the future
- Maintain and optimize our current ClickHouse deployment
- Elastically scale our capacity with little effort
- Support multiple query quality of service (QoS) guarantees (real-time, batch, etc.)
- Data is stored once and queryable from the appropriate tool
- Queries are optimized for cost and performance
- Tunable execution performance to allow trade-offs between cost and performance
- Storage is durable
In service of this mission, our objectives for Q2 2026 are:
Objective 1: Spread load on ClickHouse across purpose-driven clusters Rory Shanks
Motivation: With everything in one cluster, load cannot be isolated to specific workloads or products; scaling for one workload means scaling for all the others
What we will ship:
- Ops cluster for all observability, non-customer-facing workloads
- Sessions cluster
- AI “heavy” events cluster
- (Insert other appropriate clusters here)
- (Zoo|Clickhouse)Keeper migration
- Cluster upgrades to get the gains that help us here
Objective 2: No single customer affects another customer Paweł Szczur
Motivation: A single expensive customer can still negatively impact all other customers
What we will ship:
- Every single query is tagged across multiple dimensions
- We control the query control plane (API between the app and ClickHouse)
- We are able to throttle or drop by team, product, priority, etc.
- We will have both a manual system and some form of automated system
- We will support Robbie and co. on query performance optimization
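To illustrate how tagging and throttle/drop decisions could fit together, here is a minimal sketch, not a description of the actual control plane. The names QueryTags, Rule and decide, the tag dimensions, and the example product and priority values are all hypothetical. Serialising the tags as JSON into ClickHouse's log_comment setting is one possible way to carry them into system.query_log; this page does not say which mechanism will be used.

```python
import json
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class QueryTags:
    """Hypothetical set of dimensions attached to every query."""
    team_id: int
    product: str
    priority: str  # assumed values such as "realtime" or "batch"

    def as_log_comment(self) -> str:
        # Stable JSON string that could be sent as the log_comment setting.
        return json.dumps(
            {"team_id": self.team_id, "product": self.product, "priority": self.priority},
            sort_keys=True,
        )


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: a field left as None matches any value."""
    action: str  # "throttle" or "drop"
    team_id: Optional[int] = None
    product: Optional[str] = None
    priority: Optional[str] = None

    def matches(self, tags: QueryTags) -> bool:
        return (
            (self.team_id is None or self.team_id == tags.team_id)
            and (self.product is None or self.product == tags.product)
            and (self.priority is None or self.priority == tags.priority)
        )


def decide(tags: QueryTags, rules: Sequence[Rule]) -> str:
    """Return the action of the first matching rule, or "allow" if none match."""
    for rule in rules:
        if rule.matches(tags):
            return rule.action
    return "allow"


# Example rule set: drop batch queries for one team, throttle one product.
rules = [
    Rule(action="drop", team_id=42, priority="batch"),
    Rule(action="throttle", product="session_replay"),
]
print(decide(QueryTags(team_id=42, product="product_analytics", priority="batch"), rules))
```

In this sketch the manual system would amount to a person editing the rule list, and an automated system would be something that adds or removes rules based on observed load. First-match-wins ordering is an arbitrary choice made to keep the example short.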
Objective 3: Operational tooling for managing our clusters is better Bryan Ciaraldi
Motivation: Normal operations and incidents are intensely time-consuming and slow all of us down.
What we will ship:
- Better standardisation of configs across envs (better, more enforced IaC)
- Make sure migrations are possible and ideally delightful in this topology
- HouseKeeper: AI tooling etc. that helps to debug live issues (and any other tooling we can find)
- Ship more info into the ops cluster so we can hit it harder and build strong dashboards
- MCP connection to the ops cluster via AWS Bedrock (so we never have to limit ourselves due to data protection)
Objective 4: Data deletion is self-serve (at least internally) Rory Shanks
Motivation: Customers ask for this all the time
What we will ship:
- Cluster updates to enable part rewriting on demand
- A proper concept for storing scheduled deletions, including the ability to visualise and debug them
- Some form of self-serve interface (to start with, this can be internal for support/ops)