Data & Operations · Shipped · Private production system

Real-Time Telemetry Platform

Real-time telemetry at scale: 20+ event-driven ETL pipelines feeding 30+ code-generated dashboards, on a Kubernetes cluster built to stay up.

  • 30+ live dashboards
  • 20+ ETL pipelines
  • 50k+ records / day

Context

Industrial equipment never stops talking: machines on the line emit a steady stream of process telemetry, around the clock. None of it is useful until it lands somewhere queryable and shows up on a screen an operator actually watches. A drift caught on a live run chart is a scrap batch that never ships, and missing that signal is the expensive case. So the data has to move reliably, and the dashboards have to stay honest.

What I built

A real-time telemetry backbone: 20+ event-driven ETL pipelines that move equipment data off the line into a time-series store, feeding 30+ Grafana dashboards that operators and process engineers watch in real time. The pipelines are C# file-watcher services. The dashboards are generated from code, not clicked together by hand. Together they process 50,000+ records a day.
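As a rough illustration of what one pipeline does with a file once it lands, here is a minimal sketch in Python. The production services are C#, so this is not their code. The CSV layout (`timestamp,machine,measure,value`), the `machine` dimension, and the function names `to_records` and `batches` are all assumptions made for the example. The record shape follows the Timestream write API, which accepts at most 100 records per `WriteRecords` request.

```python
import csv
import io
from datetime import datetime, timezone

# Timestream accepts at most 100 records per WriteRecords request.
MAX_RECORDS_PER_WRITE = 100


def to_records(csv_text):
    """Convert CSV rows into Timestream-shaped record dicts.

    Assumes a hypothetical layout: timestamp,machine,measure,value
    with ISO 8601 timestamps that are already in UTC.
    """
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        ts = datetime.fromisoformat(row["timestamp"]).replace(tzinfo=timezone.utc)
        records.append({
            "Dimensions": [{"Name": "machine", "Value": row["machine"]}],
            "MeasureName": row["measure"],
            "MeasureValue": str(float(row["value"])),
            "MeasureValueType": "DOUBLE",
            "Time": str(int(ts.timestamp() * 1000)),
            "TimeUnit": "MILLISECONDS",
        })
    return records


def batches(records, size=MAX_RECORDS_PER_WRITE):
    """Yield consecutive slices small enough for a single write request."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


if __name__ == "__main__":
    sample = (
        "timestamp,machine,measure,value\n"
        "1970-01-01T00:00:01,press-1,temp_c,71.5\n"
    )
    for batch in batches(to_records(sample)):
        # A real service would hand each batch to the Timestream write client here.
        print(len(batch), batch[0]["MeasureName"])
```

Batching at the parse step keeps the write call itself trivial: each slice can go straight to the client without any further size checks.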

Architecture

C# file-watcher services pick up equipment output as it lands and push it into AWS Timestream, with MinIO holding the raw artifacts. On the read side, a Python build script generates the Grafana dashboards: every panel’s query against Timestream is defined in code, so 30+ dashboards stay version-controlled, diffable, and regenerable. The pipeline fleet runs on a K3s Kubernetes cluster with multi-node failover, so a single node dropping out doesn’t take telemetry down with it. I validated the migration onto the cluster with a 24-hour soak test before it carried production load.
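To make "dashboards generated from code" concrete, here is a minimal sketch of what such a build script could look like. It is not the actual script. The helper names `panel` and `dashboard`, the datasource `uid`, the measure names, and the two-column layout are assumptions for illustration, and the query relies on the `$__database`, `$__table`, and `$__timeFilter` macros as I understand the Grafana Timestream plugin to provide them. A real dashboard JSON carries more fields than shown here.

```python
import json

# Assumed datasource reference; the uid is a placeholder for illustration.
DATASOURCE = {"type": "grafana-timestream-datasource", "uid": "timestream"}


def panel(panel_id, title, measure, x, y):
    """Build one time-series panel whose query is defined entirely in code."""
    query = (
        "SELECT time, measure_value::double AS value "
        "FROM $__database.$__table "
        f"WHERE measure_name = '{measure}' AND $__timeFilter "
        "ORDER BY time"
    )
    return {
        "id": panel_id,
        "type": "timeseries",
        "title": title,
        "datasource": DATASOURCE,
        "gridPos": {"h": 8, "w": 12, "x": x, "y": y},
        "targets": [{"refId": "A", "rawQuery": query}],
    }


def dashboard(title, measures):
    """Lay panels out two per row on Grafana's 24-column grid."""
    panels = [
        panel(i + 1, name, name, x=(i % 2) * 12, y=(i // 2) * 8)
        for i, name in enumerate(measures)
    ]
    return {"title": title, "panels": panels}


if __name__ == "__main__":
    # Hypothetical dashboard title and measure names.
    doc = dashboard("Press line", ["temp_c", "pressure_kpa", "cycle_time_s"])
    print(json.dumps(doc, indent=2, sort_keys=True))
```

Because the output is deterministic JSON, regenerating a dashboard after a change produces a small, reviewable diff instead of a hand-edited export.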

Technical highlights

  • Event-driven ingestion. The pipelines react to equipment output as it arrives instead of polling on a timer, so the data stays fresh and the services sit idle when nothing is happening.
  • Dashboards as code. 30+ dashboards come out of a script, so they stay consistent and one change propagates everywhere at once. No click-by-click drift, no panel that quietly disagrees with the one next to it. Among them, I-MR (individuals and moving-range) charts bring process-control thinking to live equipment data.
  • Built to stay up. Running on K3s with multi-node failover means the telemetry path survives a node loss, and the 24-hour soak test proved it before real data depended on it.
  • Time-series native. Querying Timestream directly keeps a 50,000-record-a-day feed fast enough to watch in real time, instead of timing out on a relational table.
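The I-MR charts mentioned in the highlights above rest on a small calculation, sketched here under the standard textbook assumptions (subgroup size 1, moving range of span 2). The constants 2.66 and 3.267 are the usual control-chart factors for that case. The function name `imr_limits` and the sample readings are hypothetical, and how the real dashboards compute or display their limits is not stated in this write-up.

```python
from statistics import mean


def imr_limits(values):
    """Compute control limits for an individuals / moving-range chart.

    Uses the standard factors for a moving range of span 2:
    2.66 (= 3 / d2, d2 = 1.128) for the individuals chart and
    3.267 (= D4) for the moving-range chart.
    """
    if len(values) < 2:
        raise ValueError("need at least two readings to form a moving range")
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    mr_bar = mean(moving_ranges)
    return {
        "center": center,
        "i_ucl": center + 2.66 * mr_bar,
        "i_lcl": center - 2.66 * mr_bar,
        "mr_bar": mr_bar,
        "mr_ucl": 3.267 * mr_bar,  # the MR chart's lower limit is 0
    }


if __name__ == "__main__":
    readings = [10, 12, 11, 13, 12]  # hypothetical process readings
    limits = imr_limits(readings)
    out_of_control = [v for v in readings
                      if not limits["i_lcl"] <= v <= limits["i_ucl"]]
    print(limits, out_of_control)
```

A reading outside the individuals limits, or a moving range above its upper limit, is the kind of drift signal a run chart is meant to surface early.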

Tradeoffs

Generating dashboards from code and running the pipelines on Kubernetes take more upfront machinery than a handful of charts would need. You can’t just drag a panel and save, and a cluster is a thing you have to operate. That investment pays back the moment you have 30+ dashboards that must agree, 20+ pipelines that can’t silently die, and an operation that treats the data as ground truth.

Outcome

The platform runs as live monitoring across the operation: equipment telemetry flows through 20+ pipelines into 30+ dashboards that operators and engineers trust as a shared, real-time view of process health. It runs on infrastructure built to survive a bad night, and every dashboard lives in version control where it can be reviewed and rebuilt instead of redrawn by hand.