
Building Robust Data Pipeline with Apache Spark
Resilient Spark pipelines, data platform reliability, and operational design.
Speakers: Zak Hassan
Speaking Archive
A curated wall of past presentations covering Apache Spark reliability, Prometheus monitoring, OpenShift-native ML workflows, GPU observability, MLflow, and anomaly detection.
Presentation Signal Wall
Each talk is a different telemetry stream: distributed data systems, Prometheus, ML platform operations, GPU monitoring, and anomaly detection.

Metrics architecture for Spark workloads and high-cardinality operational signals.
Speakers: Zak Hassan and Diane Feddema

Experiment tracking, Kubernetes operators, and reproducible ML workflows.
Speakers: Zak Hassan and Mani Parkhe

Anomaly detection patterns for noisy logs and operational triage.
Speakers: Zak Hassan

GPU telemetry, TensorFlow workloads, and Prometheus-based visibility.
Speakers: Zak Hassan and Diane Feddema

Using unsupervised language techniques to surface suspicious log patterns.
Speakers: Zak Hassan and Michael Clifford

MLflow operations on Kubernetes-native infrastructure.
Speakers: Zak Hassan and Hema Veeradhi
Compare Notes
These talks are older public artifacts, but the operating themes still map directly to how I think about platform reliability, telemetry, and infrastructure leverage today.