About

Staff-level SRE focused on reliability, identity, Kubernetes, and infrastructure leverage.

I am Zak Hassan, a staff-level site reliability and production engineer with 10+ years of experience building and operating mission-critical backend infrastructure at internet scale. That work spans identity systems processing hundreds of millions of authentications, enterprise platforms serving 400M+ global users, social platforms serving 200M+ users, Kubernetes fleets, real-time data pipelines, and GPU-backed ML infrastructure. I am strongest where product scale, operational risk, developer velocity, and cloud cost collide.

Public experience

Previous organizations include Red Hat, Hootsuite, SAP, and Workday.

I have worked on open-source enterprise platforms, social media infrastructure serving 200M+ users, enterprise SaaS platforms serving 400M+ global users, identity and backend systems processing hundreds of millions of authentications, and Kubernetes developer platforms used across large engineering teams.

I own production reliability for critical backend services, identity systems, real-time data platforms, and multi-cloud fleets.

I build Kubernetes platforms, reusable Terraform modules, service mesh foundations, secret rotation, and zero-trust infrastructure patterns.

I lead high-risk migrations, progressive delivery systems, SLO programs, incident response, capacity planning, and cost optimization.

I build AI-powered operations workflows for log analysis, cloud cost telemetry, remediation playbooks, and production debugging.