Blog
Engineering notes and opinions on AI-assisted SRE, incident response and the causality chain behind real outages.
The causality chain: from commit to incident
AI Operations
Most production incidents follow a recent change. The fastest path to root cause is correlating events across the delivery chain (commit, build, deploy, runtime), a view that single-layer tools cannot provide.
Summaries are not root cause
AI Operations
Most "AI for DevOps" tools summarize what already happened. Diagnosis means naming the change that caused it, with evidence — and that is a higher bar.