Production reliability
I make production safer to change. Incident response, rollout design, observability that actually tells you something at 3 AM, and alerting that doesn't go off for nothing. Systems should be explainable under pressure, not just work when everything is fine.