Location
Regina, Division No. 6, Canada
Posted
June 06, 2026
Job Description
What You’ll Do
You’ll operate at the intersection of software engineering and systems engineering, building resilient systems that scale, self-heal, and empower developers to ship safely.
Reliability Engineering
- Define and manage SLIs, SLOs, and error budgets
- Reduce MTTD, MTTA, and MTTR through structured incident response
- Conduct blameless postmortems and drive preventative improvements
- Champion reliability in architectural reviews and production readiness
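For candidates less familiar with the terms above, here is a minimal sketch of how an error budget follows from an SLO. The function names, the 99.9% target, and the 30-day window are illustrative assumptions, not a description of our internal tooling.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return (1.0 - slo_target) * window_days * 24 * 60


def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of a request-based error budget still unspent (negative if overspent)."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    # The SLI here is the success ratio; the budget is the number of
    # failures the SLO tolerates over the same set of requests.
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    print(round(error_budget_minutes(0.999), 1))             # minutes of downtime allowed
    print(round(budget_remaining(0.999, 1_000_000, 250), 2)) # share of budget left
```

Under these assumptions, a 99.9% target over 30 days leaves roughly 43 minutes of allowed downtime, which is the kind of number that shapes release and incident decisions in this role.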
Observability & Monitoring
- Design actionable, symptom-based alerts (not noise)
- Build dashboards and tracing systems using tools such as CloudWatch, Prometheus, Grafana, New Relic, X-Ray, and ADOT (AWS Distro for OpenTelemetry)
- Implement synthetic monitoring to simulate real user journeys (URLs, clickpaths, APIs)
- Ensure full observability coverage across critical paths
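As an illustration of the synthetic monitoring mentioned above, the sketch below walks an ordered list of URLs the way a scripted user journey might, and stops at the first failing step. The names `run_journey`, `http_status`, and `StepResult`, along with the 2-second latency threshold, are hypothetical; a real setup would more likely use a managed canary service or a browser automation tool.

```python
import time
import urllib.request
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class StepResult:
    url: str
    ok: bool
    latency_s: float


def http_status(url: str, timeout: float = 5.0) -> int:
    """Fetch a URL and return its HTTP status code (raises on 4xx/5xx or network errors)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def run_journey(
    urls: Sequence[str],
    fetch: Callable[[str], int] = http_status,
    max_latency_s: float = 2.0,
) -> List[StepResult]:
    """Probe each step in order; a step passes on a 2xx status within the latency limit."""
    results: List[StepResult] = []
    for url in urls:
        start = time.monotonic()
        try:
            status = fetch(url)
        except Exception:
            status = 0  # treat any fetch error as a failed step
        latency = time.monotonic() - start
        ok = 200 <= status < 300 and latency <= max_latency_s
        results.append(StepResult(url, ok, latency))
        if not ok:
            break  # later steps depend on earlier ones, so stop here
    return results
```

Passing `fetch` as a parameter keeps the probe testable without network access, and the same loop can be pointed at API endpoints or page URLs.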
Cloud & Infrastructure
- Operate and optimize AWS environment...