Site Reliability Engineer

E-IT

Full-time Other-General

Location

Ottawa, ON, Canada

Posted

May 22, 2026

Job Description

Key Responsibilities:  
Incident Management and Reliability: Lead the incident management process, ensuring high availability and performance of the applications. Develop and implement SRE practices to improve system reliability and resilience.
Monitoring and Observability: Utilize Dynatrace, Splunk, and Grafana to monitor system health, detect anomalies, and provide actionable insights for performance optimization.
Root Cause Analysis: Conduct thorough root cause analysis of incidents and outages, developing long-term solutions to prevent recurrence.
DevOps Practices: Collaborate with development and operations teams to streamline CI/CD pipelines, automate workflows, and implement infrastructure as code (IaC) for efficient service deployment and management.
Networking Expertise: Provide expertise in networking technologies (Cisco, Arista, AVI, etc.), ensuring...

Job Details

Job Type

Full-time

Category

Other-General

Date Posted

May 22, 2026

Application Deadline

July 01, 2026