Software Engineer, Kernel Reliability

Cerebras

Full-time Other-General
Location
Remote, Canada
Posted
June 01, 2026

Job Description

About The Role

We're looking for a deeply technical, hands-on software engineer to join our on-field Kernel Reliability team. You'll help tackle a critical challenge: improving the reliability of our advanced compute clusters and the underlying inference, training, and internal production services. In this role, you'll work close to the code and design solutions that will scale with our rapidly growing system production and software service offerings. If you have strong fundamentals in systems, debugging, and failure analysis, and you enjoy building tools and solving hard reliability problems, we want to hear from you. New college graduates are welcome.

Responsibilities

  • Contribute to the technical roadmap and execution for kernel-centric reliability of our internal and customer-facing systems.
  • Partner with System and Cluster Operations teams to reduce system and service downtime after failure through tooling, analysis, and hands-on debugging ...