DevOps Engineer - AI Model Evaluator

Obsidian

Full-time | Software Development, Software Architecture & Engineering

Location

Helsinki, Uusimaa, Finland

Posted

July 01, 2026

Job Description

About the Role

Mercor is partnering with a leading AI research lab to support a Frontier Code Agents project.
Contributors help evaluate and improve frontier AI coding models through structured technical assessments.
The work focuses on realistic infrastructure engineering workflows and model evaluation.
Spots are limited and are filling quickly on a first-come, first-served basis.

What You'll Do

Use frontier AI coding agents to complete and evaluate complex infrastructure engineering tasks.
Review model-generated implementations involving cloud platforms, Kubernetes, CI/CD systems, observability, and infrastructure automation.
Identify bugs, edge cases, reliability issues, and failure modes.
Compare outputs from multiple frontier models and assess their strengths and weaknesses.
Apply professional engineering judgment to realistic infrastructure engineering scenarios.
<...
                    

Job Details

Job Type

Full-time

Category

Software Development, Software Architecture & Engineering

Date Posted

July 01, 2026

Application Deadline

August 10, 2026