Location
Multan, Multan, Pakistan
Posted
June 15, 2026
Job Description
We're Hiring: Terminal Bench Expert (Contract | Remote)
Contribute to cutting-edge AI evaluation systems by designing real-world benchmark tasks that challenge advanced AI agents. If you're passionate about debugging, system analysis, infrastructure, data pipelines, or AI evaluation, this opportunity is for you.
Position Details
Role: Terminal Bench Expert
Employment Type: Contractor Assignment
Duration: 5 Weeks
Location: Remote (India, Bangladesh, Brazil, Colombia, Egypt, Ghana, Indonesia, Kenya, Nigeria, Pakistan, Turkey, Vietnam)
Experience: 3–10 Years
Start Date: Immediate
Commitment: Full-time (40 hrs/week)
What You'll Do
- Design and develop realistic AI benchmark tasks
- Create debugging, investigation, and system failure scenarios
- Define evaluation criteria and validation logic
- Document solutions and technical workflows
- Collaborate with reviewers to improve task quality and difficulty
Ideal Candidate
- Strong softwa...