Senior Systems Engineer - L3 Operations (Data Analytics & AI) (Ref 26210a)

Jobline Resources Pte. Ltd.

Full-time Other-General
Location
Singapore
Posted
June 07, 2026

Job Description

Responsibilities

• Monitor and maintain production data pipelines to ensure 99.9% uptime and optimal performance

• Implement comprehensive logging, alerting, and monitoring systems using application monitoring tools

• Perform regular health checks on performance, job execution times, and resource utilization to identify and resolve bottlenecks proactively

• Manage incident response procedures for pipeline failures, including root cause analysis, resolution, and post-incident reviews

• Establish and maintain disaster recovery procedures and backup strategies for critical data assets within the Databricks environment

• Conduct regular performance tuning of Spark jobs and Databricks cluster configurations to optimize cost and execution efficiency

• Maintain comprehensive documentation for operational procedures, runbooks, and troubleshooting guides

• Coordinate scheduled maintenance windows and system upgrades with minimal business i...