Job Description
This is a hands-on building role: you turn raw, messy fabrication data into the clean, well-modeled, AI-ready datasets that our AI/ML and analytics workloads run on 🚀
🧑🏻‍💻 Responsibilities:
Build and operate ingestion, ELT/ETL, and orchestration pipelines that move data from our MongoDB Atlas operational store and other sources into our analytical and AI-serving layers
Implement layered (medallion-style) transformations with idempotent, backfillable, incrementally loaded jobs
Apply deduplication, normalization, and validation so downstream data is high-quality and trustworthy
Modernize legacy/homegrown data flows through incremental, strangler-fig migrations that keep production stable
Build embeddings and vector pipelines, and the feature/retrieval-ready datasets that RAG, semantic search, and agentic workloads depend on
Make productio...