Job Description
Location: Remote & Hybrid
Experience: 5+ Years
Responsibilities:
Design, develop, and maintain scalable ETL/ELT pipelines using PySpark and AWS Glue.
Build and optimize data models in Snowflake to support reporting, analytics, and machine learning workloads.
Automate data ingestion from various sources, including APIs, databases, and third-party platforms.
Ensure high data quality by implementing data validation and monitoring processes.
Collaborate with data analysts, data scientists, and other engineers to understand data requirements and deliver high-quality solutions.
Work with large datasets in structured and semi-structured formats (JSON, Parquet, Avro, etc.).
Participate in code reviews, performance tuning, and debugging of data workflows.
Implement security and compliance measures across data pipelines and storage.