Join Our Team
About the Role
We are looking for a Data Engineer with strong experience building scalable data pipelines and exposure to machine learning workflows.
This role focuses on designing and maintaining robust data infrastructure while supporting data-driven and AI-powered applications. You will work closely with data scientists, engineers, and product teams to ensure data is reliable, accessible, and ready for advanced use cases.
The ideal candidate combines strong engineering fundamentals with the ability to support ML pipelines and data workflows in production environments.
Key Responsibilities:
- Design, build, and maintain scalable data pipelines using Python and Airflow
- Develop and optimize ETL/ELT processes for structured and unstructured data
- Collaborate with data science teams to support machine learning workflows
- Ensure data quality, reliability, and performance across systems
- Work with large datasets and optimize queries and transformations
- Integrate data from multiple sources and external systems
- Monitor and improve pipeline performance and reliability
- Support deployment and maintenance of data-driven and ML-enabled applications
Must-Have:
- 4+ years of experience in Data Engineering or similar roles
- Strong proficiency in Python for data processing and pipeline development
- Hands-on experience with Apache Airflow (or similar orchestration tools)
- Experience building and maintaining ETL/ELT pipelines in production
- Strong knowledge of SQL and relational databases
- Experience working with large-scale datasets
- Exposure to machine learning workflows or data pipelines supporting ML models
- Experience working with cloud environments (AWS, GCP, or Azure)
- Strong problem-solving skills and ability to work independently
Nice to Have:
- Experience with ML frameworks (TensorFlow, PyTorch, scikit-learn)
- Experience with data warehouses (Snowflake, BigQuery, Redshift)
- Experience with streaming technologies (Kafka, Kinesis)
- Familiarity with feature engineering and model data preparation
- Experience with CI/CD pipelines for data workflows
