Job Description:-
We are seeking a highly skilled and experienced Data Engineering Lead with strong expertise in AWS data services and retail data ecosystems. In this role, you will lead the design, development, and optimization of scalable data pipelines responsible for ingesting and transforming data from MMS (Merchandise Management Systems) and POS (Point of Sale) systems into a centralized Operational Data Store (ODS) to support downstream applications and analytics use cases.
You will play a critical role in building a robust, high-performance data platform using AWS-native services, ensuring data quality, reliability, and real-time or near real-time availability for business operations.
Responsibilities:-
Design and implement scalable ETL/ELT pipelines to ingest data from MMS, POS, or third-party systems into AWS-based data platforms.
Build and maintain a centralized Operational Data Store (ODS) using Couchbase or similar NoSQL technologies, optimized for low-latency application access.
Develop and optimize data processing workflows using Apache Spark / PySpark and AWS services such as AWS Glue and Amazon EMR.
Create denormalized, API-ready data models aligned with downstream microservices and application consumption patterns.
Implement idempotent processing, CDC merge (upsert) strategies, and data reconciliation mechanisms to ensure consistency across batch and streaming pipelines.
Leverage Amazon S3, Amazon Redshift, or Amazon Aurora for efficient storage and querying of structured and semi-structured data.
Implement data ingestion patterns (batch and streaming) using tools like Amazon Kinesis or AWS Lambda where applicable.
Apply performance tuning and optimization techniques to improve pipeline efficiency, scalability, and cost-effectiveness.
Define and enforce data governance, data quality, and metadata management standards across the data platform.
Collaborate with DevOps teams to design and maintain CI/CD pipelines using tools like AWS CodePipeline, GitLab, or similar.
Conduct peer reviews and provide technical leadership and mentorship to the data engineering team.
Collaborate with microservices and application teams to ensure seamless integration with the ODS via APIs and event streams.
Requirements:-
5+ years of experience in data engineering, including at least 2 years in a lead role.
Strong hands-on experience with AWS data services such as AWS Glue, Amazon S3, Amazon Redshift, and Amazon EMR.
Proficiency in PySpark / Spark for large-scale data processing and optimization.
Experience designing and implementing ODS layers using NoSQL databases, preferably Couchbase or a similar store (e.g. DynamoDB, MongoDB).
Strong expertise in ETL/ELT design patterns, data ingestion, and transformation pipelines.
Experience working with MMS, POS, or retail transaction data is highly preferred.
Hands-on experience with CI/CD pipelines (GitLab, AWS CodePipeline, or similar).
Good understanding of data governance, data quality, and metadata frameworks.
Experience building data platforms that support microservices architectures.
Strong problem-solving skills and ability to optimize complex data workflows.
Excellent communication and stakeholder management skills.
Ability to work in a fast-paced, agile environment and manage multiple priorities.
