Ritika Lahariya
@ritikalahariya
Data engineer turning scalable ETL into reliable, cloud-native pipelines for actionable insights.
What I'm looking for
I’m a highly motivated Data Engineer with 5 years of experience building scalable data pipelines and optimizing ETL workflows. I’m skilled in SQL, Python, Spark, and cloud platforms like AWS and GCP, and I enjoy turning raw data into actionable insights that help teams make smarter decisions.
At SSr Adv Globant (Nov 2024 - present), I built a scalable data pipeline that processes usage logs stored in S3, with workflows orchestrated by Airflow DAGs written in Python. I used Apache Spark on AWS EKS for large-scale processing and aggregation, storing results as partitioned Parquet files in S3 and as date-partitioned tables in Snowflake and Iceberg. I also created a flexible ingestion layer using MongoDB, AWS Step Functions, and Lambda, and implemented object-oriented Python scripts that compare user profiles against business rules and automatically trigger personalized product suggestions.
Earlier, at L1 Publicis Sapient (Feb 2024 - Nov 2024), I collaborated with cross-functional teams (data scientists, software engineers, analysts, and business stakeholders) to align data solutions with business needs. I designed and implemented a supply chain management data ingestion pipeline using GitLab CI/CD, Apache Airflow, Python, and Google BigQuery, and I troubleshot and debugged data-related issues, implementing fixes to prevent recurrence.
Before that, at Infosys Ltd. (2020 - Feb 2024) as a Specialist Programmer, I designed and built ETL pipelines using Scala Spark for transformations and scheduled them with Airflow DAGs written in Python. I ingested data from multiple sources, transformed it before loading into data warehouses like BigQuery, and worked with GCP services such as Google Cloud Storage and Google Dataproc, alongside Airflow, following Agile engineering practices to deliver reliably.
Experience
Work history, roles, and key accomplishments
Built a scalable pipeline to process usage logs from S3 using Airflow Python DAGs and Apache Spark on AWS EKS. Designed modular, cloud-native ingestion using Step Functions/Lambda and generated partitioned Parquet outputs with downstream analytics in Snowflake and Iceberg.
Collaborated with cross-functional teams to define data requirements and align ingestion solutions with business needs. Designed and implemented a supply chain data ingestion pipeline using GitLab CI/CD, Apache Airflow, Python, and Google BigQuery, and resolved data issues to prevent recurrence.
Designed and built ETL pipelines using Scala and Apache Spark for transformation, scheduled with Python-based Apache Airflow DAGs. Ingested data from multiple sources and loaded transformed datasets into Google BigQuery and other storage systems using GCP services like Cloud Storage and Dataproc.