Sagar R
@sagarr
Senior Data Engineer building scalable pipelines and ML systems on cloud.
What I'm looking for
I’m a Senior Data Engineer with 3.5+ years building scalable data pipelines and ML systems across top Indian fintech environments. I’ve driven over $300K in annual infrastructure savings through cost optimization, performance tuning, and operational discipline.
I specialize in PySpark/Apache Spark and real-time streaming architectures, with hands-on work using Apache Kafka, Debezium CDC, and ClickHouse to cut time-series query latency from 10 minutes to 500ms. I’ve also built ML solutions like a CNN+LSTM churn prediction model with SHAP explainability, improving user retention by 17% for a wealth management platform.
Cloud-native execution is a constant theme in my work: Snowflake, AWS services, and lakehouse patterns with Delta Lake and Parquet for governance, reliability, and faster analytics. I bring a strong engineering-to-outcomes mindset: cutting Snowflake spend by 49%, improving query response times by 30% through schema changes and materialized views, and designing migrations and monitoring that keep SLAs steady while teams scale.
Experience
Work history, roles, and key accomplishments
Senior Software Engineer — Data & ML
Dezerv
Mar 2025 - Present (1 year 3 months)
Owned end-to-end design and implementation of a CNN+LSTM churn prediction model with SHAP explainability, driving a 17% increase in user retention on the wealth management platform. Architected and deployed a real-time Kafka/Debezium/ClickHouse pipeline, cutting time-series query latency from 10 minutes to 500 ms and reducing monthly Snowflake spend by 49% ($5,500 to $2,800).
Software Engineer — Data Engineering
CoinDCX
Mar 2024 - Mar 2025 (1 year)
Led decommissioning of Confluent Kafka and migrated to a self-managed AWS MSK cluster, saving approximately $110K/year in licensing and operational costs. Re-architected the LakeTrade data model into microservices within one month, improving deployment speed, and integrated PagerDuty with automated recovery workflows to reduce MTTR by 40%.
Software Engineer — Data Engineering
Jumbotail
Nov 2022 - Mar 2024 (1 year 4 months)
Designed, implemented, and productionized a Spark Structured Streaming CDC system using PySpark and Debezium, processing 100M+ events/day with exactly-once guarantees, and built alerting and failure-recovery mechanisms. Reduced annual query infrastructure cost by 31% ($36K to $25K) with Trino autoscaling, and cut compute time by 98% via PySpark optimizations (partition pruning and predicate pushdown).
Co-Founder & Product Lead
Alephs360
Jul 2021 - Oct 2022 (1 year 3 months)
Developed and maintained Python/SQL ETL pipelines for financial data, ensuring data accuracy and timely delivery. Implemented data quality checks and monitoring with AWS CloudWatch, reducing data reconciliation errors by 25%, and optimized AWS S3 storage and query performance with partitioning/lifecycle policies to cut storage costs by 15%.
Education
Degrees, certifications, and relevant coursework
National Institute of Technology, Kurukshetra
Bachelor of Technology
2017 - 2021
Completed a B.Tech degree at National Institute of Technology, Kurukshetra from 2017 to 2021.
