Khateja Begum
@khatejabegum
Senior Data Engineer building cloud-native, real-time data pipelines that power reliable analytics at enterprise scale.
What I'm looking for
I’m a Senior Data Engineer with 9+ years building cloud-native, scalable data platforms across Healthcare, Telecom, Retail, and Banking. I design and operate high-throughput batch and real-time pipelines using Apache Spark, Kafka, Airflow, and Hadoop on AWS and Azure, with a strong focus on performance optimization, data reliability, and production-ready delivery.
In my current role, I built and maintained AWS and hybrid pipelines integrating data from 20+ hospital systems while maintaining 99.9% data accuracy. I’ve also migrated 10+ on-prem data platforms to the cloud (AWS, Azure, and Snowflake), improved pipeline performance by up to 75%, and supported enterprise data models across 100+ subject areas, along with governance and quality improvements that reduced recurring data issues by ~40%.
Experience
Work history, roles, and key accomplishments
Sr. Data Engineer
AssistRx
Sep 2021 - Present (4 years 8 months)
Built and maintained cloud-based AWS/hybrid data pipelines integrating data from 20+ hospital systems while maintaining 99.9% data accuracy. Designed real-time Kafka + Spark Streaming pipelines and migrated legacy ETL to AWS Glue/Spark, improving performance by ~40% and reducing manual data validation/transforms by ~70%.
Data Engineer
Lumen Technologies
Apr 2019 - Aug 2021 (2 years 4 months)
Designed and tuned enterprise data platforms on Teradata, Hadoop, and AWS, improving complex Amazon Redshift query performance by up to 100x. Built and maintained ETL pipelines (DataStage, Informatica, Python) and supported real-time analytics by integrating AWS Lambda ingestion with Hive-based reporting dashboards.
Sr. Data Modeler / Data Analyst
Workforce
Sep 2016 - Mar 2019 (2 years 6 months)
Designed enterprise EDW and data mart models using Inmon and Kimball methodologies, supporting HR analytics, payroll reporting, and regulatory compliance. Automated Oracle-to-Snowflake ingestion with Python, reducing ETL runtimes by ~25%, and managed SCD Type I/II/III plus SSRS reporting for workforce KPIs.
Data Analyst
Optum
May 2014 - Aug 2016 (2 years 3 months)
Built logical and physical healthcare analytics data models and designed star/snowflake schemas in ERwin to support BI, regulatory, and compliance reporting. Developed SSIS/Informatica ETL workflows, authored T-SQL data quality enforcement, and tuned Teradata/SQL Server queries to improve report refresh performance.
Education
Degrees, certifications, and relevant coursework
Northeastern University
Master of Science in Project Management
Deccan College of Engineering and Technology
Bachelor of Engineering in Computer Engineering
