Mehmood Ghojaria
@mehmoodghojaria
Senior data engineer specializing in scalable AWS streaming analytics and governed data platforms that power machine learning and BI.
What I'm looking for
I’m a Senior Data Engineer with extensive experience building scalable AWS cloud data platforms that support streaming analytics, machine learning, and business intelligence. I design robust ETL/ELT pipelines using Python, SQL, Kafka, Spark, Airflow, AWS Glue, and Snowflake—focused on reliability in production.
Across enterprise environments, I implement real-time streaming architectures with Kafka, Kinesis, Lambda, and distributed processing frameworks. I also build and optimize curated datasets, API-enabled data services, and governed data warehouses that support analysts, stakeholders, and partners in decision-making.
I strengthen data trust through governance, validation frameworks, monitoring, and automated alerting systems. I pair that with CI/CD automation and Infrastructure-as-Code (Terraform and related tooling) to deliver secure, compliant, cost-optimized cloud data architectures that consistently drive measurable outcomes.
Experience
Work history, roles, and key accomplishments
Designed scalable streaming pipelines using Kafka, Spark, AWS services, and Python to support real-time enterprise financial risk evaluation. Built ETL/ELT workflows, curated datasets, API-enabled services, and Snowflake/Synapse solutions with data quality validation, monitoring, and automated alerting.
Developed scalable ETL/ELT pipelines with AWS Glue, Python, Spark, Lambda, and SQL for enterprise analytical processing. Built streaming ingestion with Kafka and Kinesis, implemented Airflow/Step Functions orchestration, and delivered Snowflake/Redshift warehousing with monitoring, validation, and CI/CD automation.
Built cloud-native ETL pipelines using Dataflow, Apache Beam, Python, and SQL for enterprise pharmaceutical analytics. Implemented streaming with Pub/Sub and Dataflow, delivered governed BigQuery/warehouse architectures, and automated monitoring, validation, and alerting for regulatory compliance.
Designed enterprise streaming pipelines using Kafka, Spark, Airflow, and Hadoop for real-time transaction analytics and monitoring. Developed ETL frameworks for structured and unstructured datasets, delivered Snowflake/Redshift/Hive warehouse solutions, and implemented monitoring and validation using Prometheus, Grafana, and CloudWatch.
Big Data Engineer
ACT Fibernet
Jan 2016 - Mar 2018 (2 years 2 months)
Developed Hadoop and Spark processing pipelines for large-scale distributed analytics and operational reporting. Implemented streaming ingestion with Kafka and Spark Streaming, built ETL workflows with Hive/Pig/PySpark/SQL, and automated orchestration with Airflow and Oozie including schema optimization and data quality frameworks.
Built Hadoop-based ETL processing frameworks using MapReduce, Spark, Hive, and SQL for enterprise analytical and operational reporting. Implemented ingestion with Sqoop/Flume, orchestrated pipelines with Oozie and scripts, and delivered optimized Hive schemas with data quality validation and monitoring.
Education
Degrees, certifications, and relevant coursework
Westcliff University
Master of Information Technology, Information Technology
Completed a Master’s in Information Technology at Westcliff University.