pavani dachepalli
@pavanidachepalli
Senior Data Engineer specializing in scalable Spark pipelines and cloud data platforms.
What I'm looking for
I am a Senior Data Engineer with extensive experience developing scalable data pipelines and analytics solutions using Apache Spark and cloud platforms like AWS and GCP.
I have architected and migrated legacy data systems to modern cloud environments, built robust ETL pipelines with PySpark on EMR, implemented real-time ingestion with Kafka and Kinesis, and automated orchestration using Airflow and Oozie.
I deliver high-performance data infrastructure, integrate BI solutions such as Tableau and Power BI, implement data quality and CI/CD practices, and aim to drive innovation and operational efficiency in data engineering roles.
Experience
Work history, roles, and key accomplishments
Senior Data Engineer
Wawa, Inc.
Dec 2024 - Present (8 months)
Developed scalable PySpark ETL pipelines on Amazon EMR to ingest and transform data from MySQL, PostgreSQL, MongoDB, SFTP and APIs; built real-time ingestion with Kafka and Kinesis and implemented Lambda-based event-driven processing to enable timely analytics.
Senior Data Engineer
HSBC
Apr 2020 - Aug 2023 (3 years 4 months)
Built and deployed scalable PySpark and Spark SQL ETL pipelines on cloud platforms and migrated a legacy on-premises data warehouse to Azure SQL Data Warehouse; designed streaming pipelines using Kafka, Spark Streaming and Azure Event Hubs for near-real-time analytics.
Designed and implemented ETL pipelines on GCP using Cloud Dataflow, Data Fusion and BigQuery to enable scalable ingestion and analytics; migrated on-premises ETL to GCP and built Spark Streaming pipelines from Kafka for real-time processing.
Education
Degrees, certifications, and relevant coursework
Rivier University
Master of Science, Computer Science
Completed Master of Science in Computer Science at Rivier University.
Tech stack
Software and tools used professionally
Matillion
Apache Spark
Druid
Talend
Google Cloud Platform
GitHub
Jenkins
NumPy
Pandas
PySpark
Sqoop
MySQL
PostgreSQL
MongoDB
Microsoft SQL Server
Couchbase
Cassandra
Hadoop
HBase
Gmail
Yarn
Databricks
Terraform
Java
PowerShell
scikit-learn
NLTK
Kafka
Zookeeper
Linux
Windows
Elasticsearch
Microsoft Excel
Airflow
Time Analytics
Amazon EMR
SQL
SciPy
Delta Lake
Great Expectations
