Open to opportunities

pavan User

@pavanuser2

Experienced Sr. Data Engineer specializing in big data technologies.

India

What I'm looking for

I am seeking a challenging role that fosters innovation and collaboration, with opportunities for professional growth and impactful projects.

With over 8 years of professional experience in IT, I have honed my skills as a Sr. Data Engineer, focusing on big data technologies such as Hadoop and Spark. My expertise lies in architecting scalable data pipelines and optimizing data processing capabilities, which has significantly enhanced operational efficiency in my current role at T-Mobile.

I have a proven track record of implementing data security measures, automating ETL processes, and developing advanced analytics models. My proficiency in various programming languages, including Python, Java, and Scala, allows me to create robust data applications and streamline workflows. I thrive in collaborative environments, working closely with cross-functional teams to deliver tailored data solutions that drive business success.

Experience

Work history, roles, and key accomplishments

Current

Sr. Data Engineer

T-Mobile

May 2023 - Present (3 years 3 months)

Architected scalable data pipelines using Hadoop, Spark, and PySpark, processing terabytes of telecom data daily. Conducted performance tuning of Hadoop and Spark jobs, developed advanced analytics models, managed complex ETL workflows, and automated data workflows using Apache Airflow and Google Cloud services.

Hadoop Spark PySpark Kafka SQL Python Java ETL Google Cloud Platform Airflow

Sr. Data Engineer

TD Bank

Dec 2021 - May 2023 (1 year 5 months)

Optimized Hadoop and Spark jobs for efficient data processing and developed robust ETL processes with Informatica. Used Google Cloud Platform (GCP) services for scalable data analytics, created automated data workflows, and built interactive dashboards using Tableau and Power BI.

Hadoop Spark Informatica ETL SQL Python Tableau Power BI Google Cloud Platform Airflow

Data Engineer

Legato Health Technologies

May 2019 - Aug 2021 (2 years 3 months)

Worked on the Hadoop ecosystem and big data technologies, developed ETL processes with PySpark, and managed NoSQL databases. Implemented data warehousing solutions, including data sharing in Snowflake, and collaborated with data scientists on optimized reporting.

Hadoop PySpark NoSQL ETL Azure Snowflake Power BI Tableau SQL Docker Kubernetes Airflow

Hadoop Developer

New Fold Digital

May 2017 - May 2019 (2 years)

Developed and maintained a Data Lake using Hadoop technologies, created ETL jobs for data extraction, transformation, and migration, and implemented data migration processes using AWS and Jenkins. Monitored production jobs, engaged in data integration programs, and collaborated with business teams to meet project requirements.

Hadoop ETL AWS Python Scala Spark Hive SQL Git Jenkins

Education

Degrees, certifications, and relevant coursework

T-Mobile

Sr. Data Engineer, Data Engineering

2023 -

Architected scalable data pipelines using Hadoop, Spark, and PySpark that process terabytes of telecom data daily. Developed advanced analytics and machine learning models using PySpark, enabling real-time data processing and predictive insights in telecom operations.

TD Bank

Sr. Data Engineer, Data Engineering

2021 - 2023

Conducted performance tuning and optimization of Hadoop and Spark jobs, ensuring efficient data processing and reducing job runtimes. Developed robust ETL processes with Informatica, extracting, transforming, and loading data from various sources into Snowflake and other data warehouses.

Legato Health Technologies

Data Engineer, Data Engineering

2019 - 2021

Worked with the latest developments in the Hadoop ecosystem and big data technologies, exploring tools such as Apache Spark, Apache Flink, and Apache Kafka to continuously enhance data processing capabilities. Developed and managed ETL processes with PySpark, ensuring efficient data extraction, transformation, and loading across various data sources and sinks.

New Fold Digital

Hadoop Developer, Hadoop Development

2017 - 2019

Developed and maintained a Data Lake containing regulatory data for federal reporting using big data technologies such as Hadoop Distributed File System (HDFS), Apache Impala, Apache Hive, and the Cloudera distribution. Developed ETL jobs to extract data from sources such as Oracle and Microsoft SQL Server, transform the extracted data using Hive Query Language (HQL), and load it in

Imported data from different sources into HDFS using Sqoop and applied transformations using Hive.