pavan User
@pavanuser2
Experienced Sr. Data Engineer specializing in big data technologies.
What I'm looking for
With over 8 years of professional experience in IT, I have honed my skills as a Sr. Data Engineer, focusing on big data technologies such as Hadoop and Spark. My expertise lies in architecting scalable data pipelines and optimizing data processing capabilities, which has significantly enhanced operational efficiency in my current role at T-Mobile.
I have a proven track record of implementing data security measures, automating ETL processes, and developing advanced analytics models. My proficiency in various programming languages, including Python, Java, and Scala, allows me to create robust data applications and streamline workflows. I thrive in collaborative environments, working closely with cross-functional teams to deliver tailored data solutions that drive business success.
Experience
Work history, roles, and key accomplishments
Sr. Data Engineer
T-Mobile
May 2023 - Present (2 years)
Architected scalable data pipelines using Hadoop, Spark, and PySpark, processing terabytes of telecom data daily. Conducted performance tuning of Hadoop and Spark jobs, developed advanced analytics models, and automated and managed complex ETL workflows using Apache Airflow and Google Cloud services.
Sr. Data Engineer
TD Bank
Dec 2021 - May 2023 (1 year 5 months)
Optimized Hadoop and Spark jobs for efficient data processing and developed robust ETL processes with Informatica. Leveraged Google Cloud (GCP) services for scalable data analytics, created automated data workflows, and built interactive dashboards using Tableau and Power BI.
Data Engineer
Legato Health Technologies
May 2019 - Aug 2021 (2 years 3 months)
Worked on the Hadoop ecosystem and big data technologies, developing ETL processes with PySpark and managing NoSQL databases. Implemented data warehousing solutions and data sharing in Snowflake, and collaborated with data scientists on optimized reporting.
Hadoop Developer
New Fold Digital
May 2017 - May 2019 (2 years)
Developed and maintained a Data Lake using Hadoop technologies, created ETL jobs for data migration, and monitored production jobs. Engaged in data integration programs and collaborated with business teams to meet project requirements.
Education
Degrees, certifications, and relevant coursework
T-Mobile
Sr. Data Engineer, Data Engineering
2023 -
Architected scalable data pipelines using Hadoop, Spark, and PySpark, efficiently processing terabytes of telecom data daily. Developed advanced analytics and machine learning models using PySpark, enabling real-time data processing and predictive insights in telecom operations.
TD Bank
Sr. Data Engineer, Data Engineering
2021 - 2023
Conducted performance tuning and optimization of Hadoop and Spark jobs, ensuring efficient data processing and reducing job runtimes. Developed robust ETL processes with Informatica, extracting, transforming, and loading data from various sources into Snowflake and other data warehouses.
Legato Health Technologies
Data Engineer, Data Engineering
2019 - 2021
Worked with the latest developments in the Hadoop ecosystem and big data technologies, exploring emerging tools such as Apache Spark, Apache Flink, and Apache Kafka to continuously enhance data processing capabilities. Developed and managed ETL processes with PySpark, ensuring efficient data extraction, transformation, and loading across various data sources and sinks.
New Fold Digital
Hadoop Developer, Hadoop Development
2017 - 2019
Developed and maintained a Data Lake containing regulatory data for federal reporting, using big data technologies such as Hadoop Distributed File System (HDFS), Apache Impala, Apache Hive, and the Cloudera distribution. Imported data from different sources into HDFS using Sqoop and applied transformations using Hive. Developed ETL jobs to extract data from sources such as Oracle and Microsoft SQL Server, transform the extracted data using Hive Query Language (HQL), and load it in ...
Tech stack
Software and tools used professionally
Azure HDInsight
Azure Synapse
Apache Spark
Apache Flink
Apache Hive
Talend
Google Cloud Platform
Google Cloud Storage
GitHub
GitLab
Bitbucket
Kubernetes
Jenkins
PySpark
DBeaver
dbt
DB
Sqoop
MySQL
PostgreSQL
MongoDB
Microsoft SQL Server
MariaDB
Couchbase
Cassandra
Hadoop
HBase
Gmail
Yarn
Databricks
Microsoft Teams
Redis
Terraform
Azure DevOps
Java
Apache Flume
Kafka
Apache NiFi
Cloud Firestore
Ubuntu
CentOS
Linux
Windows
Windows Server
Google Cloud Dataflow
Serverless
Google Cloud Functions
Azure Functions
Azure SQL Database
Google Cloud SQL
Airflow
Apache Beam
Google BigQuery
SQL
Azure Cosmos DB
Azure Blob Storage
Apache Impala
