Open to opportunities

Sagar Saru

@sagarsaru

Experienced Data Engineer with a passion for scalable data solutions.

United States

What I'm looking for

I am looking for a role that challenges my skills and offers opportunities for growth in data engineering and cloud technologies.

I am a Data Engineer with over 7 years of experience designing and implementing scalable data solutions and architectures. I optimize data pipelines and use cloud technologies to process data at large scale. I have a strong background in building and maintaining ETL/ELT pipelines with tools such as Azure Data Factory, AWS Glue, and Apache Airflow, which handle data ingestion and transformation across enterprise systems.

Throughout my career, I have designed real-time data streaming and event-driven architectures using technologies like Apache Kafka and AWS Kinesis. My hands-on experience with cloud platforms, including Azure, AWS, and GCP, has equipped me to implement high-performance data warehousing solutions. I collaborate with cross-functional teams to provide clean, structured, and reliable data for analytics and decision-making.

Experience

Work history, roles, and key accomplishments

Current

Senior Data Engineer

Aflac

Feb 2022 - Present (4 years 5 months)

Designed and built scalable ETL pipelines using Azure Data Factory, integrating structured and unstructured data from various sources. Developed and optimized Azure Synapse Analytics-based data warehouse solutions, implementing materialized views, indexing strategies, and partitioning to enhance query performance.

Azure Data Factory, Azure Synapse, Databricks, Azure Blob Storage, Azure DevOps, Delta Lake, Terraform, Python, SQL

Data Engineer

Everlane

Jul 2019 - Dec 2021 (2 years 5 months)

Built and optimized AWS Glue ETL pipelines that efficiently extracted, transformed, and loaded data into Amazon Redshift. Designed and managed Amazon Redshift clusters, implementing distribution keys, sort keys, and workload management (WLM) strategies to optimize query execution.

AWS Glue, Amazon Redshift, Amazon S3, AWS Lambda, Apache Kafka, AWS Kinesis, Terraform, Apache Airflow, PostgreSQL, Athena, Tableau

Big Data Developer

Omnicare

Jan 2018 - Jun 2019 (1 year 5 months)

Designed and implemented Hadoop-based data pipelines using Apache Spark and Hive, processing petabyte-scale structured and unstructured batch data. Created and maintained Amazon S3-based data lakes, defining partitioning strategies and storage classes to reduce costs while maintaining high availability and scalability.

Apache Spark, Hadoop, Hive, Amazon S3, Athena, Apache NiFi, AWS Kinesis, DynamoDB, MongoDB, Kubernetes, Elasticsearch, Kibana, Python, Tableau

Education

Degrees, certifications, and relevant coursework

Texas Tech University

Bachelor of Science, Computer Science and Mathematics

Studied core principles of computer science, including programming, data structures, and algorithms. Also covered fundamental concepts in mathematics, such as calculus, linear algebra, and discrete math.