Anthony S
@anthonys
Senior Data Engineer with expertise in scalable data solutions.
What I'm looking for
As a Senior Data Engineer with over 8 years of experience, I specialize in building scalable, reliable, and performant data solutions across various domains including retail, social media, and healthcare. My expertise lies in designing both real-time and batch data pipelines using modern technologies such as Apache Spark, Databricks, Snowflake, AWS, and GCP. I have a proven track record of optimizing data infrastructure, significantly reducing costs and latency while enhancing data quality and integrity.
At Walmart Global Tech, I architected a scalable data pipeline that cut reporting latency by 40% for supply chain analytics, and I developed automated data quality validation frameworks that reduced manual audits by over 60%. At Twitter (X), I engineered high-performance data pipelines that cut data ingestion latency by 35%, and I migrated legacy systems to cloud-native infrastructure, reducing costs by 30%. I am passionate about applying emerging AI and ML technologies to data-driven solutions.
Experience
Work history, roles, and key accomplishments
Senior Data Engineer
Walmart Global Tech
Mar 2023 - Present (2 years 4 months)
Architected a scalable data pipeline using Apache Spark (PySpark) on Databricks and AWS Glue, enabling real-time processing of over 5 TB/day of transactional data and reducing reporting latency by 40% for supply chain analytics. Designed and implemented a Delta Lake-based data lakehouse architecture on AWS S3, leveraging Apache Hudi and Databricks SQL, significantly enhancing data freshness and
Senior Data Engineer
Twitter (X)
May 2020 - Feb 2023 (2 years 9 months)
Engineered high-performance data pipelines leveraging Apache Kafka, Flink, and Apache Spark, enabling near real-time analytics and reducing data ingestion latency by 35% across global tweet streams. Migrated legacy Hadoop clusters to cloud-native infrastructure on Google Cloud Platform (GCP), utilizing BigQuery, Cloud Composer (Airflow), and Dataflow, resulting in enhanced scalability and a 30% reduction in costs.
Data Engineer
CVS Health
Oct 2017 - Apr 2020 (2 years 6 months)
Built ETL pipelines using Apache Spark, Hive, and Informatica to efficiently process large-scale healthcare datasets, enabling analytics on pharmacy claims and customer behavior. Implemented data warehousing solutions on-premises using Teradata and Oracle, optimizing data models that improved query performance and reduced analytical report generation times by 25%.
Education
Degrees, certifications, and relevant coursework
The University of Kansas
Master's Degree, Computer Science
Completed a Master's Degree in Computer Science. The program provided advanced knowledge and skills in various areas of computer science.
