Open to opportunities

Philip John

@philipjohn1

Senior data platform engineer building scalable cloud-native pipelines and reliable high-throughput data products.

United States

What I'm looking for

I’m looking to lead data platform and distributed pipeline work that improves reliability, latency, and cost efficiency using Python/Spark and cloud-native infrastructure, backed by strong observability and automation.

I’m a Senior Data Platform Engineer with 8+ years of experience designing scalable cloud-native data platforms, distributed data pipelines, and infrastructure automation across financial services, healthcare, and telecommunications. I specialize in Python, SQL, Spark-based processing, Kubernetes orchestration, and cloud data architecture, with a track record of delivering high-throughput systems processing 20+ TB/day while improving reliability, performance, and cost efficiency.

At Capital One, I designed an enterprise data platform supporting 200+ data products, built pipelines with Python/PySpark, Kafka, and Delta Lake, and reduced end-to-end latency from 8 hours to under 40 minutes. I also implemented reusable ingestion frameworks, CI/CD using Terraform and GitHub Actions, and observability with Prometheus and Grafana to reduce MTTR by 60%, while optimizing AWS infrastructure to save $1.2M annually. Previously at UnitedHealth Group and AT&T, I built HIPAA-compliant pipelines, migrated ETL workflows to Snowflake/Airflow/dbt, improved data accuracy to 99.8%, orchestrated 600+ workflows, reduced cloud warehouse costs by 38%, and modernized Hadoop workloads to Spark-based cloud architecture.

Experience

Work history, roles, and key accomplishments

Current

Senior Data Engineer

Capital One

Oct 2023 - Present (2 years 8 months)

Designed an enterprise data platform supporting 200+ data products across analytics and risk domains. Built Python/PySpark/Kafka/Delta Lake pipelines processing 25+ TB/day, reducing end-to-end latency from 8 hours to under 40 minutes, and cut MTTR by 60% via Prometheus/Grafana.

Data Engineer

AT&T

Oct 2017 - Feb 2020 (2 years 4 months)

Built ingestion pipelines processing 5B+ telecom events per month and developed Python ETL services for transformation and enrichment. Migrated Hadoop workloads to Spark-based cloud architecture and improved query performance by 60% using indexing and partitioning, with monitoring dashboards for pipeline health.

Education

Degrees, certifications, and relevant coursework

University of Texas at Dallas

Bachelor of Science, Computer Science

2013 - 2017

Earned a Bachelor of Science in Computer Science at the University of Texas at Dallas from 2013 to 2017.

Amazon Web Services (AWS)

AWS Certified Solutions Architect – Associate, Cloud Architecture

AWS Certified Solutions Architect – Associate covering AWS architecture and related best practices.

Databricks

Databricks Data Engineer Associate, Data Engineering

Databricks Data Engineer Associate certification focused on building and operationalizing data engineering pipelines on Databricks.

Linux Foundation

Certified Kubernetes Administrator (CKA), Kubernetes Administration

Certified Kubernetes Administrator (CKA) certification validating Kubernetes administration skills.
