Open to opportunities

Terry Fu

@terryfu

Senior platform and infrastructure engineer building reliable multi-cloud distributed systems and developer platforms to boost deployment speed and efficiency.

Zimbabwe
What I'm looking for

I’m looking to lead infrastructure and platform work for data-intensive systems, partnering closely with engineering and SRE teams to improve reliability, deployment velocity, observability, and cost efficiency.

I specialize in large-scale distributed systems, cloud infrastructure, and internal developer platforms, with a consistent focus on reliability, deployment velocity, and infrastructure cost efficiency for data-intensive workloads. At Databricks, I built a multi-cloud workspace provisioning platform that cut setup time from ~3 hours to ~25 minutes and enabled standardized deployments across 15+ regions.

Beyond provisioning, I engineered Kubernetes-based control plane services for Spark cluster lifecycle orchestration (improving startup success from 96% to 99.5%), delivered multi-region high availability with faster incident recovery (~45 minutes to ~8 minutes), and drove compute cost optimization (~22% reduction) while maintaining performance SLAs. I’ve also built observability for provisioning and platform services with Prometheus, Grafana, OpenTelemetry, and centralized logging to improve incident detection (~50%) and reduce MTTR (~40 minutes to ~18 minutes).

Experience

Work history, roles, and key accomplishments

Databricks
Current

Senior Software Engineer

Feb 2024 - Present (2 years 2 months)

Built a multi-cloud Databricks workspace provisioning platform with Terraform, Python, AWS/Azure APIs, and CI/CD, cutting environment setup from ~3 hours to ~25 minutes across 15+ regions. Engineered Kubernetes-based Spark cluster lifecycle orchestration and multi-region HA, improving startup success from 96% to 99.5% and reducing recovery time from ~45 minutes to ~8 minutes during incidents.

Airtable

Software Engineer (Infra)

Mar 2022 - Feb 2024 (1 year 11 months)

Built and operated a multi-cluster Kubernetes platform on AWS EKS supporting 70+ clusters across 3 regions, enabling ~4 production releases per week per service. Designed a multi-cluster reconciliation control plane and led an EC2 to containerized EKS migration, reducing deployment lead time from ~70 minutes to <10 minutes.

LinkedIn

Staff Software Engineer

Nov 2017 - Mar 2022 (4 years 4 months)

Built and evolved an internal cloud control-plane (Nuage) for self-service provisioning and lifecycle management of LinkedIn Data Infrastructure resources. Delivered platformization primitives and governance (quotas, approvals, ownership metadata) while improving observability and incident-response workflows for a distributed control plane.

Oracle

Senior Software Engineer

Aug 2012 - Oct 2017 (5 years 2 months)

Built control plane APIs for Oracle Cloud Infrastructure Compute Classic using Java and REST services to automate compute provisioning and lifecycle management. Implemented golden-image replication and recovery with Solaris Unified Archive and ZFS, and improved high-availability orchestration with Oracle Solaris Cluster technologies.

Education

Degrees, certifications, and relevant coursework

University of California, Berkeley

EECS

2010 - 2012

Earned a Master's degree in EECS at the University of California, Berkeley from 2010 to 2012.
