Open to opportunities

Terry Fu

@terryfu

Senior platform and infrastructure engineer building reliable multi-cloud distributed systems and developer platforms to boost deployment speed and efficiency.

Zimbabwe

What I'm looking for

I’m looking to lead infrastructure and platform work for data-intensive systems, partnering closely with engineering and SRE teams to improve reliability, deployment velocity, observability, and cost efficiency.

I specialize in large-scale distributed systems, cloud infrastructure, and internal developer platforms, with a consistent focus on reliability, deployment velocity, and infrastructure cost efficiency for data-intensive workloads. At Databricks, I built a multi-cloud workspace provisioning platform that cut setup time from ~3 hours to ~25 minutes and enabled standardized deployments across 15+ regions.

Beyond provisioning, I engineered Kubernetes-based control plane services for Spark cluster lifecycle orchestration (improving startup success from 96% to 99.5%), delivered multi-region high availability with faster incident recovery (~45 minutes to ~8 minutes), and drove compute cost optimization (~22% reduction) while maintaining performance SLAs. I’ve also built observability for provisioning and platform services with Prometheus, Grafana, OpenTelemetry, and centralized logging to improve incident detection (~50%) and reduce MTTR (~40 minutes to ~18 minutes).

Experience

Work history, roles, and key accomplishments

Databricks
Current

Senior Software Engineer

Feb 2024 - Present (2 years 4 months)

Built a multi-cloud Databricks workspace provisioning platform with Terraform, Python, AWS/Azure APIs, and CI/CD, cutting environment setup from ~3 hours to ~25 minutes across 15+ regions. Engineered Kubernetes-based Spark cluster lifecycle orchestration and multi-region HA, improving startup success from 96% to 99.5% and reducing recovery time from ~45 minutes to ~8 minutes during incidents.

Airtable

Software Engineer (Infra)

Mar 2022 - Feb 2024 (1 year 11 months)

Built and operated a multi-cluster Kubernetes platform on AWS EKS supporting 70+ clusters across 3 regions, enabling ~4 production releases per week per service. Designed a multi-cluster reconciliation control plane and led an EC2 to containerized EKS migration, reducing deployment lead time from ~70 minutes to <10 minutes.

LinkedIn

Staff Software Engineer

Nov 2017 - Mar 2022 (4 years 4 months)

Built and evolved an internal cloud control-plane (Nuage) for self-service provisioning and lifecycle management of LinkedIn Data Infrastructure resources. Delivered platformization primitives and governance (quotas, approvals, ownership metadata) while improving observability and incident-response workflows for a distributed control plane.

Oracle

Senior Software Engineer

Aug 2012 - Oct 2017 (5 years 2 months)

Built control plane APIs for Oracle Cloud Infrastructure Compute Classic using Java and REST services to automate compute provisioning and lifecycle management. Implemented golden-image replication and recovery with Solaris Unified Archive and ZFS, and improved high-availability orchestration with Oracle Solaris Cluster technologies.

Education

Degrees, certifications, and relevant coursework

University of California, Berkeley

EECS

2010 - 2012

Earned a Master's degree in EECS at the University of California, Berkeley from 2010 to 2012.
