
Open to opportunities

David Chou

@davidchou

Infrastructure-focused Staff Software Engineer specializing in global load balancing, failover orchestration, and production observability.

What I'm looking for

I’m looking for infrastructure engineering roles where I can own multi-year roadmaps for global reliability (disaster recovery, traffic orchestration, and distributed tracing), pairing automation and observability with deep mentorship to improve recovery time and production confidence.

I’m an infrastructure-focused Staff Software Engineer with deep expertise in global load balancing, failover orchestration, distributed tracing, and production-scale load testing at Meta. I own technical strategy and multi-year roadmapping to support Meta-scale availability and capacity planning.

At Meta, I lead cross-functional initiatives across disaster recovery, site reliability, capacity, and traffic infrastructure teams. I drive architectural decisions and technical problem-solving for organization-spanning challenges, including automated failover evolution and advanced traffic orchestration under extreme failure scenarios.

I act as a primary technical backstop and influencer, scoping high-ambiguity projects and partnering with EMs/PMs to define priorities that ladder into broader infrastructure objectives. I also elevate team performance through deep mentorship of senior engineers and by setting and enforcing engineering excellence standards.

I champion CI/CD, observability, and load testing strategies at org scale, helping reduce recovery times and increase confidence in production changes while maintaining Meta’s high-reliability bar. Earlier, I led development of Canopy, Meta’s distributed performance tracing system, spanning frontend visualization, backend instrumentation, and trace aggregation at massive scale.

Experience

Work history, roles, and key accomplishments


Current

Staff Software Engineer


Feb 2021 - Present (5 years 5 months)

Own technical strategy and multi-year roadmap for global user traffic management and disaster recovery systems, aligning mitigations with long-term resilience goals for large-scale availability and capacity planning. Lead cross-functional architectural decisions on automated failover evolution and traffic orchestration, along with CI/CD, observability, and load testing practices, to reduce recovery times.

Python, Java, Thrift, JavaScript, React, Hack (Meta PHP), Disaster Recovery, Global Traffic Orchestration, Observability, Load Testing, Capacity Planning, CI/CD Automation, Technical Strategy, Roadmapping, Mentorship


Senior Software Engineer

Aug 2016 - Feb 2021 (4 years 6 months)

Technical lead for a team in the disaster recovery organization that owned end-to-end global user traffic management systems for site reliability and capacity. Designed automated traffic balancing, failover orchestration, traffic shifting tools, and production load testing frameworks, while driving technical strategy and CI/CD pipeline evolution.

Kafka, Python, Java, Thrift, JavaScript, React, Hack (Meta PHP), C, Apache Spark, SQL, NoSQL, Hadoop, Airflow, Disaster Recovery, Load Testing, CI/CD Pipelines, Technical Strategy, Mentorship


Software Engineer

Oct 2013 - Aug 2016 (2 years 10 months)

Led development of Canopy, Meta's distributed end-to-end performance tracing system, including internal visualization tools, sampling policy configuration interfaces, and trace data aggregation backends. Built instrumentation APIs/libraries and contributed full-stack capabilities across frontend (React/JavaScript/CSS) and backend services (Hack/PHP) while maintaining scalable, low-latency trace ag

JavaScript, React, CSS, Hack (Meta PHP), C, Trace Aggregation, Instrumentation Libraries, APIs, Full Stack

Education

Degrees, certifications, and relevant coursework


University of California, Berkeley

Bachelor of Science, Computer Science

2009 - 2013


Tech stack

Software and tools used professionally

Apache Spark

Hadoop

Gmail

Node.js

JavaScript

Java

PHP

Kafka

Airflow

SQL
