David Chou
@davidchou
Infrastructure-focused Staff Software Engineer specializing in global load balancing, failover orchestration, and production observability.
What I'm looking for
I’m an infrastructure-focused Staff Software Engineer with deep expertise in global load balancing, failover orchestration, distributed tracing, and production-scale load testing at Meta. I own technical strategy and multi-year roadmapping to support Meta-scale availability and capacity planning.
At Meta, I lead cross-functional initiatives across disaster recovery, site reliability, capacity, and traffic infrastructure teams. I drive architectural decisions and technical problem-solving for organization-spanning challenges, including automated failover evolution and advanced traffic orchestration under extreme failure scenarios.
I act as a primary technical backstop and influencer, scoping high-ambiguity projects and partnering with EMs/PMs to define priorities that ladder into broader infrastructure objectives. I also elevate team performance through deep mentorship of senior engineers and by setting and enforcing engineering excellence standards.
I champion CI/CD, observability, and load testing strategies at org scale, helping reduce recovery times and increase confidence in production changes while maintaining Meta’s high-reliability bar. Earlier, I led development of Canopy, Meta’s distributed performance tracing system, spanning frontend visualization, backend instrumentation, and trace aggregation at massive scale.
Experience
Work history, roles, and key accomplishments
Own technical strategy and multi-year roadmap for global user traffic management and disaster recovery systems, aligning mitigations with long-term resilience goals for large-scale availability and capacity planning. Lead cross-functional architectural decisions on automated failover evolution, traffic orchestration, CI/CD, observability, and load testing to reduce recovery times.
Technical lead for a disaster recovery organization team owning end-to-end global user traffic management systems for site reliability and capacity. Designed automated traffic balancing, failover orchestration, traffic shifting tools, and production load testing frameworks, while driving technical strategy and CI/CD pipeline evolution.
Led development of Canopy, Meta's distributed end-to-end performance tracing system, including internal visualization tools, sampling policy configuration interfaces, and trace data aggregation backends. Built instrumentation APIs and libraries and contributed full-stack capabilities across frontend (React/JavaScript/CSS) and backend services (Hack/PHP) while maintaining scalable, low-latency trace aggregation.
Education
Degrees, certifications, and relevant coursework
University of California, Berkeley
Bachelor of Science, Computer Science
2009 - 2013
