Wen Yuan
@wenyuan1
Senior AI and backend engineer building reliable LLM/RAG systems with rigorous evaluation and scalable data pipelines.
What I'm looking for
I’m a Senior AI Engineer focused on turning large language models into reliable products. I combine LLM/RAG techniques with AI evaluation and safety-minded engineering to deliver consistent, production-ready systems.
In recent roles, I’ve built and operationalized evaluation methods for hallucination detection, reasoning quality, edge-case handling, and system-design consistency. At Meta, I designed multi-step agentic prompt workflows for debugging, refactoring, repository reasoning, and code synthesis, with the aim of improving correctness and engineering reliability at scale.
Previously, at NetEase, I built an enterprise RAG and semantic analytics platform using FastAPI, Airflow, OpenSearch, LlamaIndex, and SQL. I improved hybrid retrieval precision@5 by 35%+ versus keyword-only baselines, reduced manual refresh effort by 70%+, kept common top-k retrieval under 500ms, and raised the grounded-answer pass rate from ~70% to 85%+ through dedicated faithfulness and source-grounding evaluations.
Earlier in my career, I built AI-enabled full-stack systems and strong data/automation foundations, delivering asynchronous ingestion and observability that cut debugging time by 50%+ in low-quality scenarios. I’m at my best when combining backend engineering, retrieval/data design, and meticulous evaluation to ship systems users can trust.
Experience
Work history, roles, and key accomplishments
AI Software Evaluation Engineer
Meta
Jan 2026 - Present (5 months)
Evaluating production-scale AI-generated software solutions across Python and TypeScript repositories to improve engineering reliability, architectural consistency, and implementation quality. Designed multi-step agentic prompt workflows and built evaluation methodologies for hallucination detection, reasoning quality, edge cases, and system-design consistency.
AI Evaluation & LLM Quality
Independent
Jan 2023 - Jan 2026 (3 years)
Performed LLM response quality and coding-agent reliability evaluation using prompt engineering, RLHF-style feedback, and rubric-based scoring. Identified failure modes such as hallucinated APIs, weak context usage, unsafe refactors, and flawed reasoning chains, and supported model alignment through structured evaluation datasets.
Senior AI Engineer
NetEase
Jan 2025 - Dec 2025 (11 months)
Built an enterprise RAG and semantic analytics platform with FastAPI, Airflow, Docling, OpenSearch, and LlamaIndex, processing 50K+ document chunks and records. Improved hybrid retrieval precision@5 by 35%+ versus keyword-only baselines and reduced manual refresh effort by 70%+ while keeping common top-k retrieval under 500ms.
AI Full-Stack Engineer
ITLabPro
Jan 2024 - Oct 2024 (9 months)
Delivered an AI-enabled full-stack platform using TypeScript, Node.js, Express.js, MongoDB, JWT, Stripe, and PayPal, supporting 1K+ user/project records. Integrated LLM-powered semantic search and assistance features, reducing manual content lookup and repetitive support workflows by 40%+, and improved AI task completion reliability to 95%+.
Data & Automation Engineer
Creamistry
Feb 2019 - Jul 2022 (3 years 5 months)
Owned analytics automation for Creamistry China franchise operations by consolidating POS, delivery/order, inventory, and sales data into repeatable reporting workflows. Built Python/SQL ETL pipelines that reduced manual Excel-based reporting by 60–70% and created dashboards that reduced ad hoc reporting requests by 40%+.
Software Developer
Huawei
Jul 2015 - Dec 2018 (3 years 5 months)
Contributed to enterprise software systems by developing Java/Kotlin modules, backend integrations, REST API workflows, and authentication logic. Reduced repeated release-blocking issues by 25–30% and improved reliability through request validation, service orchestration, edge-case handling, and cross-service debugging.
Web Developer
ZBJ Network
Feb 2013 - May 2015 (2 years 3 months)
Built backend-focused web modules for a large online services marketplace, including listings, user profiles, order/workflow tracking, admin tooling, and SQL-backed reporting views. Improved marketplace reliability by resolving production issues, and optimized backend routes/queries, improving response times for common workflows by 20%+.
Education
Degrees, certifications, and relevant coursework
City University of Seattle
Master of Science, Computer Science
2023 - 2026
Master of Science in Computer Science (2023–2026). During the program, partnered with clients on AI engineering work involving LLM evaluation, retrieval-augmented generation (RAG), semantic search, and reliability testing.
Chongqing University
Bachelor of Business Administration, Marketing
2009 - 2012
Bachelor of Business Administration in Marketing (2009–2012) from Chongqing University.
Tech stack
Software and tools used professionally
GitHub
Kubernetes
GitHub Actions
NumPy
Pandas
MySQL
PostgreSQL
MongoDB
Gmail
Rollout
Node.js
Laravel
Redis
Vue.js
React-Vue
JavaScript
Java
PHP
Kotlin
FastAPI
Linux
PayPal
OpenSearch
Serverless
Airflow
SQL
Hugging Face
LangChain
LlamaIndex
Pinecone
OpenAI API
Cursor
GitHub Copilot
Scale AI
pgvector
Agentic
LangGraph
Docling
Stack AI
Claude Code
Task
Remote
Jan
Android
Portfolio
github.com/WenYuan77
