Open to opportunities

Zhuwei Xu

@zhuweixu

Machine learning and data science intern focused on multimodal RAG, low-latency inference, and LLM systems.

United States

What I'm looking for

I’m looking for an ML/Data Science role where I can build production-minded retrieval, ranking, and LLM pipelines, especially multimodal RAG with real-time inference, and grow with teams that value rigorous experimentation and impact.

I’m a data science student with hands-on experience building end-to-end ML systems, from retrieval pipelines to deployment-minded inference. In research work, I designed and deployed a multimodal RAG pipeline that unifies face, scene, and speech signals for cross-modal retrieval.

I’ve also built systems with strict performance goals, including a face recognition pipeline (RetinaFace + FaceNet) that achieved under 800 ms of inference latency per 1-minute video. For semantic search over video, I implemented indexing with Whisper (large-v3-turbo) and the Milvus vector database to support natural language queries with relevance-based ranking.

Alongside production-style research, I develop and evaluate LLM frameworks such as SMOLSolver, where I fine-tuned a Phi-2 generator with LoRA and trained RoBERTa verifier/ranker models to improve Pass@k. I also build multi-agent workflows using MetaGPT with human-in-the-loop refinement.

Experience

Work history, roles, and key accomplishments

Machine Learning Research Intern

Vitongue Technology Co., Ltd.

Mar 2025 - May 2025 (2 months)

Designed and deployed an end-to-end multimodal RAG pipeline for long-form videos, enabling cross-modal retrieval over a shared timeline. Built a face recognition embedding pipeline (RetinaFace + FaceNet) and semantic indexing with Whisper + Milvus, achieving under 800 ms of inference latency per 1-minute video.

Data Scientist Intern

Shanghai Heze Jiuxi Private Equity Fund Management Co., Ltd.

Jun 2023 - Aug 2023 (2 months)

Backtested predictive equity factors across 5–10 years of data covering 3,000+ stocks using pandas, identifying top-performing signals for portfolio construction. Built an automated daily pipeline to scrape market data, generate features, and load structured datasets into databases, reducing manual data prep from hours to minutes.

Education

Degrees, certifications, and relevant coursework

New York University

Master of Science in Data Science

2025 - present

GPA: 3.67/4.00

Relevant coursework: Fundamentals of Natural Language Processing; Probability and Statistics for Data Science.

Pursuing an M.S. in Data Science at New York University (expected May 2027), with coursework in NLP fundamentals and probability/statistics for data science.

Fudan University

Bachelor of Arts in Philosophy of Science and Logic

2020 - 2025

GPA: 3.50/4.00

Minors: Artificial Intelligence & Big Data; Statistics. Relevant coursework: Machine Learning, Data Mining, Database Systems, Multivariate & Categorical Data Analysis.

Completed a B.A. in Philosophy of Science and Logic at Fudan University (Sep 2020–Jun 2025), with minors in AI & Big Data and Statistics.
