Zhuwei Xu
@zhuweixu
Machine learning and data science intern focused on multimodal RAG, low-latency inference, and LLM systems.
What I'm looking for
I’m a data science student with hands-on experience building end-to-end ML systems, from retrieval pipelines to deployment-minded inference. In research work, I designed and deployed a multimodal RAG pipeline that unifies face, scene, and speech signals for cross-modal retrieval.
I’ve also built systems with strict performance goals, including a face recognition pipeline (RetinaFace + FaceNet) that achieved <800ms inference latency per 1-minute video. For semantic search over video, I implemented indexing with Whisper (large-v3-turbo) and the Milvus vector database to support natural language queries with relevance-based ranking.
Alongside production-style research, I develop and evaluate LLM frameworks such as SMOLSolver, where I fine-tuned a Phi-2 generator with LoRA and trained RoBERTa verifier/ranker models to improve Pass@k. I also build multi-agent workflows using MetaGPT with human-in-the-loop refinement.
Experience
Work history, roles, and key accomplishments
Machine Learning Research Intern
Vitongue Technology Co., Ltd.
Mar 2025 - May 2025 (2 months)
Designed and deployed an end-to-end multimodal RAG pipeline for long-form videos, enabling cross-modal retrieval over a shared timeline. Built a face recognition embedding pipeline (RetinaFace + FaceNet) and semantic indexing with Whisper + Milvus, achieving <800ms inference latency per 1-minute video.
Data Scientist Intern
Shanghai Heze Jiuxi Private Equity Fund Management Co., Ltd.
Jun 2023 - Aug 2023 (2 months)
Backtested predictive equity factors across 5–10 years of data covering 3,000+ stocks using pandas, identifying top-performing signals for portfolio construction. Built an automated daily pipeline to scrape market data, generate features, and load structured datasets into databases, reducing manual data prep from hours to minutes.
Education
Degrees, certifications, and relevant coursework
New York University
Master of Science in Data Science
2025 -
Grade: GPA: 3.67/4.00
Activities and societies: Relevant coursework: Fundamentals of Natural Language Processing; Probability and Statistics for Data Science.
Pursuing an M.S. in Data Science at New York University (expected May 2027), with coursework in NLP fundamentals and probability/statistics for data science.
Fudan University
Bachelor of Arts in Philosophy of Science and Logic
2020 - 2025
Grade: GPA: 3.50/4.00
Activities and societies: Minors: Artificial Intelligence & Big Data; Statistics. Relevant coursework: Machine Learning, Data Mining, Database Systems, Multivariate & Categorical Data Analysis.
Completed a B.A. in Philosophy of Science and Logic at Fudan University (Sep 2020–Jun 2025), with minors in AI & Big Data and Statistics.
