Open to opportunities

Hengkai Zheng

@hengkaizheng

Software engineer and ML builder focused on shipping AI pipelines and measurable impact.

United States

What I'm looking for

I’m looking for a role where I can build end-to-end ML/AI systems and production data pipelines that turn real-world data into searchable, measurable products, with room to iterate quickly and grow under strong engineering mentorship.

I’m a software engineering intern turned AI/ML builder, combining rigorous ML training with production-minded data pipelines. I’ve focused on turning messy, high-volume inputs into structured systems that support real-time search, analytics, and downstream use.

At Attencity, I was the sole developer of web scrapers for Xiaohongshu, TikTok, and Douyin, written in Python with Pandas and processing 3,000+ creator profiles/hour. I also designed and maintained a PostgreSQL ingestion pipeline to store and query the scraped creator data, enabling real-time search and bulk export.

In my projects, I’ve built hands-on AI systems with measurable evaluation: an Interactive Character Companion Agent using a 5-step stateful dialogue pipeline on Qwen2.5-1.5B-Instruct, with a two-tier memory system and FAISS semantic retrieval. I also curated a 4,793-sample SFT dataset across emotion categories and policy labels to train state-conditioned tone variation.

I’ve also implemented deep learning components from scratch to understand the “why” behind performance. I built a ResNet-34 + ArcFace metric-learning model and evaluated it with EER, ROC-AUC, and TAR@FPR, achieving 2.55% EER on a held-out private leaderboard. Alongside that, I built a NumPy-based “MyTorch” framework with forward/backward passes, autograd, attention, RNN/GRU/CTC, and optimizers (SGD/Adam/AdamW).

Experience

Work history, roles, and key accomplishments

Software Engineering Intern

Attencity

Jul 2024 - Dec 2024 (5 months)

Built Python (Pandas) web scrapers for Xiaohongshu, TikTok, and Douyin, processing 3,000+ creator profiles/hour and curating structured tables with automated deduplication. Designed and maintained a PostgreSQL ingestion pipeline to enable real-time search, analytics, and bulk export of scraped creator data for downstream business use.

Education

Degrees, certifications, and relevant coursework

Carnegie Mellon University

Master of Information Systems Management

Pursuing a Master of Information Systems Management at Carnegie Mellon University, with coursework including Deep Learning, Machine Learning, Artificial Intelligence, Generative AI, and Distributed Systems.

The Ohio State University

Bachelor of Science in Computer and Information Science

Earned a Bachelor of Science in Computer and Information Science from The Ohio State University.
