Open to opportunities

Huzefa Merchant

@huzefamerchant

I'm an ML engineer building deployed multimodal, GenAI and computer vision systems.

India

What I'm looking for

I want to build end-to-end ML systems in multimodal AI, computer vision, and generative AI: trained on public benchmarks, deployed as production services, and aimed at real problems. I'm open to global remote roles at startups and product companies.

I'm an ML Engineer focused on Generative AI, Multimodal AI, and Computer Vision, building systems trained on public benchmarks such as Flickr30k and COCO 2017. I rebuilt the same image captioning system seven times, from InceptionV3+LSTM to ConvNeXt+Perceiver+GPT-2, because getting it right mattered more than getting it done.

In my flagship LensToWords multimodal captioning pipeline, I designed an end-to-end model (ConvNeXt-Tiny → Perceiver with 16 latents → cross-attention injected into all 12 GPT-2 blocks) and drove it through phased training. I first trained on Flickr30k to validate the architecture, then scaled up to COCO 2017, where the model reached a BLEU-4 of 0.1341.

I also ship deployed ML applications: a Streamlit Movie Recommender System with a production similarity matrix, a FastAPI + Streamlit Potato Disease Detector structured as a deployable two-layer service, and NLP/vision products like the WhatsApp Chat Analyser and Fashion Recommender System. My skills come from hands-on research and Stanford/DeepLearning.AI certifications. I iterate with diagnosis-first engineering, and I'm motivated by measurable outcomes, not demos.

Experience

Work history, roles, and key accomplishments

Self-Directed
Current

Independent ML Engineer

Jan 2024 - Present (2 years 4 months)

Built and deployed end-to-end ML systems across generative AI, multimodal AI, and computer vision. Flagship project: LensToWords, an image captioning system rebuilt through 7 architecture iterations (InceptionV3+LSTM → ConvNeXt+Perceiver+GPT-2), achieving a BLEU-4 of 0.1341 on COCO 2017. Shipped 8 public GitHub repositories, including deployed applications in recommendation systems and NLP analytics.

Education

Degrees, certifications, and relevant coursework

Medicaps University

Bachelor of Technology, Mechanical Engineering

2017 - 2021

Grade: 7.2

Activities and societies: Social Club, NSS (National Service Scheme)

Bachelor of Technology in Mechanical Engineering at Medicaps University, Indore, from 2017 to 2021.

DeepLearning.AI (Coursera)

Deep Learning Specialization, Deep Learning

Completed the Deep Learning Specialization on Coursera by DeepLearning.AI.

Stanford University / DeepLearning.AI (Coursera)

Machine Learning Specialization, Machine Learning

Completed the Machine Learning Specialization on Coursera by Stanford University and DeepLearning.AI.
