AI Researcher — Distillation

Featherless AI
CA, CN + 3 more

About the Role

We’re looking for an AI Researcher focused on model distillation to help us push the frontier of efficient, high-performance models. You’ll work on turning large, expensive models into smaller, faster, and more deployable systems—while maintaining or improving quality.

This role is ideal for someone who enjoys publishing research, working close to real systems, and seeing their ideas move from papers → code → production.

What You’ll Work On

  • Design and evaluate model distillation techniques (teacher–student training, self-distillation, layer-wise distillation, representation matching, etc.)

  • Research tradeoffs between model size, latency, memory, and accuracy

  • Develop novel distillation approaches for:

    • Large language models

    • Long-context or specialized architectures

    • Inference-constrained environments

  • Run large-scale experiments and ablations; analyze results rigorously

  • Collaborate with engineers to productionize research outcomes

  • Write and submit research papers to top-tier venues (NeurIPS, ICML, ICLR, COLM, etc.)

  • Contribute to internal research notes, technical blogs, and open-source projects when appropriate

What We’re Looking For

Required

  • Strong background in machine learning research

  • Hands-on experience with model distillation or closely related topics (compression, pruning, quantization, representation learning)

  • Publication experience (conference or journal papers, workshop papers, or arXiv preprints)

  • Solid understanding of deep learning fundamentals (optimization, training dynamics, generalization)

  • Fluency in PyTorch (or equivalent) and research-grade experimentation

  • Ability to clearly communicate research ideas, results, and limitations

Nice to Have

  • Experience distilling large language models

  • Work on efficiency-focused research (latency, memory, throughput)

  • Experience with long-context models or non-Transformer architectures

  • Open-source contributions in ML or research tooling

  • Prior startup or applied research experience

Why Join Us

  • Real ownership over research direction at a Series A stage

  • Strong support for publishing and open research

  • Tight feedback loop between research and real-world deployment

  • Access to meaningful compute and production-scale problems

  • Small, highly technical team with deep ML and systems expertise

Example Backgrounds

  • ML researchers from academia transitioning to industry

  • Research engineers with published work in model efficiency

  • PhD graduates, postdocs, or industry researchers who still want to publish

About the job

Job type: Full Time

Hiring timezones: United States +/- 0 hours, and 4 other timezones