About the Role
We’re looking for an ML Engineer focused on training optimization to help us scale and improve training of large models. You’ll work at the intersection of research and production, optimizing training pipelines for speed, stability, and cost, while collaborating closely with researchers pushing model architecture and capability forward.
This is a high-impact role with real ownership: your work directly affects how fast we can iterate, how large we can scale, and how efficiently we deploy new models.
What You’ll Do
Optimize large-scale model training pipelines (throughput, convergence, stability, and cost)
Improve distributed training strategies (data, model, and pipeline parallelism)
Tune optimizers, schedulers, batch sizing, and precision (bf16 / fp16 / fp8)
Reduce training time and compute cost via profiling, bottleneck analysis, and systems-level improvements
Collaborate with researchers on architecture-aware training strategies
Build and maintain robust training infrastructure (checkpointing, fault tolerance, reproducibility)
Evaluate and integrate new training techniques (e.g., gradient checkpointing, ZeRO, FSDP, custom kernels)
Own training performance metrics and continuously push them forward
What We’re Looking For
Strong experience training large neural networks (LLMs or similarly large models)
Hands-on experience with training optimization (not just model usage)
Solid understanding of backpropagation, optimization algorithms, training dynamics, and distributed systems for ML training
Experience with PyTorch (required)
Comfort working close to hardware (GPUs, memory, networking constraints)
Ability to move fluidly between research ideas and production-ready code
Nice to Have
Experience with large-scale distributed training (multi-node, multi-GPU)
Familiarity with DeepSpeed, FSDP, Megatron, or custom training stacks
Experience optimizing training on AMD or NVIDIA GPUs
Contributions to open-source ML infrastructure or research codebases
Exposure to non-Transformer architectures (RNNs, hybrid models, etc.)
Why Join Us
Real ownership at Series-A stage — your work shapes the company’s trajectory
Work on cutting-edge models and training systems at scale
Small, highly technical team with fast feedback loops
Strong emphasis on engineering quality and research rigor
Competitive compensation + meaningful equity
