
Distributed Training & Performance Engineer - Vice President

JPMorgan Chase & Co. is a prominent global financial services firm dedicated to innovation and service for over 225 years.

JPMorgan Chase & Co.

Employee count: 5000+

United States only


Are you looking for an exciting opportunity to join a dynamic and growing team in a fast-paced and challenging area? This is a unique opportunity to work with the Global Technology Applied Research (GTAR) center at JPMorgan Chase. The goal of GTAR is to design and conduct research across multiple frontier technologies to enable novel discoveries and inventions, and to inform and develop next-generation solutions for the firm's clients and businesses.

As a senior-level engineer within Global Technology Applied Research (GTAR), you will design, optimize, and scale large-model pretraining workloads across hyperscale accelerator clusters. This role sits at the intersection of distributed systems, kernel-level performance engineering, and large-scale model training. The ideal candidate can take a fixed hardware budget (accelerator type, node topology, interconnect, and cluster size) and design an efficient, stable, and scalable training strategy spanning parallelism layout, memory strategy, kernel optimization, and end-to-end system performance. This is a hands-on role with direct impact on training throughput, efficiency, and cost at scale.

Job responsibilities

  • Design and optimize distributed training strategies for large-scale models, including data, tensor, pipeline, and context parallelism.
  • Manage end-to-end training performance: from data input pipelines through model execution, communication, and checkpointing.
  • Identify and eliminate performance bottlenecks using systematic profiling and performance modeling.
  • Develop or optimize high-performance kernels using CUDA, Triton, or equivalent frameworks.
  • Design and optimize distributed communication strategies to maximize overlap between computation and inter-node data movement.
  • Design memory-efficient training configurations (caching, optimizer sharding, checkpoint strategies).
  • Evaluate and optimize training on multiple accelerator platforms, including GPUs and non-GPU accelerators.
  • Contribute to incorporating performance improvements back into internal pipelines.
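To illustrate why the compute/communication overlap mentioned above matters, here is a back-of-envelope model (all figures hypothetical, not drawn from any actual JPMorgan workload) comparing a data-parallel training step where gradient all-reduce runs serially after the backward pass versus one where it is overlapped behind compute:

```python
# Back-of-envelope model of one data-parallel training step.
# All numbers are hypothetical, for illustration only.

def step_time_ms(compute_ms: float, comm_ms: float, overlap: bool) -> float:
    """Estimate per-step wall time when gradient all-reduce either
    serializes after the backward pass (no overlap) or is bucketed
    and fully hidden behind it (idealized overlap)."""
    if overlap:
        # Communication hides behind compute; the slower phase dominates.
        return max(compute_ms, comm_ms)
    # Communication runs after compute finishes.
    return compute_ms + comm_ms

compute, comm = 120.0, 45.0  # hypothetical ms per step
serial = step_time_ms(compute, comm, overlap=False)
overlapped = step_time_ms(compute, comm, overlap=True)
print(f"serial={serial} ms, overlapped={overlapped} ms, "
      f"speedup={serial / overlapped:.2f}x")
```

Even this idealized sketch shows that once communication is fully overlapped, further interconnect tuning buys nothing until compute time itself shrinks, which is why bucketing and overlap scheduling are typically addressed before raw bandwidth.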

Required qualifications, capabilities, and skills

  • Master’s degree with 3+ years of industry experience, or Ph.D. with 1+ years of industry experience, in computer science, physics, math, engineering, or related fields.
  • Engineering experience at top AI labs, HPC centers, chip vendors, or hyperscale ML infra teams.
  • Strong experience designing and operating large-scale distributed training jobs across multinode accelerator clusters.
  • Deep understanding of distributed parallelism strategies: data parallelism, tensor/model parallelism, pipeline parallelism, and memory/optimizer sharding.
  • Proven ability to profile and optimize training performance using industry standard tools such as Nsight, PyTorch profiler, or equivalent.
  • Hands-on experience with GPU programming and kernel optimization.
  • Strong understanding of accelerator memory hierarchies, bandwidth limitations, and compute-communication tradeoffs.
  • Experience with collective communication libraries and patterns (e.g., NCCL-style collectives).
  • Proficiency in Python for ML systems development and C++ for performance-critical components.
  • Experience with modern ML frameworks such as PyTorch or JAX in large-scale training settings.
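The memory/optimizer-sharding qualification above can be made concrete with a small per-device memory estimate under ZeRO-style optimizer-state sharding. The sketch assumes mixed-precision Adam (fp16 weights and gradients, fp32 master weights plus two fp32 optimizer moments); the model and cluster sizes are hypothetical:

```python
def per_device_state_gib(n_params: float, n_devices: int,
                         shard_optimizer: bool = True) -> float:
    """Approximate per-device memory (GiB) for weights, gradients, and
    Adam optimizer state under mixed precision:
      2 B fp16 weights + 2 B fp16 grads, replicated on every device;
      12 B fp32 master weights + momentum + variance, optionally
      sharded evenly across devices (ZeRO stage-1 style)."""
    replicated = n_params * (2 + 2)   # fp16 weights + grads
    opt_state = n_params * 12         # fp32 Adam state
    if shard_optimizer:
        opt_state /= n_devices
    return (replicated + opt_state) / 2**30

# Hypothetical 7B-parameter model on 64 devices:
print(f"unsharded: {per_device_state_gib(7e9, 64, False):.1f} GiB")
print(f"sharded:   {per_device_state_gib(7e9, 64, True):.1f} GiB")
```

Under these assumptions, sharding the optimizer state alone cuts per-device state memory several-fold, which is typically what makes multi-billion-parameter training fit on commodity-memory accelerators at all.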

Preferred qualifications, capabilities, and skills

  • Experience optimizing training workloads on non-GPU accelerators (e.g., TPUs or wafer-scale architectures).
  • Familiarity with compiler-driven ML systems (e.g., XLA, MLIR, Inductor) and graph-level optimizations.
  • Experience designing custom fused kernels or novel execution strategies for attention or large matrix operations.
  • Strong understanding of scaling laws governing large-model pretraining dynamics and stability considerations.
  • Contributions to open-source ML systems, distributed training frameworks, or performance-critical kernels.
  • Prior experience collaborating directly with hardware vendors or accelerator teams.

About the job

Job type: Full Time
Education: Postgraduate degree
Experience: 1 year minimum
Location requirements: United States only
Hiring timezones: United States +/- 0 hours

About JPMorgan Chase & Co.


At JPMorgan Chase, we have a rich history that dates back over 225 years. The firm serves millions of customers, clients, and communities in more than 100 markets around the world. Our commitment to exceptional service, innovation, and sustainable growth is reflected in every aspect of our operations. We pride ourselves on being a leader in investment banking, financial services, and asset management.

Our operations are built on foundational values that strengthen our company and the communities we serve. We are committed to creating inclusive economies and fostering financial health, supporting business growth, and empowering individuals through innovative financial solutions. Our global reach allows us to combine our financial expertise with impactful community engagement, helping to address economic challenges and drive systemic change for a sustainable future.
