We are looking for early-career Software Engineers to join our team in Vancouver, BC, in a specialized role sitting at the intersection of high-performance computing (HPC) and Large Language Model (LLM) engineering.
Responsibilities
- Performance Benchmarking: Run and automate standard LLM quality benchmarks (GSM8K, MMLU) alongside custom performance suites for specific workloads (e.g., long context windows, KV cache reuse).
- Infrastructure Validation: Create automated acceptance tests for new GPU clusters across x86 and ARM systems, measuring GPU memory bandwidth, network throughput, and multi-node communication performance.
- Model Dev Experience: Develop and maintain internal GPU-enabled development environments (similar to GitHub Codespaces).
- Tool Development: Build and contribute to tools such as InferenceMAX and genai-bench to automate model evaluation and optimization.
- Deep Hardware Profiling: Use PyTorch Profiler and NVIDIA Nsight Systems to collect performance profiles, identify bottlenecks, and debug the NVIDIA compute/networking stack.
- Monitoring & Observability: Develop real-time dashboards and alerts to monitor system health, model startup times, and runtime performance.
- Continuous Integration: Automate performance testing via CI/CD pipelines to catch regressions in model setups before they hit production.
- Optimization Automation: Build tools that map the Pareto frontier for a given model and workload: the set of configurations where no metric (latency, cost, or quality) can be improved without worsening another.
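As a rough illustration of the last item, a Pareto frontier over candidate configurations can be computed with a simple dominance check. The sketch below is a minimal example under stated assumptions, not a description of any internal tool: the `Config` fields, their units, and the function names are hypothetical, and real tooling would likely use measured benchmark results and a more scalable algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """One candidate serving configuration (hypothetical fields, for illustration)."""
    name: str
    latency_ms: float  # lower is better
    cost_usd: float    # lower is better
    quality: float     # higher is better


def dominates(a: Config, b: Config) -> bool:
    """Return True if `a` is at least as good as `b` on every metric
    and strictly better on at least one."""
    no_worse = (
        a.latency_ms <= b.latency_ms
        and a.cost_usd <= b.cost_usd
        and a.quality >= b.quality
    )
    strictly_better = (
        a.latency_ms < b.latency_ms
        or a.cost_usd < b.cost_usd
        or a.quality > b.quality
    )
    return no_worse and strictly_better


def pareto_frontier(configs: list[Config]) -> list[Config]:
    """Keep only configurations that no other configuration dominates.

    This is the straightforward O(n^2) pairwise comparison, which is
    fine for small sweeps.
    """
    return [
        c for c in configs
        if not any(dominates(other, c) for other in configs)
    ]
```

Note that the result is usually a set of trade-offs rather than a single winner; choosing one point on the frontier still depends on the workload's latency, cost, and quality targets.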
Benefits
- Competitive compensation, including meaningful equity.
- 100% coverage of medical, dental, and vision insurance for employees and their dependents.
- Generous PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!).
- Paid parental leave.
- Company-facilitated 401(k).
- Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.
