AI Engineer (AI System Calibration & Optimization)

Robots & Pencils is a digital innovation firm that helps clients use mobile, web, and AI technologies to transform their organizations. They focus on developing consumer applications, enterprise software, and education platforms by blending technology with a human-first approach.

Robots & Pencils

Employee count: 51-200

Canada only

Robots & Pencils is seeking an outcome-oriented AI Engineer to partner with a strategic client on a high-impact AI system calibration and optimization engagement. You'll embed directly with the client's AI and product engineering teams to improve the accuracy, reliability, and transparency of their Azure-hosted, fine-tuned GPT model through systematic prompt optimization and RAG calibration.

As an AI Engineer, you'll serve as a technical thought partner, actively coding and drawing on your software engineering experience to build calibration pipelines, optimize prompts with prompt optimization frameworks, and establish repeatable improvement workflows. You'll work on-site with the client, driving measurable outcomes that maximize their AI system performance.

Key Responsibilities

Client Engagement & Solution Development

  • Embed with a strategic client as their technical partner for AI system calibration and prompt optimization.
  • Build production-grade calibration systems using Python within the client's Azure environment.
  • Implement the DSPy framework and GEPA optimizer to systematically improve prompt quality and retrieval performance.
  • Design and develop Golden Dataset curation workflows using Azure Data Labeling, establishing gold/silver data tier schemas.
  • Create evaluation frameworks to measure model accuracy, precision/recall, latency, and hallucination rates.
  • Architect prompt optimization pipelines for retrieval, context synthesis, and answer generation tailored to client needs.
  • Own the path to production - evaluation pipelines, Azure ML workflows, KPI dashboards, and optimization automation.
  • Iterate rapidly based on client feedback and KPI results, and translate business goals into technical calibration improvements.
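
For candidates gauging the scope of the evaluation work described above, a minimal sketch of how accuracy, precision/recall, latency, and hallucination rate might be summarized over a labeled dataset follows. It is illustrative only: the `EvalRecord` fields, including the `grounded` flag used as a stand-in for a hallucination check, are hypothetical and are not taken from the client's system or from any Azure API.

```python
from dataclasses import dataclass


@dataclass
class EvalRecord:
    """One evaluated example (all field names are hypothetical)."""
    expected: bool      # gold label: a match should have been returned
    predicted: bool     # the system returned a match
    grounded: bool      # the answer is supported by the retrieved context
    latency_ms: float   # end-to-end response time


def summarize(records):
    """Aggregate basic quality and latency metrics over evaluation records."""
    n = len(records)
    if n == 0:
        return {}
    tp = sum(r.expected and r.predicted for r in records)
    fp = sum((not r.expected) and r.predicted for r in records)
    fn = sum(r.expected and (not r.predicted) for r in records)
    correct = sum(r.expected == r.predicted for r in records)
    return {
        "accuracy": correct / n,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of answers not supported by the retrieved context.
        "hallucination_rate": sum(not r.grounded for r in records) / n,
        "mean_latency_ms": sum(r.latency_ms for r in records) / n,
    }
```

In practice the `grounded` flag would come from human labeling or an automated judge; the sketch only shows how the resulting labels roll up into KPIs.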

Outcome Ownership & Business Impact

  • Own end-to-end delivery of calibration systems from initial baseline to production-ready optimization workflows.
  • Establish measurable KPIs and demonstrate accuracy improvements, latency reduction, and hallucination mitigation.
  • Provide strategic guidance on RAG architecture improvements and retrieval parameter optimization.
  • Accelerate client time-to-value through hands-on development and comprehensive knowledge transfer.
  • Deliver operational playbooks and documentation enabling the client team to maintain calibration systems independently.

Engineering Leadership & In-Field Delivery Excellence

  • Lead complex, multi-stakeholder calibration initiatives on-site and remotely; drive clarity, remove blockers, and keep execution on track.
  • Set coding standards and architectural patterns for calibration components; write clear docs, runbooks, and technical specifications.
  • Mentor client engineers through code reviews, pairing sessions, and technical workshops on DSPy, GEPA, and evaluation best practices.
  • Make sound tradeoffs under real-world constraints - Azure cost optimization, data quality, performance requirements, and security.
  • Align delivery with Robots & Pencils' responsible AI practices and client governance requirements.

Cross-Functional Collaboration

  • Work closely with the client's AI SMEs and product engineering teams to understand product catalog structure and validation workflows.
  • Collaborate with internal RP product, engineering, and delivery teams on calibration methodology and best practices.
  • Share insights from client engagement to improve RP's prompt optimization frameworks and tooling.
  • Contribute reusable patterns, evaluation frameworks, and documentation back to RP's core platform.
  • Collaborate across time zones with distributed teams.

Required Skills & Qualifications

  • Bachelor's degree in Computer Science, Engineering, or equivalent experience.
  • 7+ years of professional software development with significant ownership of architecture and delivery.
  • 3+ years of Python in ML/AI systems with a strong focus on data processing and evaluation pipelines.
  • 2+ years building with Generative AI including hands-on prompt engineering and optimization work.
  • Experience with prompt optimization frameworks - DSPy strongly preferred, or similar systematic approaches to prompt improvement.
  • Deep understanding of RAG architectures - retrieval quality, latency/cost tuning, hallucination mitigation, and evaluation methods.
  • Hands-on experience designing evaluation metrics and building assessment frameworks for LLM systems.
  • Knowledge of systematic experimentation methods - A/B testing, parameter tuning, performance benchmarking.
  • Experience with data curation, labeling workflows, and dataset quality management for AI systems.
  • Strong Azure cloud experience with focus on AI/ML services - Azure Machine Learning, Azure AI Search, Azure OpenAI Service.
  • Experience with Azure Data Labeling, Azure Blob Storage, and Azure infrastructure fundamentals.
  • Understanding of vector search platforms and retrieval optimization (Azure AI Search, Weaviate, Qdrant, Pinecone).
  • Strong IaC background (Terraform or ARM templates) plus containerization and distributed systems knowledge.
  • Solid SDLC practices - testing strategies, CI/CD, code reviews, observability, and operational excellence.
  • Upper-intermediate English for client communication.
  • Experience leading complex technical projects with multiple stakeholders.
  • Strong communication skills for technical and executive audiences.
  • Ability to context-switch and adapt to client environments.
  • Willingness to travel to client sites.

Nice to Have

  • Direct hands-on experience with DSPy framework and GEPA optimizer.
  • Understanding of systematic optimization principles: evolutionary algorithms, Bayesian optimization, multi-objective optimization, and Pareto efficiency concepts.
  • Familiarity with prompt optimization frameworks and methods - experience with any of: MIPROv2, TextGrad, EvoPrompt, AutoPrompt, or reinforcement learning approaches (GRPO, PPO).
  • Experience with LLM-as-judge patterns and automated evaluation pipelines.
  • Knowledge of advanced RAG patterns - Adaptive RAG, Self-RAG, Corrective RAG - and retrieval evaluation methods (MRR, NDCG, precision@k).
  • Understanding of agentic AI patterns - ReAct, Chain-of-Thought, Tool Use - and their application in RAG systems.
  • Experience building evaluation dashboards with Azure Monitor, Application Insights, or similar observability tools.
  • Familiarity with MLOps practices - model versioning, experiment tracking, metric logging for evaluation systems.
  • Experience with AWS or GCP AI/ML platforms (Bedrock, SageMaker, Vertex AI) and cross-cloud architecture patterns.
  • Experience with product catalog systems, cross-reference matching, or e-commerce search optimization.
  • Background in manufacturing, industrial equipment, or technical specification systems.
  • Prior consulting or professional services experience with enterprise clients.
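
As a point of reference for the retrieval evaluation methods named above (MRR, NDCG, precision@k), here is a minimal sketch of how they are commonly computed. The function names and input shapes (ranked lists of document IDs, a set of relevant IDs, a dict of graded relevance gains) are assumptions chosen for illustration, not part of any framework mentioned in this posting.

```python
import math


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k results that are relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for d in ranked_ids[:k] if d in relevant_ids) / k


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant result, or 0.0 if none is found."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries):
    """Average reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    if not queries:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)


def ndcg_at_k(ranked_ids, gains, k):
    """NDCG@k with linear gains; `gains` maps doc ID to graded relevance."""
    dcg = sum(gains.get(d, 0) / math.log2(i + 2)
              for i, d in enumerate(ranked_ids[:k]))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg else 0.0
```

Precision@k and MRR treat relevance as binary, while NDCG rewards placing the most relevant documents highest, which is why the three are often reported together.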

Personal Competencies

  • Accountability – Owns full client engagement cycle with quality, reliability, and attention to detail.
  • Adaptability – Thrives in dynamic, fast-paced client environments.
  • Collaboration – Builds strong partnerships across teams and time zones.
  • Execution-Focused – Delivers maintainable, scalable solutions without overengineering.
  • Innovation-Minded – Brings curiosity and experimentation to technology decisions.
  • Craftsmanship – Cares deeply about documentation and code quality, architecture, and user experience.

Why Join Robots & Pencils?

We don’t just ship features; we build digital-first products that matter. As a Senior Forward Deploy Engineer, you’ll join a team that values deep craft, cross-functional collaboration, and relentless focus on quality. You’ll work on impactful agentic AI applications using modern technologies, while influencing engineering culture and best practices across the organization.

About the job

Job type: Full Time

Experience level: Mid-level

Hiring timezones: Canada +/- 0 hours

About Robots & Pencils

At Robots & Pencils, we are at the forefront of digital innovation, revolutionizing how businesses leverage technology to achieve transformative growth. Since our inception in 2009, we've embraced the then-contrarian view that mobile would be more impactful than the internet itself. This foresight has positioned us as a leader in a world of unprecedented technological acceleration. Our core philosophy blends the sciences with the humanities—the 'Robots' with the 'Pencils'—fusing deep platform expertise with a human-first approach to design and strategy. We empower our clients by developing consumer applications, transformative enterprise software, and pioneering education platforms. Our expertise extends to operationalizing frontier technologies like machine learning, bots, and conversational user interfaces, enabling organizations to gain early and distinct advantages to future-proof their operations.

Our approach is centered on creating a company designed to attract and retain exceptional talent, amassing a team of hyper-skilled individuals who are passionate about solving complex challenges and creating previously inconceivable products. We partner with clients, from startups to global enterprises, to unlock unknown potential through data intelligence and industry experience. We craft user interactions that delight and inspire, always prioritizing the human experience. Our industry-leading, in-house talent meticulously crafts each line of code, delivering on the boldest ideas. We've developed our own groundbreaking products, such as Missions (now Slack's Workflow Builder), and an AI-powered Q&A engine, showcasing our commitment to continuous innovation. By fostering a culture of ongoing experimentation through initiatives like our FunLabs, we ensure our team remains at the leading edge of technology, bringing that cutting-edge knowledge to elevate our clients' digital experiences and technological capabilities. We are dedicated to helping organizations navigate the complexities of the digital landscape, transforming their businesses and creating what's next.

Employee benefits

401K/RRSP matching

Provides 401K/RRSP matching.

Health & wellness events

Organizes health and wellness events.

R&P branded swag

Provides company branded merchandise.

Flexible work schedule

Prioritizes work-life balance with flexible hours.
