Do you love turning complex, messy information into clean, connected data that others can build on? Are you energized by creating systems that help teams discover what matters and move faster? If you enjoy working across engineering, research, and product teams to shape reliable, scalable data foundations—and you’re excited by the opportunity to build something new from the ground up—this role offers the chance to make a meaningful impact.
Role overview
As a Data Engineer, you’ll build and refine the pipelines, data models, and services that make Microsoft’s research content discoverable and usable across modern applications and emerging AI scenarios. You’ll define the architecture and core data systems largely from scratch, creating the foundation for both web and AI-powered experiences. You’ll use existing AI models and out-of-the-box Microsoft tools to turn unstructured content into structured, high-quality data—not heavy ML research, but practical, model-assisted enrichment.
You’ll partner with a PM and a full-stack engineer while independently driving backend and data direction. The work includes an initial infrastructure lift to establish the environment, followed by iterative development as the MVP grows.
The contract and working with 2A
This role is an Embedded Consultant position, which means you’ll sit directly with Microsoft’s internal team to offer hands-on support. This remote, full-time contract runs for 6–18 months with the potential for renewal.
While you’ll spend most of your days working directly with your Microsoft team, 2A offers multiple points of connection to support you in your role and career. We’ll meet with you and your manager regularly to help ensure a successful engagement, and you’ll have opportunities to join 2A working groups, attend community-building events, and engage in professional development.
Activities
Build and maintain core data pipelines
- Build and maintain end-to-end ingestion pipelines for documents, datasets, code repositories, videos, transcripts, and internal knowledge sources.
- Clean, normalize, structure, and store data in formats that support both web applications and AI-driven use cases.
- Use out-of-the-box Microsoft tools, such as Fabric, Azure services, Cosmos DB, or Copilot Studio, to create reliable, maintainable systems.
Enrich and model research data
- Use AI models to transform unstructured content into structured metadata and durable knowledge assets.
- Design the architecture and foundational data systems, establishing the patterns and infrastructure for a new, scalable environment.
- Develop and refine embeddings, vector indexes, and retrieval components to support semantic search and grounding scenarios.
Build backend and data services
- Build data services, APIs, and backend components that power internal applications and agent-supported workflows.
- Iterate on systems after the initial MVP, improving reliability, performance, and scalability over time.
Collaborate and translate requirements
- Collaborate with a PM and full-stack engineer to understand requirements and translate them into actionable data solutions.
- Work cross-functionally to define data needs and align systems with downstream consumers and discovery workflows.
Skills and qualifications - need to have
- Proven ability to design and build end-to-end data systems, from ingestion through cleaning, structuring, storage, and serving.
- Experience building and shipping data products that deliver practical value.
- Demonstrated impact using AI models in data workflows (applied use, not ML research).
- 5+ years of software or data engineering experience, including at least 2 years of hands-on work with data pipelines.
- Comfortable defining architecture and starting systems from scratch, working independently in a small cross-functional team.
- Proficiency in Python, SQL, or similar languages used in data engineering workflows.
Skills and qualifications - nice to have
- Experience with Microsoft Fabric, Cosmos DB, Azure data services, or Copilot Studio.
- Background building data that supports embeddings, semantic search, or retrieval use cases.
- Familiarity with metadata frameworks, taxonomies, or knowledge modeling.
- Experience shaping ambiguous information into structured datasets and iterating quickly after an MVP.
Next steps
See yourself in the job description? Apply! If your skills and experience match the role, a recruiter will reach out to schedule a phone screen.
About 2A Recruiting & Staffing
We help tech companies build future-ready teams by connecting them with forward-thinking marketing, creative, and technical professionals. With deep client networks across the tech industry, we give candidates access to roles they won’t find on public job boards and the guidance to stand out. We take a people-first approach, showing up with care, responsiveness, and support that continues well past placement, backed by benefits designed to help our contract team members thrive.
We’re committed to anti-racism and building a diverse team by hiring individuals who bring different perspectives, and we know we still have work to do. If there’s anything 2A can do to create a more comfortable or accessible application process for you, please let us know.
2A is proud to be an equal opportunity employer. Candidates from diverse backgrounds are strongly encouraged to apply. All qualified applicants will be considered without regard to race, color, religion, sex, gender identity, national origin, disability status, protected veteran status, or any other characteristic protected by law.