Essential Functions:
- Identify, evaluate, and prioritize opportunities to integrate AI into existing and new organizational processes and tooling.
- Design and implement data structures, metadata models, naming conventions, and data standards to support AI and machine learning initiatives.
- Support onboarding new AI projects by identifying data requirements, assessing data readiness, and defining ingestion and preparation workflows.
- Develop processes to measure and improve ongoing data quality, consistency, completeness, and accuracy across AI datasets and workflows.
- Build and maintain data preparation pipelines to support model training, testing, retrieval, and operational AI applications.
- Collaborate with technical and business stakeholders across divisions to understand data sources, use cases, and operational constraints.
- Create and maintain documentation for data standards, transformation logic, onboarding procedures, and quality controls.
- Partner with AI, software, platform, and security teams to ensure data workflows are scalable, secure, and aligned with organizational objectives.
Experience and Skills Required:
- Bachelor’s degree in Computer Science, Data Engineering, Information Systems, Engineering, Mathematics, or a related STEM field.
- 8-10 years of engineering experience.
- 3+ years of experience supporting AI, natural language processing, retrieval-augmented generation (RAG), and related solutions.
- Experience preparing and transforming data for analytics, machine learning, search, or AI-enabled applications.
- Experience designing vector databases and retrieval pipelines.
- Experience developing Model Context Protocol (MCP) servers and clients.
- Understanding of data quality management practices, including validation, normalization, deduplication, and error handling.
- Experience working with AI platforms, Kubernetes, and cloud AI environments (Azure Foundry, AWS Bedrock).
- Strong experience developing AI solutions in Python, Go, or TypeScript.
- Strong analytical, troubleshooting, and documentation skills.
Preferred:
- Experience with data platforms and tooling such as Pandas, Spark, Airflow, dbt, or similar ecosystems.
- Familiarity with vector databases, embeddings pipelines, chunking strategies, and retrieval-augmented generation workflows.
- Experience designing data schemas, taxonomies, ontologies, or metadata standards for enterprise information.
- Experience working in regulated environments with standards such as NIST or CMMC.
- Experience supporting scientific, engineering, defense, or national security-related data initiatives.
- Experience working with DevOps systems and practices, including Git, CI/CD, and GitOps.
- Experience working with or supporting the Department of Defense (DoD).
- Active Secret clearance preferred, or ability to obtain and maintain a Secret clearance.
Education:
- Bachelor’s degree in Computer Science, Software Engineering, or another IT-related field, or equivalent experience.
REMOTE WORK NOTICE: This position may be performed fully remotely, on a hybrid schedule, or onsite at an ARA office. Preference will be given to candidates located in the Albuquerque, NM or Raleigh, NC areas.
