NellNetwork is seeking LLM Ops Engineers and ML Ops Engineers to join a growing AI/ML team. These roles focus on developing, deploying, and maintaining scalable infrastructure and pipelines for Machine Learning (ML) models and Large Language Models (LLMs). The work involves model lifecycle management, performance monitoring, version control, and collaboration with Data Scientists and DevOps.
Requirements
- Develop and manage scalable deployment strategies for LLMs.
- Optimize LLM inference performance.
- Integrate prompt management, version control, and retrieval-augmented generation (RAG) pipelines.
- Manage vector databases, embedding stores, and document stores.
- Monitor hallucination rates and token usage, and optimize overall cost.
- Continuously monitor model performance, with an alerting system in place.
- Ensure compliance with ethical AI practices and privacy regulations.
- CI/CD pipeline design and maintenance.
- Model version control and reproducibility strategies.
- Automation of data ingestion, feature engineering, and model retraining.
- Model performance monitoring and alerting.
- Security, compliance, and governance protocols for model deployment.