You will work with one of the most extensive data sets of medical records, diagnoses, claims, and prescriptions. This role offers a unique opportunity to train, fine-tune, and apply AI models on medical data collected from millions of patients across the country.
Primary Duties:
- Train and fine-tune models using off-the-shelf and novel ML/AI techniques to solve optimization problems for the company.
- Work with large, complex data sets. Conduct difficult, non-routine analysis and harvest data.
- Deliver working proof-of-concept (POC) solutions that balance speed, scalability, and time-to-market tradeoffs.
Minimum Qualifications:
- BA/BTech in Statistics, Data Science, Computer Science, or a related field required.
- 6+ years of relevant statistical analysis experience.
- 6+ years of relevant machine learning experience (ML modeling, hyperparameter tuning, feature engineering, model validation, etc.).
- Understanding of causal inference and treatment effect estimation.
- 3-5 years of experience selecting, implementing, and optimizing ML tools and frameworks for large-scale projects.
- 2+ years of Python language experience.
- 1+ years of relevant deep learning and LLM experience.
- 1+ years of experience working with large-scale distributed systems and statistical software (e.g. Spark).
- Experience in addressing challenges from incomplete, unrepresentative, and mislabeled data.
- Contributions to the field (e.g., publications, patents, or successful large-scale implementations).
Preferred KSAs:
- Master's or PhD degree in a quantitative discipline (e.g., Computer Science with an AI/ML major, Statistics, Operations Research, Economics, Mathematics, Physics) or equivalent practical experience.
- Background in Epidemiology, particularly in the context of chronic condition modeling.
- Working knowledge of Public Health, with a focus on Value-Based Care and Risk Adjustment.
- Working knowledge of health-tech systems, such as Electronic Health Records and clinical data.
- Proficiency in communicating analysis and establishing confidence among audiences who do not share your disciplinary background or training.
- Experience with security and systems that handle sensitive data.
- Experience working with statistical software (e.g. R, SAS, Python statistical packages).
- Demonstrated leadership and self-direction.
- Publications at peer-reviewed conferences (e.g. NeurIPS, ICML, ACL, JSM, KDD, EMNLP).
- Participation in competitions such as the ACIC Data Challenge or Kaggle.
Physical Requirements:
- Sitting for prolonged periods of time. Extensive use of computers and keyboard. Occasional walking and lifting may be required.