Open to opportunities

Jason Chou

@jasonchou

Senior data scientist delivering NLP, semantic search, and production ML solutions.

United States

What I'm looking for

I’m looking for a team where I can ship production NLP and semantic search systems end to end, covering modeling, evaluation with XAI, and scalable pipelines. I want to partner with stakeholders in an Agile environment while owning measurable performance and latency improvements.

I’m a data science and machine learning professional with 7+ years delivering NLP, semantic search, and entity resolution for document-centric analytics. I thrive on turning mission requirements into prioritized AI and analytics use cases with real stakeholder impact.

In my work, I’ve designed transformer-based retrieval and ranking models in PyTorch and HuggingFace, improving precision@10 by 22%. I’ve built OCR/ICR-driven ingestion pipelines that processed 5M documents and reduced parsing errors by 47%, and I’ve engineered end-to-end semantic search integrated with Elasticsearch and Databricks to cut median query latency by 40% and boost recall by 30%.

I also focus on production readiness and measurable trust: I’ve implemented graph-based entity resolution with Neo4j to reduce duplicates by 78% across a 20M-entity index, deployed models via REST APIs and scheduled batch pipelines using Docker, Airflow, and cloud services, and used SHAP/LIME-based explainability to increase XAI adoption by 65% across pilots.

Experience

Work history, roles, and key accomplishments

Senior Data Scientist / ML Eng

Integer Group

Sep 2021 - Mar 2026 (4 years 6 months)

Delivered NLP and semantic search pilots with 10+ stakeholders, improving precision@10 by 22% using transformer-based retrieval and reducing median query latency by 40% while boosting recall by 30%. Built end-to-end OCR/ICR ingestion for 5M documents (47% fewer parsing errors), enabled entity resolution to cut duplicates by 78%, and shipped 12 Tableau/Looker dashboards that reduced analyst time-to

Data Scientist / ML Engineer

Integer Group

Feb 2019 - Aug 2021 (2 years 6 months)

Partnered with product owners and data engineers to scope and run reproducible Databricks experiments, contributing to 6 pilot initiatives. Improved NER recall by 18% and precision by 12%, reduced data freshness from 48 hours to 4 hours with Airflow/Spark scheduling, and built a 2M-page labeled OCR corpus for downstream NLP models.

Education

Degrees, certifications, and relevant coursework

University of Texas at Dallas

Master of Science in Computer Science

2012 - 2018

