
Machine Learning Engineer - Web Data Quality

You might know us as Scrapinghub. And now we’re Zyte.

Zyte

Employee count: 51-200

Brazil only

At Zyte, we make the world’s web data accessible to everyone. Our technology powers data extraction at scale, helping businesses and researchers unlock the full potential of the web.

We’re a remote-first, multicultural team of engineers, data scientists, and innovators who believe in curiosity, collaboration, and continuous learning. If you’re passionate about building reliable AI systems and improving the quality of web data, we’d love to hear from you.

About the Role

As a Machine Learning Engineer (Web Data Quality), you’ll design and implement intelligent systems that automatically detect, measure, and improve the quality of large-scale web datasets. You’ll work at the intersection of data science, AI, and distributed systems, collaborating closely with product, engineering, and data teams to make data accuracy measurable, scalable, and actionable.

What You’ll Do

  • Develop and deploy ML models for anomaly detection, schema drift, and content validation
  • Build and improve data quality pipelines leveraging modern data and MLOps tools
  • Design and optimize embeddings and GenAI models to enhance data consistency
  • Collaborate with engineers to integrate AI systems into production workflows
  • Conduct experiments, evaluate performance, and iterate for continuous improvement
  • Stay up to date on AI/ML and GenAI research to guide innovation within Zyte

Required

  • 3+ years of experience in Machine Learning / Data Science / AI Engineering
  • Strong Python skills and experience with ML frameworks (PyTorch, TensorFlow, scikit-learn)
  • Experience with data validation, anomaly detection, or data quality systems
  • Familiarity with data pipelines (Airflow, Spark, or similar)
  • Understanding of model evaluation, metrics, and deployment best practices
  • Excellent problem-solving, communication, and collaboration skills

Preferred

  • Experience with LangChain, LlamaIndex, or GenAI model orchestration
  • Familiarity with data labeling tools and active learning approaches
  • Contributions to open-source or public ML projects
  • Experience working in a remote, cross-functional team environment

Benefits

  • 35 days of paid time off
  • Health & wellness support
  • Inclusive and supportive team environment
  • Opportunities to attend conferences and meet team members from across the globe
  • Hands-on work with cutting-edge open-source technologies and tools

About the job

Job type: Full Time

Experience level: Mid-level

Hiring timezones: Brazil +/- 0 hours

About Zyte

You might know us as Scrapinghub. And now we’re Zyte. We’re game changers in web data extraction, obsessed with removing barriers so our customers can access valuable data. Quickly and easily, whenever and however they need it.

We’ve always been passionate about data and what it can do. And we’re here to connect our customers with clean, actionable web data. At any scale. Without coding hassles, bans, or broken spiders.

At Zyte, we believe that businesses deserve a smooth pathway to data. For more than a decade, we’ve led the way in building powerful, easy-to-use ways to collect, format, and deliver web data quickly, dependably, and at scale. Today, the data we extract helps thousands of organizations make smarter business decisions, secure competitive advantage, and drive sustainable growth.

Our company values

Open by default

  • We believe in open, free-flowing channels of information for all.

  • We’re open-minded and embrace change.

  • We communicate openly and honestly with each other.

  • We encourage a flexible and diverse work environment.

Team Players

  • We help each other to do great work.

  • We treat each other with respect, even when we disagree.

  • We work in teams with humility and ambition.

  • We rely on each other and create the best solutions together.

Customer Centric

  • We put our customers at the heart of everything we do.

  • We listen to and understand our customers’ needs.

  • We go above and beyond to provide the best solutions for our customers.

  • Making our customers successful is everyone’s job.

Game Changers

  • We deliver innovation that matters.

  • We never settle; there is always an opportunity to do better.

  • We challenge our ideas of what’s possible.

  • We’re not afraid to take risks and fail.

Employee benefits

Open source

Work with cutting-edge open-source technologies and tools.

Company events

Attend conferences and meet with team members from across the globe.

Generous vacation

We offer a generous 35 days of PTO to encourage work-life balance.
