Data Scientist - Gen AI / QA - Remote

You might know us as Scrapinghub. And now we’re Zyte.

Zyte

Employee count: 51-200

Argentina only

About Us

At Zyte, we eat data for breakfast, and you can eat your breakfast anywhere and work for Zyte. Founded in 2010, we are a globally distributed team of over 250 Zytans working from more than 28 countries, on a mission to enable our customers to extract the data they need to continue to innovate and grow their businesses. We believe that all businesses deserve a smooth pathway to data.

For more than a decade, Zyte has led the way in building powerful, easy-to-use tools to collect, format, and deliver web data quickly, dependably, and at scale. The data we extract helps thousands of organizations make smarter business decisions, secure competitive advantage, and drive sustainable growth. Today, over 3,000 companies and 1 million developers rely on our tools and services to get the data they need from the web.

Data QA is an important function within Zyte. The Data QA team works to ensure that the quality and usability of the data scraped by our web scrapers meet and exceed the expectations of our enterprise clients.

Are you passionate about data and data quality and integrity?

Do you enjoy using Python and AI to analyze and manipulate data, detect data quality issues, and visualize your findings?

Are you highly customer-focused with excellent attention to detail?

Owing to our growing business and the need for ever more sophisticated Data QA, we are looking for a talented Data Scientist to join our team. As a Zyte Engineer, you will work on AI-based data wrangling, data manipulation, and data visualisation techniques and apply them to verify and validate the quality of data extracted from the web.

Requirements

Roles & Responsibilities:

  • Understand customer web scraping and data requirements; map these requirements to custom AI-based data quality validation techniques, with a focus on achieving pre-established degrees of data quality and uncovering data quality issues.
  • Draw conclusions about data quality by producing descriptive and evidence-based statistics, summaries, and visualisations.
  • Supplement existing manual QA and schema validation techniques with AI-based data quality verification.
  • Collaborate with developers to further troubleshoot and pinpoint solutions.
  • Present findings and conclusions to stakeholders at various levels (other members of the QA department, developers, project managers, account managers, customers).
  • Write high-quality, well-structured code that is maintainable and extensible.
  • Manage code using GitHub, Bitbucket, and other version control tools as applicable.

Requirements:

  • Highly proficient in Python and the PyData stack, with a minimum of 3 years' experience (please provide code samples in your application, ideally pertaining to data analysis or Generative AI, via a link to GitHub or another publicly accessible service).
  • BS degree in Computer Science, Engineering, Mathematics, Statistics or equivalent.
  • Up to speed on the latest advances in Generative AI, particularly as they pertain to process automation, web scraping/parsing, and data quality verification.
  • Comfortable with Prompt Engineering and token/cost optimization.
  • Familiar with abstraction layers (MCP, Marvin, LangChain, etc.).
  • Experience coding against the APIs of at least one of the Google, OpenAI, or Anthropic models.
  • Experience in visualising data quality and data quality issues.
  • Ability to work with very large datasets (into the millions of records).
  • Strong knowledge of software QA methodologies, tools, and processes.
  • Excellent level of written and spoken English; confident communicator; able to communicate on both technical and non-technical levels with various stakeholders on all matters of QA.
  • Outstanding attention to detail.

Desired Skills:

  • Prior experience in a Data QA role (where the focus was on verifying data quality, rather than testing application functionality).
  • Familiarity with Jupyter and JupyterLab.
  • Experience building your own dashboards.
  • Experience with Spark, BigQuery, and other big data technologies.
  • Previous remote working experience.

Benefits

As a new Zytan, you will:

  • Become part of a self-motivated, progressive, multi-cultural team.
  • Have the freedom and flexibility to work from where you do your best work.
  • Attend conferences and meet with team members from across the globe.
  • Work with cutting-edge open source technologies and tools.

About the job

Job type: Full Time

Experience level: Mid-level

Hiring timezones: Argentina +/- 0 hours

About Zyte

You might know us as Scrapinghub. And now we’re Zyte. We’re game changers in web data extraction, obsessed with removing barriers so our customers can access valuable data. Quickly and easily, whenever and however they need it.

We’ve always been passionate about data and what it can do. And we’re here to connect our customers with clean, actionable web data. At any scale. Without coding hassles, bans, or broken spiders.

At Zyte, we believe that businesses deserve a smooth pathway to data. For more than a decade, we’ve led the way in building powerful, easy-to-use ways to collect, format, and deliver web data quickly, dependably, and at scale. And today, the data we extract helps thousands of organizations make smarter business decisions, secure competitive advantage, and drive sustainable growth.

Our company values

Open by default

  • We believe in open, free-flowing channels of information for all.

  • We’re open-minded and embrace change.

  • We communicate openly and honestly with each other.

  • We encourage a flexible and diverse work environment.

Team Players

  • We help each other to do great work.

  • We treat each other with respect, even when we disagree.

  • We work in teams with humility and ambition.

  • We rely on each other and create the best solutions together.

Customer Centric

  • We put our customers at the heart of everything we do.

  • We listen to and understand our customers’ needs.

  • We go above & beyond to provide the best solutions for our customers.

  • Making our customers successful is everyone’s job.

Game Changers

  • We deliver innovation that matters.

  • We never settle; there is always an opportunity to do better.

  • We challenge our ideas of what’s possible.

  • We’re not afraid to take risks and fail.

Employee benefits

Generous vacation

We offer a generous 35 days of PTO to encourage work-life balance.
