Middle Speech Applied Scientist - Multilingual Voice, Voice Cloning, Locale Expa

Omilia is a Conversational AI company that provides an enterprise-grade cloud platform for automated voice and chat customer service solutions, aiming to improve customer experience and reduce operational costs.

Omilia

Employee count: 201-500

Greece only

As an R&D TTS Speech Applied Scientist, you will be a core member of our Speech Team, dedicated to expanding our high-quality TTS services into new languages. Your focus will be on the end-to-end process of localizing and deploying synthetic voices, from initial linguistic analysis and data preparation to final model deployment.

Responsibilities

The primary focus is the research, development, and implementation of robust TTS and Voice Cloning systems for global expansion into new and diverse locales, covering the entire speech synthesis pipeline.

1. End-to-End Pipeline Ownership

This role requires end-to-end involvement in launching TTS for new languages, ensuring quality and scalability across all stages.

  • Language Analysis: Conduct thorough language analysis and phonetic and phonological studies, and define the phoneme set for new target languages. Design and implement the lexicon and G2P (Grapheme-to-Phoneme) development process to ensure accurate pronunciation modeling.
  • Voice Creation & Data Curation: Actively participate in the initial stages of voice creation projects:
      • Assist in voice talent selection to meet aesthetic and linguistic requirements.
      • Collaborate on corpus design, defining sentence structure and coverage, and overseeing the corpus creation, recording, and quality review process.
      • Cooperate on voice style design to define the desired emotional and speaking characteristics for the synthetic voice.
  • TTS Model Development:
      • Lead TTS model training/evaluation for multiple languages, ensuring high-quality synthesis and speaker consistency.
      • Adapt or extend current TTS Voices data flow/pipeline for new languages and actively contribute to developing/training new models.
  • Quality Assurance: Conduct rigorous listening tests (e.g., Mean Opinion Score (MOS) evaluation) and error analysis to drive model improvements, collaborating closely with internal listening testers.

2. Voice Cloning & Low-Resource Research

  • Conduct applied research into low-resource voice adaptation and few-shot voice cloning techniques to rapidly deploy high-quality new voices across various markets.

3. Production & Scale

  • Work closely with MLOps and engineering teams to transition successful models into a low-latency, high-scale production environment for global deployment.
  • Prototype new research ideas and optimize existing model architectures for real-world performance.

4. Agile Methodologies & Collaboration

  • Actively participate in Agile software development processes, including sprint planning, daily stand-ups, and retrospectives to ensure timely and high-quality deliverables.
  • Work closely with cross-functional teams, including product managers, designers, and other engineers, to gather requirements and ensure alignment on project goals.
  • Participate in project planning, including research and development.
  • Contribute to the backlog of tasks with improvements and suggestions.
  • Implement Proof of Concepts (PoC) to introduce new solutions and ideas to the team.
  • Effectively manage time and meet deadlines.

5. Contribute actively and effectively as an integrated team member

  • Meet regularly with the line manager to review progress.
  • Manage issue resolution and escalate critical issues when needed.
  • Work effectively with other teams, units, and departments.
  • Manage issues with clarity and ensure effective information flow and team working.
  • Support the organization's other priority activities when necessary.
  • Act as an Omilia ambassador.

Requirements

  • MSc degree in Computer Science, Engineering, or a related subject.
  • 2+ years of experience in speech synthesis development roles.
  • Ph.D. in a relevant field is a plus but not required.
  • Proven experience in developing AI-driven applications, particularly in speech synthesis, voice cloning, or related fields.
  • Strong understanding of state-of-the-art voice LLM techniques.
  • Proficiency in Python and deep learning frameworks like PyTorch or TensorFlow.
  • Hands-on experience with TTS frameworks (e.g., FastPitch, VITS, StyleTTS, StyleTTS2) and neural vocoders (e.g., HiFi-GAN, WaveGlow, Vocos).
  • Hands-on experience with LLMs, Diffusion models and Neural Audio Codecs.
  • Familiarity with zero-shot synthesis approaches and multi-speaker TTS systems.
  • Self-motivated and driven to create extraordinary things.
  • Ability to work under pressure and on strict deadlines.
  • Continuous innovation mindset.
  • Excellent written and oral communication skills in English.
  • Effective time management skills and the ability to meet deadlines.

Nice to have

  • Experience with AWS cloud platform for scalable model deployment and monitoring.
  • Experience with NVIDIA Triton Inference Server.
  • Experience with MLOps practices.

Benefits

  • Fixed compensation;
  • Long-term employment with vacation counted in working days;
  • Professional development (courses, training, etc.);
  • Being part of successful cutting-edge technology products that are making a global impact in the service industry;
  • Proficient and fun-to-work-with colleagues;
  • Apple gear.

Omilia is proud to be an equal opportunity employer and is dedicated to fostering a diverse and inclusive workplace. We believe that embracing diversity in all its forms enriches our workplace and drives our collective success. We are committed to creating an environment where everyone feels welcomed, valued, and empowered to contribute their unique perspectives. All eligible candidates will be given consideration for employment without regard to factors such as race, color, religion, gender, gender identity or expression, sexual orientation, national origin, heredity, disability, age, or veteran status.

About the job

Job type: Full Time

Experience level: Senior

Hiring timezones: Greece +/- 0 hours

About Omilia

Omilia is a Conversational AI pioneer, dedicated to revolutionizing how customers interact with enterprises. Many customers experience frustration with traditional automated systems, like complex IVR menus, that fail to understand their needs or provide efficient solutions. This is why Omilia developed its enterprise-grade Omilia Cloud Platform (OCP). Our platform empowers businesses to deploy advanced voice and chat AI assistants that engage in natural, end-to-end conversations, making customer service more intuitive and effective. We understand that businesses need to cut costs, protect their customers, and ultimately, delight them with superior service. Our solutions are designed to deliver rapid ROI by automating self-service, which frees up human agents to concentrate on high-value, complex interactions. Furthermore, Omilia incorporates robust contact center security, including voice biometric verification and multi-layered anti-fraud mechanisms, to safeguard customer data and ensure regulatory compliance.

Our customers span various industries, including finance, insurance, retail, utilities, automotive, travel, hospitality, and healthcare, all facing the common challenge of meeting ever-increasing customer expectations for 24/7, personalized service. Omilia addresses these needs by providing a suite of AI-driven tools. This includes Conversational Voice & Chat, Contact Centre Security, Conversational Insights for data analytics, AI Agent Assist for real-time support to human agents, and Workforce AI for call quality management. We are committed to helping enterprises transform their customer care by providing technology that understands not only what customers are saying but also the intent behind their words. This deep understanding allows for higher task completion rates and a significant increase in self-service containment. Omilia started in a small garage in 2002 with a vision to reinvent customer service, and today, we are proud to serve billions of conversations in numerous languages across multiple countries, consistently recognized by industry leaders like Gartner and IDC for our innovative and impactful solutions.