
Project scope
Categories
Information technology, Software development, Machine learning
Skills
Russian language, JavaScript (programming language), lip sync, Python (programming language), application programming interface (API), Korean language, German language, Kaldi, front-end design, French language

We're looking for an intern who understands JavaScript to update our UNOMi 3D Lip-Sync plugin. This is a great opportunity to learn how a startup works and gain experience.
PROJECT OVERVIEW
Languages to Add
- Mandarin, Japanese, Hindi, Arabic, Spanish, Portuguese, German, French, Korean, Russian
Tech Stack
- ASR & Alignment: Kaldi
- Backend: Python + Node.js/JavaScript
- Frontend: Angular
- Phoneme Processing: G2P tools (e.g., Espeak, Phonetisaurus), IPA-based mappings
PHASE 1: Planning & Resources (Week 1)
• Language Resource Audit: Identify available Kaldi recipes and G2P tools for each target language
• Model Selection: Choose between pre-trained or training new Kaldi models
• Viseme Set Design: Create or map a universal viseme set for multilingual phoneme integration
• Timeline Finalization: Confirm team bandwidth and resource allocation
PHASE 2: Model Setup & G2P (Weeks 2–4)
• Setup Kaldi Environments: Spin up Docker or server instances per language
• Integrate Pretrained Models or Begin Training: Load pretrained models (e.g., CommonVoice, Aishell, GlobalPhone) or begin training with transcribed data
• G2P Mapping: Integrate Espeak/Phonetisaurus or language-specific G2P models (see the sketch after this phase's task list)
• Test G2P Conversions: Validate phoneme output against expected transcription with native speakers, where possible
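As a starting point for the G2P step, here is a minimal sketch of pulling IPA phoneme strings from espeak-ng on the command line. It assumes espeak-ng is installed and on the PATH; the voice codes are illustrative and should be checked against `espeak-ng --voices`, and Phonetisaurus or a language-specific G2P model could sit behind the same function.

```python
import subprocess

# Illustrative mapping from our language codes to espeak-ng voices.
# Assumption: actual voice names must be verified with `espeak-ng --voices`.
ESPEAK_VOICES = {"fr": "fr", "de": "de", "es": "es", "ru": "ru", "ko": "ko"}

def g2p_espeak(text: str, lang: str) -> str:
    """Return an IPA phoneme string for `text` using espeak-ng (-q = no audio output)."""
    voice = ESPEAK_VOICES[lang]
    result = subprocess.run(
        ["espeak-ng", "-q", "--ipa", "-v", voice, text],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

if __name__ == "__main__":
    # Quick sanity check; outputs should still be reviewed with native speakers.
    print(g2p_espeak("bonjour tout le monde", "fr"))
    print(g2p_espeak("guten Morgen", "de"))
```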
PHASE 3: Phoneme Alignment & Backend Processing (Weeks 5–6)
• Forced Alignment: Use Kaldi to align audio + transcript to phoneme timing for each language
• Standardize Output: Normalize output format (JSON, SRT, XML) for lip-sync engine compatibility (see the JSON sketch after this phase's task list)
• Python Middleware Update: Update backend to handle phoneme inputs from each language and pass to animation engine
• JavaScript Integration: Update the JS logic for language detection and ASR model switching (a backend-side registry sketch follows below)
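To make the "Standardize Output" and middleware tasks concrete, here is a minimal sketch that converts a phone-level CTM file (the format emitted by Kaldi's ali-to-phones --ctm-output, assuming phone IDs have already been mapped to labels via phones.txt) into a JSON structure the animation engine could consume. The field names (phone, start, end) are our own convention, not a Kaldi standard, and the file path in the example is hypothetical.

```python
import json

def ctm_to_json(ctm_path: str, lang: str) -> str:
    """Convert a phone-level CTM (utt channel start dur phone) to normalized JSON."""
    segments = []
    with open(ctm_path, encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if len(parts) < 5:
                continue  # skip malformed lines
            utt, _channel, start, dur, phone = parts[:5]
            segments.append({
                "utterance": utt,
                "phone": phone,
                "start": round(float(start), 3),
                "end": round(float(start) + float(dur), 3),
            })
    payload = {"language": lang, "phones": segments}
    return json.dumps(payload, ensure_ascii=False, indent=2)

if __name__ == "__main__":
    print(ctm_to_json("align/fr/phones.ctm", lang="fr"))  # hypothetical path
```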
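The JavaScript layer handles language detection and selection in the UI; on the backend, the middleware needs a matching lookup to pick which Kaldi model to decode with. Below is a minimal Python sketch of such a registry; the model directories and the resolve_model helper are hypothetical and depend on how the Phase 2 models are packaged.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AsrModel:
    lang: str
    model_dir: str      # hypothetical path to a Kaldi model directory
    sample_rate: int

# Illustrative registry; real entries depend on the models chosen in Phase 2.
MODEL_REGISTRY = {
    "es": AsrModel("es", "models/kaldi/es_commonvoice", 16000),
    "zh": AsrModel("zh", "models/kaldi/zh_aishell", 16000),
    "ar": AsrModel("ar", "models/kaldi/ar_mgb", 16000),
}

def resolve_model(lang_code: str) -> AsrModel:
    """Pick the ASR model for a language code; fail loudly for unsupported ones."""
    try:
        return MODEL_REGISTRY[lang_code]
    except KeyError:
        raise ValueError(f"No ASR model registered for language '{lang_code}'")
```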
PHASE 4: Frontend & Viseme Mapping (Weeks 7–8)
• Viseme Map Expansion: Add mappings from phonemes (IPA or language-specific) to viseme shapes (see the sketch after this phase's task list)
• Angular Component Update: Expand Angular UI to support language selection or detection
• Integrate Viseme Timings: Sync timing data with animation engine (possibly via WebSocket or event stream)
• User Testing (Internal): Run internal tests on lip sync output per language
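For the viseme map expansion, here is a minimal Python sketch of the idea: collapse IPA phonemes into a small set of mouth shapes and carry the alignment timings through, merging consecutive segments that share a viseme. The viseme names and phoneme groupings below are illustrative placeholders, not UNOMi's actual rig targets, and would need per-language review.

```python
# Illustrative IPA-to-viseme groupings; the real map should be reviewed per
# language, since the same symbol can be realized differently across languages.
VISEME_MAP = {
    "viseme_MBP": {"m", "b", "p"},          # closed lips
    "viseme_FV":  {"f", "v"},               # lip-teeth contact
    "viseme_AA":  {"a", "ɑ", "æ"},          # open jaw
    "viseme_O":   {"o", "ɔ", "u", "w"},     # rounded lips
    "viseme_EE":  {"i", "e", "ɛ", "j"},     # spread lips
}

# Invert for fast phoneme lookup.
PHONE_TO_VISEME = {p: v for v, phones in VISEME_MAP.items() for p in phones}

def phones_to_visemes(phones, rest_viseme="viseme_REST"):
    """Map aligned phones [{'phone', 'start', 'end'}, ...] to viseme keyframes."""
    keyframes = []
    for seg in phones:
        viseme = PHONE_TO_VISEME.get(seg["phone"], rest_viseme)
        # Merge consecutive segments that land on the same viseme.
        if keyframes and keyframes[-1]["viseme"] == viseme:
            keyframes[-1]["end"] = seg["end"]
        else:
            keyframes.append({"viseme": viseme, "start": seg["start"], "end": seg["end"]})
    return keyframes
```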
PHASE 5: QA, Optimization & Launch (Weeks 9–10)
• QA Pass (External Review): Collect feedback from native speakers for each language
• Optimize Latency: Tune backend processing (Kaldi decode/align) for sub-5-second performance where possible (a timing sketch follows this list)
• Final Bug Fixes: Polish frontend and backend features based on QA feedback
• Deploy: Roll out support to production and announce multilingual support
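To know whether the sub-5-second target is being met, each backend stage needs to be timed individually before it can be tuned. A small sketch of how that could be instrumented follows; the stage names and the commented pipeline calls are hypothetical.

```python
import time
from contextlib import contextmanager

STAGE_TIMINGS = {}

@contextmanager
def timed_stage(name: str):
    """Record wall-clock time per pipeline stage (e.g., decode, align, viseme map)."""
    start = time.perf_counter()
    try:
        yield
    finally:
        STAGE_TIMINGS[name] = time.perf_counter() - start

# Example usage around hypothetical pipeline calls:
# with timed_stage("kaldi_decode"):
#     run_decode(audio_path)
# with timed_stage("forced_align"):
#     run_align(audio_path, transcript)
# print(STAGE_TIMINGS, "total:", sum(STAGE_TIMINGS.values()))
```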
DELIVERABLES BY PHASE:
• Phase 1: Timeline, toolchain finalized, phoneme–viseme map draft
• Phase 2: All G2P modules ready, ASR models integrated
• Phase 3: JSON-formatted phoneme timings output from Kaldi
• Phase 4: Frontend updated, 3D characters lip-syncing to new languages
• Phase 5: Fully functioning, optimized, and tested UNOMi with multilingual support
ADDITIONAL NOTES
- Pretrained Kaldi models:
  - Spanish, French, German: CommonVoice
  - Mandarin: AISHELL, THCHS-30
  - Arabic: MGB Challenge
  - Russian: VoxForge or RUSLANA
  - Hindi: MUCS 2021 Challenge dataset
  - Japanese/Korean: You may need to use Mozilla's DeepSpeech, OpenAI's Whisper, or train your own models
- Alternative ASR/G2P fallback: If Kaldi support is weak for some languages, Whisper or Vosk may fill the gap (see the sketch below).
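If Whisper ends up as the fallback for Japanese or Korean, its word-level timestamps can stand in for Kaldi word alignment as a first pass (a G2P step would still be needed to get phoneme-level timing). Below is a minimal sketch using the openai-whisper package; the model size and word_timestamps support depend on the installed version, so treat it as an assumption to verify.

```python
import whisper  # openai-whisper package (assumption: installed via `pip install openai-whisper`)

def whisper_word_timings(audio_path: str, lang: str = "ja"):
    """Transcribe with Whisper and return word-level timings as a fallback
    when no Kaldi recipe is available for the language."""
    model = whisper.load_model("small")  # model size is an assumption; tune for latency
    result = model.transcribe(audio_path, language=lang, word_timestamps=True)
    words = []
    for segment in result["segments"]:
        for w in segment.get("words", []):
            words.append({"word": w["word"].strip(), "start": w["start"], "end": w["end"]})
    return words
```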
MENTORSHIP & SUPPORT
• Providing specialized, in-depth knowledge and general industry insights for a comprehensive understanding.
• Sharing knowledge of the specific technical skills, techniques, and methodologies required for the project.
• Direct involvement in project tasks, offering guidance, and demonstrating techniques.
• Providing access to the necessary tools, software, and resources required for project completion.
• Scheduled check-ins to discuss progress, address challenges, and provide feedback.
About the company
UNOMi is innovative, easy-to-use software for animators. UNOMi reduces the production time and budget for developing content by 30% to 70%. It does this by automatically syncing 2D and 3D mouth poses to the voice-over recordings an artist or animator creates for each character. We understand the pain involved in producing quality animated content, and we've created the perfect tool to help with the process. It normally takes an animator about a day to animate one character talking for 30 seconds, but with UNOMi that can be done in seconds.
UNOMi's top mission is to solve the greatest challenges facing animators today. With the technology available today, there is no reason animators should still struggle to tell their stories.