Entertainment Identifier Registry
Los Angeles, California, United States
Richard Kroon
Director of Technical Operations
3
Preferred learners
  • Anywhere
  • Academic experience
Categories
Computer science & IT, Information technology, Software development, Databases, Media
Project scope
What is the main goal for this project?

Our organization provides identification services for the global media and entertainment industry. (EIDR IDs are to movies and TV as ISBNs are to books, VINs are to cars, and UPC/EAN codes are to consumer products.) Descriptive metadata records for media programs are submitted in a wide variety of languages and scripts. We need to identify the languages used and produce normalized (transliterated and translated) versions for display and de-duplication.


We need to develop a service that will read each record in our database and:

  • Determine which language is used for each field
  • If the script is not in the Latin-1 character set, then:
      • Transliterate selected fields to Latin-1 (Romanize)
      • Translate other fields to English
  • Store the updated records in our database
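
The per-record steps above can be sketched roughly as follows. This is a minimal illustration, not the service design: the field names, the `normalize_record` helper, and the three callables (`detect_language`, `transliterate`, `translate`) are hypothetical stand-ins for whichever commercial tools the team selects, and the Latin-1 test simply checks whether the text can be encoded as ISO-8859-1.

```python
from typing import Callable, Dict, Set


def is_latin1(text: str) -> bool:
    """Return True if every character can be encoded as ISO-8859-1 (Latin-1)."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


def normalize_record(
    record: Dict[str, str],
    transliterate_fields: Set[str],
    detect_language: Callable[[str], str],
    transliterate: Callable[[str], str],
    translate: Callable[[str], str],
) -> Dict[str, Dict[str, str]]:
    """Return, per field, the detected language and a normalized value.

    All three callables are assumptions: placeholders for real
    language-identification, Romanization, and translation tools.
    """
    result: Dict[str, Dict[str, str]] = {}
    for field, value in record.items():
        language = detect_language(value)
        if is_latin1(value):
            normalized = value  # already in Latin-1; keep as submitted
        elif field in transliterate_fields:
            normalized = transliterate(value)  # Romanize selected fields
        else:
            normalized = translate(value)  # translate the rest to English
        result[field] = {"language": language, "value": normalized}
    return result


if __name__ == "__main__":
    # Hypothetical record and stub tools, for illustration only.
    sample = {"title": "Amélie", "alt_title": "Москва"}
    print(
        normalize_record(
            sample,
            transliterate_fields={"alt_title"},
            detect_language=lambda text: "und",  # "und" = undetermined
            transliterate=lambda text: "Moskva",
            translate=lambda text: text,
        )
    )
```

Passing the tools in as callables keeps the routing logic independent of any one vendor, which suits the testbed goal; storing the result back to the database is left out because it depends on the registry's API.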


This will involve several different steps for the students, including:

  • Familiarizing themselves with commercial language translation and transliteration tools
  • Familiarizing themselves with our XML-based API
  • Developing an architecture that will review and update our existing records and act as a testbed for future language tool development
  • Selecting the best technologies and tools for this project, given our existing technology stack and available resources
  • Building, testing, tuning, and deploying the service
  • Developing comprehensive documentation describing the service for operations and ongoing maintenance
What tasks will learners need to complete to achieve the project goal?

By the end of the project, students should demonstrate:

  • An improved understanding of linguistic terms and challenges
  • Familiarity with commercial language tools
  • Familiarity with common software project tools, including GitHub and Jira
  • Familiarity with the Scrum and Kanban project frameworks
  • An improved understanding of the global media market

Final deliverables should include:

  • A working language identification, transliteration, and translation service
  • A technical presentation covering the alternatives explored, the decisions made, and the final product produced
  • A non-technical presentation delivered to our member companies introducing the new service
How will you support learners in completing the project?

Students will become part of our software development team. They will receive direct supervision and mentoring from our Technology Director and will have access to our professional developers for technical advice and assistance. The project will be broken down into a series of smaller deliverables with ongoing review and detailed feedback at each stage.

What skills or technologies will help learners to complete the project?

Students can teach themselves what they need to complete this project, but familiarity with the following is beneficial:

  • Agile software development practices
  • A modern, mainstream programming language (C#, Java, JavaScript, Python, TypeScript, etc.)
  • The basic operations of a REST API

Students will be expected to research and learn more about the above as the project goes along.

About the company
  • https://eidr.org
  • 2 - 10 employees
  • Entertainment, Media & production, Non-profit, philanthropic & civil society, Technology

The Entertainment Identifier Registry Association (EIDR) is a nonprofit industry association that supplies the global entertainment supply chain with universal identifiers for a broad array of audiovisual objects. EIDR IDs are to movies, TV, games, and podcasts as ISBNs are to books, VINs are to cars, or UPC/EAN codes are to consumer products. The EIDR registry is, and always has been, read-for-free, though we do restrict write access to authorized parties only. Our identifiers are critical to applications throughout the media and entertainment industry, from production to public presentation, by archives, and in academic citation. Our Board includes Amazon, Google, Gracenote, NBCUniversal, Paramount, Sony Pictures, Disney, Warner Bros, and Xperi.