Project !Trans (no-translate) aims to develop technology for spotting texts on which even modern machine translation falls short because their terms or domains are so heavily rooted in a region or culture that valid translations hardly exist in other languages.
The project is devoted to the preservation of one of the least-known forms of intangible human heritage: gastronomy. The target users are different kinds of visitors (tourists, exchange students), but it can also benefit other types of users, as long as they are non-native speakers of the source language, Italian.
The practical problem concerns various scenarios in which a non-native speaker (e.g., a tourist or an exchange student) has difficulty understanding a text because of a limited command of a specialised vocabulary (e.g., the terminology of gastronomy).
Often, relying on (machine) translation is not optimal since, for a number of terms, no translation exists at all. In such a scenario, an alternative solution becomes necessary: displaying the definition/description of the term (e.g., a menu entry, such as strozzapreti or quinto quarto), in Italian or another language (English or the user’s native language).
Producing and organising in-domain multilingual corpora: acquisition of comparable in-domain texts in both Italian and English (e.g., gastronomic tradition, restaurant menus). The datasets will be organised with the help of one or more automatic models for the extraction and classification of the texts.
[project GastRoWiki and ongoing internships]
Developing supervised models to assess the (inter-)comprehensibility of a text and the feasibility of producing a correct translation of it.
[continuing the work of Fernicola F.]
Developing supervised models for the retrieval of multilingual definitions that explain complex, specific, or untranslatable terms from the source culture and language.
[continuing the work of Martinelli M.]
We are showcasing an interface prototype that acquires texts and returns an “enriched” version, which includes definitions of the automatically spotted untranslatable terms.
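As a rough illustration of what an “enriched” text could look like, here is a minimal Python sketch. It is not the project's implementation: a small hand-written glossary stands in for the supervised term-spotting and definition-retrieval models, and the names `GLOSSARY` and `enrich` as well as the bracketed output format are assumptions made for the example.

```python
import re

# Hypothetical glossary used only for illustration. In the project, terms
# would be spotted and defined by supervised models, not a fixed dictionary.
GLOSSARY = {
    "strozzapreti": "a hand-rolled, twisted pasta shape",
    "quinto quarto": "offal; literally 'fifth quarter'",
}


def enrich(text, glossary=GLOSSARY):
    """Return `text` with a bracketed definition after each glossary term."""
    if not glossary:
        return text

    # Try longer terms first so multi-word entries are matched as a whole.
    terms = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )

    def annotate(match):
        term = match.group(0)
        # Glossary keys are lowercase; the original casing is kept in the text.
        return f"{term} [{glossary[term.lower()]}]"

    return pattern.sub(annotate, text)


if __name__ == "__main__":
    print(enrich("Oggi: strozzapreti al ragù e quinto quarto."))
```

A dictionary lookup like this only covers terms listed in advance; the point of the supervised models described above is to handle terms that no fixed list anticipates.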