Download | - View accepted manuscript: Filtering and routing multilingual documents for translation (PDF, 554 KiB)
|
---|
DOI | Resolve DOI: https://doi.org/10.1109/CISDA.2012.6291536 |
---|
Author | Carpuat, Marine1; Goutte, Cyril1; Isabelle, Pierre1 |
---|
Affiliation | - National Research Council Canada. Information and Communication Technologies
|
---|
Format | Text, Article |
---|
Conference | 2012 IEEE Symposium on Computational Intelligence for Security and Defence Applications (CISDA), July 11-13, 2012, Ottawa, Ontario, Canada |
---|
Abstract | Translation is a key capability to access relevant information expressed in various languages on social media. Unfortunately, systematically translating all content far exceeds the capacity of most organizations. Computer-aided translation (CAT) tools can significantly increase the productivity of translators, but cannot ultimately cope with the overwhelming amount of content to translate. In this contribution, we describe and experiment with an approach where we use the structure in a corpus to route the content to the proper workflow, including translators, CAT tools or purely automatic approaches. We show that linguistically motivated structure such as document genre can help decide on the proper translation workflow. However, automatically discovered structure has an effect that is at least as important and allows us to define groups of documents that may be translated automatically with reasonable output quality. This suggests that computational intelligence models that can efficiently organize document collections will provide increased capability to access textual content from various target languages. |
---|
Publication date | 2012-07-13 |
---|
In | |
---|
Language | English |
---|
Peer reviewed | Yes |
---|
NPARC number | 20794315 |
---|
Record identifier | 12610fd4-0110-475e-bc5c-0dfc0481f7de |
---|
Record created | 2012-10-12 |
---|
Record modified | 2020-04-21 |
---|