
Transliteration as Alignment vs. Transliteration as Generation for Crosslingual Information Retrieval

Anil Kumar Singh, Sethuramalingam Subramaniam, Taraka Rama

Language Technologies Research Centre
[anil@research, sethu@research, taraka@students]

Crosslingual Information Retrieval (CLIR) usually requires query translation and, because of named entities in IR queries, query translation in turn requires a good transliteration system when the writing systems differ. Transliteration can be treated as a problem of either generation or alignment. For IR, since a word list can be extracted from the corpus being searched, it should be treated as an alignment problem. Shifting from generation to alignment can lead to higher transliteration accuracy and significant improvements in CLIR results. We achieved an increase (over generation) in CLIR Mean Average Precision of 22.66% for English to Hindi and 29.08% for English to Marathi.
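To make the generation/alignment distinction concrete, here is a minimal sketch of transliteration as alignment: instead of generating a target-script string, it picks the best-matching entry from a word list extracted from the corpus. It is an illustration under stated assumptions, not the authors' system. The romanization table `TOY_ROMANIZATION`, the function names, and the use of `difflib.SequenceMatcher` as the similarity measure are all hypothetical stand-ins chosen for brevity.

```python
from difflib import SequenceMatcher

# Hypothetical toy data: a few corpus words (Devanagari) with hand-written
# romanizations. A real system would derive the word list from the indexed
# corpus and use a proper script-mapping or learned alignment model.
TOY_ROMANIZATION = {
    "दिल्ली": "dilli",
    "मुंबई": "mumbai",
    "चेन्नई": "chennai",
}


def similarity(a: str, b: str) -> float:
    """Stand-in string similarity in [0, 1]; not the measure used in the paper."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def align_transliteration(query_word: str, romanized_corpus: dict) -> tuple:
    """Return the corpus word whose romanization best matches the query word.

    Because every candidate comes from the corpus word list, the result is
    always a form that actually occurs in the collection being searched.
    """
    best_word, best_score = None, 0.0
    for corpus_word, romanized in romanized_corpus.items():
        score = similarity(query_word, romanized)
        if score > best_score:
            best_word, best_score = corpus_word, score
    return best_word, best_score


if __name__ == "__main__":
    word, score = align_transliteration("Delhi", TOY_ROMANIZATION)
    print(word, round(score, 2))
```

A generation approach would instead produce a target-script string character by character, with no guarantee that the output appears in the corpus. The alignment view restricts the search to attested forms, which is the property the abstract credits for the retrieval gains.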

TAL, Volume 51, 2010, No. 2: Multilinguisme et TAL

Last updated: 26 March 2012, author: Editors-in-chief.