A measure of similarity is required to find and compare cross-lingual articles concerning a specific topic. This measure can be based on bilingual dictionaries or based on numerical methods such as Latent Semantic Indexing (LSI). In this paper, we use LSI in two ways to retrieve Arabic-English comparable articles. The first way is monolingual: the English article is translated into Arabic and then mapped into the Arabic LSI space; the second way is cross-lingual: Arabic and English documents are mapped into Arabic-English LSI space. Then we compare LSI approaches to the dictionary-based approach on several English-Arabic parallel and comparable corpora. Results indicate that the performance of our cross-lingual LSI approach is competitive to the monolingual approach and even better for some corpora. Moreover, both LSI approaches outperform the dictionary approach.

Cross-lingual semantic similarity measure for comparable articles

Saad M.
Membro del Collaboration Group
;
2014-01-01

Abstract

A measure of similarity is required to find and compare cross-lingual articles concerning a specific topic. This measure can be based on bilingual dictionaries or based on numerical methods such as Latent Semantic Indexing (LSI). In this paper, we use LSI in two ways to retrieve Arabic-English comparable articles. The first way is monolingual: the English article is translated into Arabic and then mapped into the Arabic LSI space; the second way is cross-lingual: Arabic and English documents are mapped into Arabic-English LSI space. Then we compare LSI approaches to the dictionary-based approach on several English-Arabic parallel and comparable corpora. Results indicate that the performance of our cross-lingual LSI approach is competitive to the monolingual approach and even better for some corpora. Moreover, both LSI approaches outperform the dictionary approach.
2014
9783319108872
9783319108889
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11587/561297
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 12
  • ???jsp.display-item.citation.isi??? ND
social impact