Cross-lingual document similarity



Abstract

In this paper we investigated how to compute similarities between documents written in different languages based on a weakly aligned multi-lingual collection of documents. Computing the cross-lingual similarities is based on an aligned set of basis vectors obtained by either latent semantic indexing or the k-means algorithm on an aligned multi-lingual corpus. We evaluated the methods on two data sets: Wikipedia and European Parliament Proceedings Parallel Corpus.
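The abstract gives only the outline of the approach, so the following is a minimal sketch of the latent semantic indexing variant under stated assumptions, not the authors' implementation. It assumes term-document matrices whose columns are aligned across the two languages, takes a truncated SVD of the stacked matrix to obtain aligned basis vectors, and maps each document into the shared space with the pseudo-inverse of its language's basis. The function names, the toy vocabulary, and the pseudo-inverse projection are all illustrative choices; the k-means variant is not shown.

```python
import numpy as np


def aligned_lsi_bases(X_a, X_b, k):
    """Return one basis matrix per language from an SVD of the stacked corpus.

    X_a has shape (terms_a, n_docs) and X_b has shape (terms_b, n_docs).
    Column j of both matrices describes the same aligned document.
    """
    stacked = np.vstack([X_a, X_b])
    U, _, _ = np.linalg.svd(stacked, full_matrices=False)
    U_k = U[:, :k]  # top-k left singular vectors of the joint corpus
    n_a = X_a.shape[0]
    return U_k[:n_a], U_k[n_a:]


def cross_lingual_similarity(x_a, x_b, U_a, U_b):
    """Cosine similarity of two documents after mapping both into the shared space."""
    # Assumption: project with the pseudo-inverse of each language's basis.
    z_a = np.linalg.pinv(U_a) @ x_a
    z_b = np.linalg.pinv(U_b) @ x_b
    denom = np.linalg.norm(z_a) * np.linalg.norm(z_b)
    return float(z_a @ z_b / denom) if denom > 0 else 0.0


# Hypothetical toy corpus: four aligned documents, two about animals, two about finance.
# English rows: dog, cat, bank, money. German rows: hund, katze, bank, geld.
X_en = np.array([[2, 1, 0, 0],
                 [1, 2, 0, 0],
                 [0, 0, 2, 1],
                 [0, 0, 1, 2]], dtype=float)
X_de = X_en.copy()  # same counts, different vocabulary

U_en, U_de = aligned_lsi_bases(X_en, X_de, k=2)

en_dog = np.array([1.0, 0.0, 0.0, 0.0])    # English document mentioning "dog"
de_katze = np.array([0.0, 1.0, 0.0, 0.0])  # German document mentioning "katze"
de_geld = np.array([0.0, 0.0, 0.0, 1.0])   # German document mentioning "geld"

sim_same = cross_lingual_similarity(en_dog, de_katze, U_en, U_de)
sim_diff = cross_lingual_similarity(en_dog, de_geld, U_en, U_de)
print(sim_same, sim_diff)
```

In this toy setup the English and German documents share no terms, yet the animal documents score far higher against each other than against the finance document, because the aligned columns tie the two vocabularies to the same latent directions.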


Bibliographic details

Source: International Conference on Information Technology Interfaces, 2012, 6 pages
Authors: Muhic Andrej; Rupnik Jan; Skraba Primoz
Original format: PDF
Chinese Library Classification: G202-53

Similar literature


1. Cross-lingual document similarity estimation and dictionary generation with comparable corpora [J]. Stajner Tadej, Mladenic Dunja. Knowledge and Information Systems, 2019, No. 3
2. News Across Languages - Cross-Lingual Document Similarity and Event Tracking [J]. Fortuna Blaz, Grobelnik Marko, Leban Gregor. The Journal of Artificial Intelligence Research, 2016, No. 10
3. News Across Languages - Cross-Lingual Document Similarity and Event Tracking [J]. Rupnik Jan, Muhic Andrej, Leban Gregor. The Journal of Artificial Intelligence Research, 2016
4. Interesting cross-border news discovery using cross-lingual article linking and document similarity [C]. Boshko Koloski, Elaine Zosa, Timen Stepisnik-Perdih. EACL Hackashop on News Media Content Analysis and Automated Report Generation Conference, 2021
5. Semantic Similarity Detection in Natural Language Documents [D]. Zhao, Lianyu. 2012
6. A Cross-Lingual Similarity Measure for Detecting Biomedical Term Translations [O]. Danushka Bollegala, Georgios Kontonatsios, Sophia Ananiadou
7. Exploiting Cross-Lingual Subword Similarities in Low-Resource Document Classification [O]. Mozhi Zhang, Yoshinari Fujinuma, Jordan Boyd-Graber. 2020
