Building Specialized Multilingual Lexical Graphs Using Community Resources

Abstract

We describe methods for compiling domain-dedicated multilingual terminological data from various resources. We focus on online community users as the main source of data. Our approach therefore depends both on acquiring contributions from volunteers (the explicit approach) and on analyzing users' behavior to extract interesting patterns and facts (the implicit approach). As a generic repository that can handle the collected multilingual terminological data, we describe the concept of dedicated Multilingual Preterminological Graphs (MPGs), along with some automatic approaches for constructing them by analyzing the behavior of online community users. A Multilingual Preterminological Graph is a special lexical resource that contains a massive number of terms related to a specialized domain. We call it preterminological because it is raw material that can be used to build a standardized terminological repository. Building such a graph with traditional approaches is difficult, as it requires huge effort from domain specialists and terminologists. In our approach, we build the graph by analyzing the access log files of the community's website, finding the important terms that have been used to search that website and the associations between them. We aim to make this graph a seed repository to which multilingual volunteers can contribute. We are experimenting with this approach on the Digital Silk Road (DSR) project. We have used its access log files since the project began in 2003 and obtained an initial graph of around 116,000 terms. As an application, we used this graph to obtain a preterminological multilingual database that serves a CLIR (cross-language information retrieval) system for the DSR project.
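The abstract does not give the details of the log analysis, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes a Common Log Format style access log, a hypothetical search parameter named `q`, and it treats two terms as associated when the same client searched for both. The names `LOG_LINE`, `extract_search_term`, and `build_graph` are illustrative.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations
from urllib.parse import urlsplit, parse_qs

# Assumed log layout: '<client> - - [<time>] "GET <url> HTTP/1.1" ...'
LOG_LINE = re.compile(r'^(?P<client>\S+) .*?"GET (?P<url>\S+) HTTP/[\d.]+"')


def extract_search_term(url, param="q"):
    """Return the normalized search term carried by `url`, or None.

    `param` is a hypothetical query-parameter name; a real site may differ.
    """
    values = parse_qs(urlsplit(url).query).get(param)
    if not values:
        return None
    term = values[0].strip().lower()
    return term or None


def build_graph(log_lines, param="q"):
    """Build a term graph from access-log lines.

    Returns (node_weight, edge_weight):
      node_weight[term]   = number of searches for that term
      edge_weight[(a, b)] = number of clients that searched for both a and b
    Grouping by client is an assumed, crude proxy for a user session.
    """
    terms_by_client = defaultdict(list)
    node_weight = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # not a GET request in the assumed format
        term = extract_search_term(match.group("url"), param)
        if term is None:
            continue  # request carries no search term
        node_weight[term] += 1
        terms_by_client[match.group("client")].append(term)

    edge_weight = Counter()
    for terms in terms_by_client.values():
        # Sort so each undirected edge has a single canonical key.
        for a, b in combinations(sorted(set(terms)), 2):
            edge_weight[(a, b)] += 1
    return node_weight, edge_weight
```

Calling `build_graph(open("access.log", encoding="utf-8"))` on a log in the assumed format would return the two counters. Because `parse_qs` decodes percent-encoded UTF-8, search terms in different scripts end up as nodes of the same graph, which is the property a multilingual seed repository needs. Translation links contributed by volunteers would be a separate edge type that this sketch does not model.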

Bibliographic record

Source: International Workshop on Resource Discovery, 2010, 16 pages
Authors: Mohammad Daoud; Christian Boitet; Kyo Kageura; Asanobu Kitamoto; Mathieu Mangeot; Daoud Daoud
Original format: PDF
CLC classification: TP3-53

