Published in: Language Resources and Evaluation

A lexicon for Vietnamese language processing

Abstract

Only very recently have Vietnamese researchers begun to be involved in the domain of Natural Language Processing (NLP). As there is neither any published work in formal linguistics nor any recognizable standard for Vietnamese word definition and word categories, the fundamental tasks of automatic Vietnamese language processing, such as part-of-speech tagging and parsing, are very difficult for computer scientists. The fact that each research team has to build all necessary linguistic resources from scratch is a real obstacle to the development of Vietnamese language processing. The aim of our projects is thus to build a common linguistic database that is freely and easily exploitable for the automatic processing of Vietnamese. In this paper, we present our work on creating a Vietnamese lexicon for NLP applications. We emphasize the standardization aspect of the lexicon representation. In particular, we propose an extensible set of Vietnamese syntactic descriptions that can be used for tagset definition and morphosyntactic analysis. These descriptors are established in such a way as to serve as a proposed reference set for Vietnamese in the context of ISO subcommittee TC 37/SC 4 (Language Resource Management).