Journal: International Journal of Computer Processing of Oriental Languages

Developing a Core Vocabulary for a Mandarin Chinese AAC System Using Word Frequency Data



Abstract

Creating a generative language system in augmentative and alternative communication (AAC) for both literate and non-literate users requires three fundamental design steps: analyzing the grammar of the language; developing an efficient, culturally relevant pictorial representation system for encoding the language; and selecting a list of target words to encode. This paper will focus on the problem of selecting vocabulary for the AAC system. This is an important problem because the selection of the encoded vocabulary will ultimately determine the language coverage of the AAC system. The vocabulary list should provide coverage on three fronts: it should contain the most frequently used vocabulary of the spoken language; it should allow description of all the universal, fundamental semantic concepts; and it should meet users' expectations of what they should be able to say on a daily basis, in any given situation. Mandarin Chinese, as a relatively new target language for AAC, presents some unique challenges to the vocabulary designers. For example, the conceptual differences between the Chinese word and character can be difficult to pin down; this highlights the importance of a consistent and scientific approach to vocabulary selection. This paper describes the research process used in the creation of a core vocabulary list for a new icon-encoded Mandarin Chinese AAC system, starting with a basic corpus-derived word-frequency list, and continuing with the modification and supplementation of this list with other linguistic and anecdotal knowledge.
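The workflow described in the abstract (start from a corpus-derived word-frequency list, then modify and supplement it) can be illustrated with a minimal sketch. This is not the authors' procedure: the function names, the toy corpus, the `top_n` cutoff, and the supplement list are all hypothetical, and the input is assumed to be already segmented into words, which is itself nontrivial for Chinese given the word/character distinction the abstract mentions.

```python
from collections import Counter


def frequency_list(tokens):
    """Return (word, count) pairs, most frequent first; ties sorted by word."""
    counts = Counter(tokens)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def coverage(tokens, vocabulary):
    """Fraction of corpus tokens that are contained in the vocabulary."""
    if not tokens:
        return 0.0
    vocab = set(vocabulary)
    return sum(1 for t in tokens if t in vocab) / len(tokens)


def build_core_vocabulary(tokens, top_n, supplements=()):
    """Take the top_n most frequent words, then append hand-picked
    supplements (hypothetical: e.g. words suggested by clinicians or users)
    that are not already present."""
    core = [word for word, _ in frequency_list(tokens)[:top_n]]
    seen = set(core)
    for word in supplements:
        if word not in seen:
            core.append(word)
            seen.add(word)
    return core


# Toy, pre-segmented corpus (illustrative only, not data from the paper).
tokens = ["我", "要", "喝", "水", "我", "要", "吃", "饭", "我", "不", "要"]
core = build_core_vocabulary(tokens, top_n=2, supplements=["疼", "要"])
print(core)
print(round(coverage(tokens, core), 3))
```

Measuring coverage before and after supplementation gives a rough check on the first of the three coverage criteria (frequent spoken vocabulary). The other two, semantic completeness and user expectations, need linguistic and anecdotal input that a frequency count alone cannot supply.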
