Corpus Research and its Standardization

Abstract

In this paper, the author briefly reviews the development of corpus research abroad. He then introduces in detail the development and present state of corpus linguistics in China: early corpora, large-scale authentic text corpora, national corpora, speech corpora, bilingual corpora, and corpora of minority languages. Various corpus processing techniques are also introduced: automatic word segmentation of Chinese text, automatic part-of-speech (POS) tagging, automatic phrase-structure tagging, and automatic alignment of bilingual corpora. The paper thus offers a bird's-eye view of corpus linguistics in China. Finally, the author discusses several problems in current corpus research: standardization of corpus specifications, sharing of language resources, knowledge properties, etc.
