Hindi Word Sketches

Abstract

Word sketches are one-page, automatic, corpus-based summaries of a word's grammatical and collocational behaviour. They are widely used in language study and lexicography. Sketch Engine is a leading corpus tool that takes a corpus as input and generates word sketches for the words of that language. It also generates a thesaurus and 'sketch differences', which specify similarities and differences between near-synonyms. In this paper, we present the functionalities of Sketch Engine for Hindi. We collected HindiWaC, a web-crawled corpus of Hindi with 240 million words. We lemmatized and POS-tagged the corpus and then loaded it into Sketch Engine.
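To make the idea of a word sketch concrete, here is a minimal Python sketch of how collocates might be grouped by grammatical relation and ranked. It is an illustration only, not the Sketch Engine implementation: the `word_sketch` function, the relation names, and the toy triples are all hypothetical, and the logDice-style score is an assumption, since the abstract does not name an association measure. English placeholder words are used instead of real HindiWaC data.

```python
import math
from collections import Counter, defaultdict

# Toy (relation, headword, collocate) triples. Invented for illustration;
# a real system would extract these from a lemmatized, POS-tagged corpus.
TRIPLES = [
    ("object_of", "book", "read"),
    ("object_of", "book", "read"),
    ("object_of", "book", "write"),
    ("modifier", "book", "new"),
    ("object_of", "letter", "write"),
    ("object_of", "letter", "write"),
    ("modifier", "letter", "new"),
]


def word_sketch(triples, headword):
    """Group the collocates of `headword` by relation, ranked by score.

    The score is a logDice-style measure (an assumption, see lead-in):
    14 + log2(2 * f_xy / (f_x + f_y)).
    """
    pair_freq = Counter(triples)                                  # f_xy
    head_rel_freq = Counter((rel, head) for rel, head, _ in triples)  # f_x
    colloc_freq = Counter(coll for _, _, coll in triples)         # f_y

    sketch = defaultdict(list)
    for (rel, head, coll), f_xy in pair_freq.items():
        if head != headword:
            continue
        f_x = head_rel_freq[(rel, head)]
        f_y = colloc_freq[coll]
        score = 14 + math.log2(2 * f_xy / (f_x + f_y))
        sketch[rel].append((coll, f_xy, round(score, 2)))

    # Strongest collocates first within each grammatical relation.
    for rel in sketch:
        sketch[rel].sort(key=lambda row: row[2], reverse=True)
    return dict(sketch)


if __name__ == "__main__":
    for relation, rows in word_sketch(TRIPLES, "book").items():
        print(relation, rows)
```

Each relation becomes one column of the sketch, which is what lets the summary fit on a single page: the reader sees the strongest collocates per relation rather than a flat concordance.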