Unsupervised Domain Ontology Learning from Text


Abstract

Ontology construction has become indispensable given the rapid increase in textual information. Much of the research on ontology learning is supervised and requires manually annotated resources. Moreover, the quality of an ontology depends on the quality of the corpus, which may not be readily available. To tackle these problems, we present an iterative focused web crawler for building a corpus and an unsupervised framework for constructing a domain ontology. The proposed framework consists of five phases: corpus collection using iterative focused crawling with a novel weighting measure, term extraction using the HITS algorithm, taxonomic relation extraction using Hearst and morpho-syntactic patterns, non-taxonomic relation extraction using association rule mining, and domain ontology building. Evaluation results show that the proposed crawler outperforms traditional crawling techniques, that the extracted domain terms achieve higher precision than those obtained with statistical techniques, and that the learnt ontology has a rich knowledge representation.
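To make the taxonomic relation extraction phase more concrete, the following is a minimal sketch of a single Hearst pattern ("X such as A, B and C"). It is an illustration only: the abstract does not list the patterns the authors actually use, and the function name `extract_is_a`, the regular expression, and the simplification that every term is one lowercase word are all assumptions made here.

```python
import re

# Assumption: each term is a single lowercase word. A real system would
# identify noun phrases with a POS tagger or chunker instead.
TERM = r"[a-z]+"

# One classic Hearst pattern: "<hypernym> such as <hyponym>, <hyponym> and <hyponym>".
SUCH_AS = re.compile(
    rf"({TERM}),? such as ((?:{TERM}, )*{TERM}(?:,? (?:and|or) {TERM})?)"
)


def extract_is_a(sentence):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence.lower()):
        hypernym = match.group(1)
        # Split the enumerated hyponyms on commas and on "and" / "or".
        hyponyms = re.split(r",? (?:and|or) |, ", match.group(2))
        pairs.extend((h, hypernym) for h in hyponyms if h)
    return pairs
```

Calling `extract_is_a("Crawlers fetch formats such as HTML, PDF and XML.")` pairs each of `html`, `pdf` and `xml` with the hypernym `formats`. A single regular expression like this over-generates on sentences with trailing clauses, which is one reason such patterns are usually combined with other cues, such as the morpho-syntactic patterns mentioned in the abstract.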
