Journal: IEEE Transactions on Industrial Informatics

Self-Adaptive Semantic Focused Crawler for Mining Services Information Discovery


Abstract

It is well recognized that the Internet has become the largest marketplace in the world, and online advertising is popular in numerous industries, including the traditional mining service industry, where mining service advertisements are effective carriers of mining service information. However, service users may encounter three major issues (heterogeneity, ubiquity, and ambiguity) when searching for mining service information over the Internet. In this paper, we present the framework of a novel self-adaptive semantic focused crawler, the SASF crawler, which aims to discover, format, and index mining service information over the Internet precisely and efficiently while taking these three issues into account. The framework combines semantic focused crawling with ontology learning so that the crawler maintains its performance regardless of the variety in the Web environment. The innovations of this research lie in the design of an unsupervised framework for vocabulary-based ontology learning and a hybrid algorithm for matching semantically relevant concepts and metadata. A series of experiments is conducted to evaluate the performance of the crawler. The conclusion and directions for future work are given in the final section.
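
The abstract does not spell out the hybrid matching algorithm, so the following is only an illustrative sketch of the general idea of scoring crawled metadata against ontology concept labels with more than one similarity measure. The two measures chosen here (word-set Jaccard overlap and a character-level ratio from Python's `difflib`), the `max` combination rule, the 0.5 threshold, and all function names are assumptions made for illustration, not the method of the paper.

```python
from difflib import SequenceMatcher


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap between the lowercase word sets of two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def char_similarity(a: str, b: str) -> float:
    """Character-level similarity ratio in [0, 1] computed by difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def hybrid_score(concept_label: str, metadata_text: str) -> float:
    """Combine the two measures by taking the larger one (an assumed rule)."""
    return max(
        token_jaccard(concept_label, metadata_text),
        char_similarity(concept_label, metadata_text),
    )


def match_concepts(metadata_text, concept_labels, threshold=0.5):
    """Return (label, score) pairs that meet the threshold, best match first."""
    scored = [(label, hybrid_score(label, metadata_text)) for label in concept_labels]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical concept labels and metadata text, for demonstration only.
    concepts = ["drilling service", "catering"]
    print(match_concepts("drilling service", concepts))
```

In a focused crawler, a score like this could decide whether a fetched page's metadata is associated with a concept and indexed, or discarded as off-topic; a real system would likely use richer concept descriptions than a single label.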
