Web Crawling


Abstract

This is a survey of the science and practice of web crawling. While at first glance web crawling may appear to be merely an application of breadth-first search, it in fact poses many challenges, ranging from systems concerns such as managing very large data structures to theoretical questions such as how often to revisit evolving content sources. This survey outlines the fundamental challenges and describes the state-of-the-art models and solutions. It also highlights avenues for future work.