Search processing device that selects crawler seeds for specialized search using click URLs associated with clicks


Abstract

PROBLEM TO BE SOLVED: To provide a retrieval processor, a retrieval processing method, and a program that effectively collect Web pages in a specific specialized field while reducing the number of relevant pages that fail to be collected.

SOLUTION: A retrieval processor 20 extracts a click log for the specialized field and, based on how frequently each click URL in that log was clicked, extracts authority pages suited to retrieval in the field. It then searches for back links and/or forward links of the extracted authority pages and generates a directed graph with the authority pages as nodes and the back links and/or forward links as directed edges. Finally, it calculates a score for each authority page, i.e. each node of the directed graph, and when the score is equal to or greater than a predetermined value, that page is determined to be a hub page from which to crawl retrieval targets in the predetermined specialized field.

COPYRIGHT: (C)2010, JPO&INPIT
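The steps in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented method: the abstract does not say how the click log is filtered by field, how links are looked up, or how the score is computed. The sketch assumes the log is a list of (query, clicked URL) pairs, filters by hypothetical field keywords, reads links from a hypothetical in-memory table, keeps only edges between authority pages, and substitutes a HITS-style hub score. The function name, parameters, thresholds, and URLs are all invented for the example.

```python
from collections import Counter, defaultdict


def select_seed_pages(click_log, field_terms, links,
                      min_clicks=2, min_score=0.8, iterations=20):
    """Return candidate hub (seed) URLs for a specialized-field crawler."""
    # 1. Specialized-field click log: keep clicks whose query mentions a
    #    field term (assumed filter; the abstract does not define one).
    field_clicks = [url for query, url in click_log
                    if any(term in query for term in field_terms)]

    # 2. Authority pages: URLs clicked at least min_clicks times.
    counts = Counter(field_clicks)
    authorities = {url for url, n in counts.items() if n >= min_clicks}

    # 3. Directed graph: authority pages are nodes, links between them
    #    are directed edges (assumption: other link targets are ignored).
    out_edges = defaultdict(set)
    in_edges = defaultdict(set)
    for src in authorities:
        for dst in links.get(src, ()):
            if dst in authorities and dst != src:
                out_edges[src].add(dst)
                in_edges[dst].add(src)

    # 4. Score each node. A HITS-style hub score is an assumption here,
    #    normalized so the best hub scores 1.0.
    hub = {url: 1.0 for url in authorities}
    for _ in range(iterations):
        auth = {u: sum(hub[v] for v in in_edges[u]) for u in authorities}
        norm = max(auth.values(), default=0) or 1.0
        auth = {u: a / norm for u, a in auth.items()}
        hub = {u: sum(auth[v] for v in out_edges[u]) for u in authorities}
        norm = max(hub.values(), default=0) or 1.0
        hub = {u: h / norm for u, h in hub.items()}

    # 5. Pages whose score reaches the threshold become crawl seeds.
    return sorted(u for u, h in hub.items() if h >= min_score)


# Hypothetical data: three frequently clicked pages in a "python" field.
click_log = [
    ("python tutorial", "https://a.example/index"),
    ("python tutorial", "https://a.example/index"),
    ("python docs", "https://b.example/docs"),
    ("python docs", "https://b.example/docs"),
    ("python faq", "https://c.example/faq"),
    ("python faq", "https://c.example/faq"),
    ("weather today", "https://w.example/"),
    ("weather today", "https://w.example/"),
    ("python one-off", "https://d.example/"),
]
links = {
    "https://a.example/index": ["https://b.example/docs",
                                "https://c.example/faq"],
    "https://b.example/docs": ["https://c.example/faq"],
}
seeds = select_seed_pages(click_log, {"python"}, links)
print(seeds)
```

In this toy data the page that links out to the other two authority pages gets the highest hub score, which matches the intuition that a good crawl seed is a page pointing at many trusted pages in the field. Lowering `min_score` admits weaker hubs as additional seeds.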

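Bibliographic data

  • Publication number: JP4824070B2
  • Publication date: 2011-11-24
  • Original format: PDF
  • Applicant/patentee: ヤフー株式会社
  • Application number: JP20080281481
  • Inventor: 藤田 澄男
  • Filing date: 2008-10-31
  • Classification: G06F17/30
  • Country: JP
  • Date added to database: 2022-08-21 17:36:13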
