Published in: ACM Transactions on the Web

A Model-Based Approach for Crawling Rich Internet Applications


Abstract

New Web technologies, like AJAX, result in more responsive and interactive Web applications, sometimes called Rich Internet Applications (RIAs). Crawling techniques developed for traditional Web applications are not sufficient for crawling RIAs. The inability to crawl RIAs is a problem that needs to be addressed, at the very least to make RIAs searchable and testable. We present a new methodology, called "model-based crawling", that can be used as a basis for designing efficient crawling strategies for RIAs. We illustrate model-based crawling with a sample strategy, called the "hypercube strategy". The performance of our model-based crawling strategies is compared against existing standard crawling strategies, including breadth-first, depth-first, and a greedy strategy. Experimental results show that our model-based crawling approach is significantly more efficient than these standard strategies.
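
For readers unfamiliar with the baselines named in the abstract, the following is a minimal sketch of a greedy crawl over a toy state-transition model of an RIA. It is not the authors' implementation and it does not show the hypercube strategy. The graph `TOY_APP`, the function names, and the cost measure (number of event executions) are all assumptions made for illustration. The sketch also assumes every state stays reachable without reloading the page, which a real crawler cannot rely on.

```python
from collections import deque

# Hypothetical toy RIA: each state maps event names to the state they lead to.
TOY_APP = {
    "s0": {"a": "s1", "b": "s2"},
    "s1": {"b": "s3"},
    "s2": {"a": "s3"},
    "s3": {"home": "s0"},
}


def path_to_nearest_pending(known, pending, start):
    """Breadth-first search over already-discovered transitions.

    Returns the shortest list of (event, next_state) steps from `start` to a
    state that still has unexecuted events, or None if no such state is
    reachable through known transitions.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if pending.get(state):
            return path
        for event, nxt in known.get(state, {}).items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(event, nxt)]))
    return None


def greedy_crawl(app, initial):
    """Greedy baseline: always execute the closest unexecuted event.

    Returns the set of discovered states and the total number of event
    executions, counting both new events and replayed known transitions.
    """
    known = {}                               # transitions discovered so far
    pending = {initial: set(app[initial])}   # unexecuted events per known state
    current = initial
    executed = 0
    while True:
        path = path_to_nearest_pending(known, pending, current)
        if path is None:
            break  # nothing left to explore (a real crawler might reset here)
        for _event, nxt in path:             # replay known transitions
            current = nxt
            executed += 1
        event = min(pending[current])        # deterministic choice for the demo
        pending[current].remove(event)
        nxt = app[current][event]
        known.setdefault(current, {})[event] = nxt
        executed += 1
        if nxt not in pending:               # first visit to this state
            pending[nxt] = set(app[nxt])
        current = nxt
    return set(pending), executed


if __name__ == "__main__":
    states, cost = greedy_crawl(TOY_APP, "s0")
    print(sorted(states), cost)
```

A crawling strategy in this setting is judged by how few event executions it needs to discover all states, which is the kind of efficiency comparison the abstract refers to.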
