Model-based Crawling - An Approach to Design Efficient Crawling Strategies for Rich Internet Applications.

Abstract

Rich Internet Applications (RIAs) are a new generation of web applications that break away from the concepts on which traditional web applications are based. RIAs are more interactive and responsive than traditional web applications because they allow client-side scripting (such as JavaScript) and asynchronous communication with the server (using AJAX). Although these features improve user-friendliness, they significantly impair our ability to automatically explore (crawl) these applications, and traditional crawling algorithms are not sufficient for crawling RIAs. We need to crawl RIAs in order to search their content and to build models of them for purposes such as reverse engineering, detecting security vulnerabilities, assessing usability, and applying model-based testing techniques. One important problem is designing efficient crawling strategies for RIAs. It seems possible to design crawling strategies that are more efficient than the standard ones, Breadth-First and Depth-First. In this thesis, we explore the possibilities of designing efficient crawling strategies. We use a general approach that we call Model-based Crawling and present two crawling strategies designed using this approach. Experimental results show that these model-based crawling strategies are more efficient than the standard strategies.

Bibliographic details

  • Author: Dincturk, Mustafa Emre
  • Author affiliation: University of Ottawa (Canada)
  • Degree-granting institution: University of Ottawa (Canada)
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2013
  • Pages: 164
  • Original format: PDF
  • Language: English
