A review on techniques for optimizing web crawler results

Abstract

Nowadays the Internet is widely used to satisfy information needs. With the exponential growth of the web, searching for useful information has become more difficult. A web crawler extracts both relevant and irrelevant links from the web, and various algorithms and techniques are used to deal with the irrelevant ones. Discovering information with a web crawler raises certain issues: different URLs with similar text increase the time complexity of the search, crawler resources are wasted fetching duplicate pages, and more storage is required to hold those pages. These are some of the roadblocks to getting optimum results from a crawler. This paper provides an in-depth study of existing information retrieval (IR) techniques, which would help researchers retrieve optimum result links and information.
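
This record does not reproduce the techniques the paper surveys. As an illustration of the duplicate-page problem the abstract describes, the following is a minimal sketch of one common approach: normalize URLs before fetching and fingerprint page text before storing. The names `normalize_url`, `content_fingerprint`, and `DuplicateFilter` are hypothetical and are not taken from the paper.

```python
import hashlib
import re
from urllib.parse import urlsplit, urlunsplit

# Illustrative sketch only; not the method of the reviewed paper.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url):
    """Canonicalize a URL so trivially different spellings compare equal."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Keep the port only when it differs from the scheme's default.
    if parts.port and parts.port != DEFAULT_PORTS.get(scheme):
        host = f"{host}:{parts.port}"
    path = parts.path or "/"
    # Drop the fragment: it is never sent to the server.
    return urlunsplit((scheme, host, path, parts.query, ""))


def content_fingerprint(text):
    """Hash page text after collapsing whitespace and lowercasing."""
    canonical = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class DuplicateFilter:
    """Tracks URLs and page fingerprints already seen by the crawler."""

    def __init__(self):
        self.seen_urls = set()
        self.seen_fingerprints = set()

    def should_fetch(self, url):
        """Return True only the first time a normalized URL is seen."""
        key = normalize_url(url)
        if key in self.seen_urls:
            return False
        self.seen_urls.add(key)
        return True

    def should_store(self, text):
        """Return True only the first time this page text is seen."""
        fingerprint = content_fingerprint(text)
        if fingerprint in self.seen_fingerprints:
            return False
        self.seen_fingerprints.add(fingerprint)
        return True
```

A crawler loop would call `should_fetch` before downloading a URL and `should_store` before saving the page. This sketch only catches exact duplicates after normalization; pages that differ by a few words would need a near-duplicate measure, which is outside its scope.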

Bibliographic record

Source
2016 World Conference on Futuristic Trends in Research and Innovation for Social Welfare | 2016 | pp. 1-4 | 4 pages
Conference location: Coimbatore (IN)
Authors
Anuja Lawankar; Nikhil Mangrulkar
Author affiliations

Department of Computer Technology, Yeshwantrao Chavan College of Engineering, Nagpur, Maharashtra, India;

Department of Computer Technology, Yeshwantrao Chavan College of Engineering, Nagpur, Maharashtra, India;

Original format: PDF
Language: English
Keywords
Crawlers; Search engines; Web pages; Algorithm design and analysis; Uniform resource locators; Indexes;

Similar literature

1. Enhancing the security of patients' portals and websites by detecting malicious web crawlers using machine learning techniques [J]. Hosseini Nafiseh, Fakhar Fatemeh, Kiani Behzad. International journal of medical informatics. 2019, Issue Deca
2. Optimized Focused Web Crawler with Natural Language Processing Based Relevance Measure in Bioinformatics Web Sources [J]. Cybernetics and information technologies: CIT. 2019, Issue 2
3. Novel method for industrial sewage outfall detection: Water pollution monitoring based on web crawler and remote sensing interpretation techniques [J]. Zhang Jing, Zou Tianyuan, Lai Yuequn. Journal of Cleaner Production. 2021, Issue Auga20
4. A review on techniques for optimizing web crawler results [C]. Anuja Lawankar, Nikhil Mangrulkar. World Conference on Futuristic Trends in Research and Innovation for Social Welfare. 2016
5. Web Design with Search Engine Optimization Techniques and Web Intelligence. [D]. Hui, Chun Keung. 2012
6. Using caching and optimization techniques to improve performance of the Ensembl website [O]. Anne Parker, Eugene Bragin, Simon Brent. 2010
7. A Novel Technique for Spare Web Page Detection in Parallel Web Crawler [O]. Gaurav Kumar Srivastav, Irphan Ali. 2015
