PaSE: Locating Online Copy of Scientific Documents Effectively


Abstract

The need for fast and wide dissemination of research results has led to a new trend: more and more authors post their documents to personal or group Web spaces so that others can easily access and download them. Similarly, more and more researchers search online for documents of interest on the Web instead of visiting libraries. Currently, to locate and download an online copy of a particular document D, one typically either (1) queries search engines with the citation information and browses the returned web pages (e.g., the author's homepage) to see whether any contains D, or (2) uses the search facilities of an individual digital library (e.g., CiteSeer, e-Print) to look for D and, if it is not found, repeats the search in another digital library. However, scheme (1) requires human browsing to reach the final online copy, while scheme (2) suffers from incomplete coverage. To remedy these shortcomings, this paper presents a system named PaSE, which can effectively locate online copies (e.g., PDF or PS) of scientific documents using citation information. We consider a myriad of alternatives for crawling and parsing the Web to arrive at the right document quickly, and present a preliminary experimental study. Using some of the best alternatives that we have identified, we show that PaSE can locate online copies of documents more accurately and conveniently than human users would, at the cost of longer search time.
