Journal: Computer Technology and Development (《计算机技术与发展》)

Design and Implementation of a Full-Text Retrieval System for the Xi'an Digital Chorography

         

Abstract

This paper implements first-pass full-text retrieval over PDF documents with the Lucene API. To locate search keywords more precisely, it designs and implements a new secondary index algorithm. The secondary index records each keyword's page number, coordinates, surrounding context, and related information. With this index, a retrieval result can be located on the specific page of the PDF document, and the exact positions of the keywords can then be marked on that page, so that secondary retrieval within a PDF document achieves an effect similar to book search in Google Book. Test results indicate that the system has good retrieval performance, with high recall and precision, and can meet users' need for fast retrieval. The system has been in use for two years as the full-text retrieval platform for the Xi'an digital chorography and has achieved good results in application.
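The abstract does not give the secondary index algorithm itself, so the following is only a minimal sketch of the kind of structure it describes: a mapping from each keyword to the pages, coordinates, and context where it occurs. All names here (`Hit`, `build_secondary_index`, `locate`) and the input layout (words with bounding boxes already extracted per page, split on whitespace) are assumptions made for illustration. They are not the paper's method and not part of the Lucene API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    page: int       # 1-based page number within the PDF
    bbox: tuple     # (x0, y0, x1, y1) in page coordinates
    context: str    # neighbouring words around the keyword


def build_secondary_index(pages, window=2):
    """Build keyword -> [Hit, ...] from pre-extracted page words.

    `pages` is assumed to look like
    {page_number: [(word, (x0, y0, x1, y1)), ...]} in reading order.
    """
    index = defaultdict(list)
    for page_no, words in pages.items():
        tokens = [w for w, _ in words]
        for i, (word, bbox) in enumerate(words):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            context = " ".join(tokens[lo:hi])
            index[word.lower()].append(Hit(page_no, bbox, context))
    return index


def locate(index, keyword):
    """Return every recorded position of `keyword` (case-insensitive)."""
    return index.get(keyword.lower(), [])


# Hypothetical sample data standing in for text extracted from a PDF.
pages = {
    1: [("Xi'an", (10, 10, 40, 20)),
        ("chorography", (45, 10, 110, 20)),
        ("archive", (115, 10, 160, 20))],
    2: [("digital", (10, 10, 50, 20)),
        ("chorography", (55, 10, 120, 20))],
}
index = build_secondary_index(pages)
hits = locate(index, "Chorography")
```

In a system like the one described, a first-pass search would presumably select the matching document, and a lookup of this kind would then supply the page to open and the rectangles to highlight. Chinese text would need a proper tokenizer instead of pre-split words.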
