
MXDR: A Keyword-Based Distributed Retrieval Method for Multiple XML Documents

         

Abstract

Keyword-based XML retrieval has been an active research topic in information retrieval in recent years, largely because keyword search is a user-friendly way to query XML data. Keywords, however, carry no XML structural semantics, so many keyword-matched nodes do not contribute to the answer, results deviate from what the user needs, and retrieval quality is hard to improve. Structured XML retrieval, in turn, is hard to popularize because users struggle to write query expressions that accurately describe their intent. A more pressing problem is that most existing XML retrieval studies focus on a single document, which limits their practical value.

This paper therefore proposes a keyword-based structured retrieval method that searches multiple XML documents in a distributed manner, called MXDR (Multi-XML Distributed Retrieval). MXDR first classifies the documents with a clustering method that takes both structure and content into account, and extracts the common structure information of each class. It then analyzes the query keywords against the class structure information to determine a distributed search strategy, combines the keywords with the XML structure information to construct structured query statements, and finally answers the keyword query through a structured query system; the generated queries can be evaluated with any existing structure search engine. Experiments on several groups of real data from the Sigmod dataset show that, compared with the classic SLCA method, MXDR achieves higher recall and precision, with a particularly marked advantage in retrieval efficiency.
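For readers unfamiliar with the SLCA baseline mentioned above, the following is a minimal sketch of Smallest Lowest Common Ancestor keyword search over a single XML tree: it returns every element whose subtree contains all query keywords while no child subtree does. It is an illustration only, not the implementation evaluated in the paper; the function names `node_tokens` and `slca`, the whole-word matching rule, and the sample document are assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET

def node_tokens(elem):
    """Lowercased words from an element's tag and its direct text (assumed matching rule)."""
    return set(re.findall(r"\w+", f"{elem.tag} {elem.text or ''}".lower()))

def slca(root, keywords):
    """Return elements whose subtree covers all keywords while no child subtree does."""
    keywords = [k.lower() for k in keywords]
    all_found = (1 << len(keywords)) - 1
    results = []

    def visit(elem):
        # Bitmask of the keywords matched by this element itself.
        tokens = node_tokens(elem)
        mask = 0
        for i, k in enumerate(keywords):
            if k in tokens:
                mask |= 1 << i
        # Merge in the keywords covered by each child subtree.
        child_is_full = False
        for child in elem:
            child_mask, child_full = visit(child)
            mask |= child_mask
            child_is_full = child_is_full or child_full
        full = mask == all_found
        # "Smallest": keep this node only if no child subtree already covers everything.
        if full and not child_is_full:
            results.append(elem)
        return mask, full

    visit(root)
    return results

# Hypothetical sample document, loosely styled after bibliographic records.
DOC = """
<proceedings>
  <article>
    <title>XML keyword search</title>
    <author>Li</author>
  </article>
  <article>
    <title>Query optimization</title>
    <author>Chen</author>
  </article>
</proceedings>
"""

root = ET.fromstring(DOC)
hits = slca(root, ["xml", "li"])
print([h.tag for h in hits])
```

On this sample, the query `["xml", "li"]` selects the first `article` element rather than the root, because that article's subtree already contains both keywords. The sketch handles one tree at a time, which is the single-document setting the abstract contrasts with MXDR.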
