Information Processing & Management

Descendants, ancestors, children and parent: A set-based approach to efficiently address XPath primitives

Abstract

XML is a pervasive technology for representing and accessing semi-structured data. XPath is the standard language for navigational queries on XML documents, and there is a growing demand for its efficient processing. To execute four navigational XML query primitives (descendants, ancestors, children and parent) more efficiently, we introduce a new paradigm: traditional approaches, which traverse nodes and edges to reconstruct the requested subtree, are replaced by one based on basic set operations, which directly return the desired subtree without building it node by node and edge by edge. Our solution stems from the NEsted SeTs for Object hieRarchies (NESTOR) formal model, which uses set-inclusion relations to represent and provide access to hierarchical data. We define efficient in-memory data structures to implement NESTOR, develop algorithms for the descendants, ancestors, children and parent query primitives, and study their computational complexity. We conduct an extensive experimental evaluation on several datasets: digital archives (EAD collections), the INEX 2009 Wikipedia collection, and two widely used synthetic datasets (XMark and XGen). We show that NESTOR-based data structures and query primitives consistently outperform state-of-the-art solutions for XPath processing at execution time and are competitive in terms of both memory occupation and pre-processing time.
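
To make the set-inclusion idea concrete, the following is a minimal Python sketch, not the data structures or algorithms of the paper. It assumes each node is represented by the set containing the node itself and all of its descendants, so that a subtree is nested inside the sets of its ancestors. The function names (`build_nested_sets`, `descendants`, `ancestors`, `parent`, `children`) and the element names in the sample tree are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, Optional, Set


def build_nested_sets(parent_of: Dict[str, Optional[str]]) -> Dict[str, FrozenSet[str]]:
    """Map every node to the set holding the node itself and all its descendants."""
    sets: Dict[str, Set[str]] = {node: {node} for node in parent_of}
    for node in parent_of:
        anc = parent_of[node]
        while anc is not None:          # add the node to every ancestor's set
            sets[anc].add(node)
            anc = parent_of[anc]
    return {node: frozenset(members) for node, members in sets.items()}


def descendants(sets: Dict[str, FrozenSet[str]], node: str) -> Set[str]:
    """Everything in the node's set except the node itself."""
    return set(sets[node] - {node})


def ancestors(sets: Dict[str, FrozenSet[str]], node: str) -> Set[str]:
    """Nodes whose set is a proper superset of the node's set."""
    return {m for m, s in sets.items() if sets[node] < s}


def parent(sets: Dict[str, FrozenSet[str]], node: str) -> Optional[str]:
    """The ancestor with the smallest set, or None for the root."""
    anc = ancestors(sets, node)
    return min(anc, key=lambda m: len(sets[m])) if anc else None


def children(sets: Dict[str, FrozenSet[str]], node: str) -> Set[str]:
    """Descendants not nested inside the set of any other descendant."""
    desc = descendants(sets, node)
    return {d for d in desc if not any(sets[d] < sets[e] for e in desc)}


# Hypothetical sample document: doc -> (title, section), section -> (para, figure)
parent_of = {"doc": None, "title": "doc", "section": "doc",
             "para": "section", "figure": "section"}
sets = build_nested_sets(parent_of)
print(descendants(sets, "section"))
print(ancestors(sets, "para"))
print(parent(sets, "para"))
print(children(sets, "doc"))
```

In this sketch, descendants is a single set difference, while ancestors, parent and children are answered by subset comparisons rather than by walking edges. The linear scans used here are for readability only and say nothing about the complexity results reported in the paper.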