
Structured, unstructured, and semistructured search in semistructured databases.



Abstract

A single framework for storing and querying XML data, based on denormalized schema decompositions, can support both structured queries and unstructured searches, and can serve as a foundation for combining the two forms of information access.

The XML data format is becoming increasingly popular in applications that mix structured data and unstructured text. These applications require the integration of structured query and text search mechanisms to access XML data.

First, we introduce a framework for storing and querying XML data using denormalized schema decompositions. This framework was initially implemented in the XCacheDB XML database system, which uses XML schemas to shred XML data into relational storage. XCacheDB supports a subset of the XQuery language and emphasizes query optimization to reduce latency and output the first results quickly.

XCacheDB relies on XML schemas, which poses a novel challenge for validating XML updates. We investigate the incremental validation of XML documents with respect to DTDs and XML Schemas. We exhibit an O(m log n) algorithm using an auxiliary structure of size O(n), where n is the size of the document and m is the number of updates. We also exhibit a restricted class of DTDs, called "local", that arise commonly in practice and for which incremental validation can be done in practically constant time by maintaining only a list of counters. We present implementations and experimental evaluations of both general incremental validation and local validation in the XCacheDB system.

We then present the XKeyword system, which uses a variation of the XCacheDB schema decompositions to support keyword proximity searches in XML databases. XKeyword decompositions include ID relations, which store the IDs of target objects and pre-compute common joins.

Finally, we present the architecture of the Semi-Structured Search System (S4), designed to bridge the gap between traditional database and information retrieval systems. The S4 QL query language combines features of structured queries and text search to facilitate information discovery without knowledge of the schema. S4 is based on the same schema decomposition framework as XCacheDB and XKeyword. However, the combination of structured and unstructured query features poses novel challenges to efficient query processing. We outline these issues and possible ways of addressing them.
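
The abstract says that some DTDs can be validated incrementally by maintaining only a list of counters. The sketch below illustrates that general idea in Python. It is not the dissertation's algorithm. The abstract does not define the "local" class, so the example assumes a hypothetical, simplified content model in which an element is valid when the number of children with each label stays within fixed bounds. The names `BOUNDS`, `CountingValidator`, `try_insert`, and `try_delete` are invented for illustration.

```python
from collections import Counter

# Hypothetical occurrence bounds for the children of one element type:
# label -> (min_count, max_count); None means "no upper bound".
BOUNDS = {"title": (1, 1), "author": (1, None), "abstract": (0, 1)}


class CountingValidator:
    """Keeps one counter per child label so each update is checked in O(1).

    Assumes the initial list of children is already valid.
    """

    def __init__(self, bounds, children=()):
        self.bounds = bounds
        self.counts = Counter(children)

    def try_insert(self, label):
        """Apply the insertion and return True only if it keeps the element valid."""
        if label not in self.bounds:
            return False
        _, upper = self.bounds[label]
        if upper is not None and self.counts[label] + 1 > upper:
            return False
        self.counts[label] += 1
        return True

    def try_delete(self, label):
        """Apply the deletion and return True only if it keeps the element valid."""
        if label not in self.bounds or self.counts[label] == 0:
            return False
        lower, _ = self.bounds[label]
        if self.counts[label] - 1 < lower:
            return False
        self.counts[label] -= 1
        return True
```

Each update reads and writes a single counter, so the cost of a check does not grow with the size of the document. Content models that constrain the order of children would need more state than this sketch keeps.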

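Bibliographic record

  • Author

    Balmin, Andrey.;

  • Author affiliation

    University of California, San Diego.;

  • Degree-granting institution University of California, San Diego.;
  • Subject Computer Science.
  • Degree Ph.D.
  • Year 2006
  • Pages 195 p.
  • Total pages 195
  • Original format PDF
  • Language eng
  • Chinese Library Classification Automation technology, computer technology;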