Graph-based data analysis: Tree-structured covariance estimation, prediction by regularized kernel estimation and aggregate database query processing for probabilistic inference.

Abstract

This dissertation presents a collection of computational techniques for the analysis of data where relationships between objects can be expressed through a graph. Data of this type can be found in many and diverse settings, including genomic and epidemiological applications, web search, social networking and decision making. Although taking relationships into account makes analysis of this type of data more challenging, the graph structure of these relationships can be used to make this analysis viable. In this dissertation, we implement a number of techniques for analyzing this type of data using well-known and tested computational tools. Furthermore, we explore these techniques over a wide array of biological and decision making applications.

In Part I, we present a method for estimating tree-structured covariance matrices directly from observed continuous data. Tree-structured covariance matrices encode probabilistic relationships between objects that can be described by rooted trees. In this case, we directly estimate graph structure from observed data under a specific probabilistic model.

Part II presents a methodology for graph-based prediction where a predictive model is estimated over data where relationships between objects are encoded by a known graph. We make extensive use of Regularized Kernel Estimation (Lu et al., 2005), a framework for estimating a positive semidefinite kernel from noisy, incomplete and inconsistent distance data. In this case, the graph structure of the data is used to define a distance from which a kernel matrix is estimated.

Finally, in Part III, we present techniques for efficiently evaluating aggregate queries of a particular type over views defining a large number of database records. The main assumption is that this view is the result of a stylized join over a number of much smaller tables, and is described by a graph. We make use of this graph structure to reduce the cost of single query evaluation and to cache intermediate results in a query workload setting. This framework was designed in part to address scalable probabilistic inference in relational databases.
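To make the tree-structured covariance matrices of Part I more concrete, the sketch below builds one from a small rooted tree. It assumes a common convention that the abstract does not spell out: the covariance between two leaves is the total length of the branches shared by their root-to-leaf paths. The tree, its branch lengths, and the names `TREE`, `LEAVES`, `path_edges` and `tree_covariance` are all hypothetical, chosen only for illustration. The sketch goes from a known tree to a matrix, whereas the dissertation addresses the reverse problem of estimating the tree from data.

```python
# Hypothetical rooted tree, stored as child -> (parent, branch length).
# The node "root" has no entry because it has no parent.
TREE = {
    "u": ("root", 1.0),
    "a": ("u", 0.5),
    "b": ("u", 0.5),
    "c": ("root", 2.0),
}
LEAVES = ["a", "b", "c"]


def path_edges(node):
    """Map each edge on the root-to-node path (keyed by its child end) to its length."""
    edges = {}
    while node in TREE:
        parent, length = TREE[node]
        edges[node] = length
        node = parent
    return edges


def tree_covariance(leaves):
    """Covariance matrix under the assumed shared-path-length convention."""
    paths = [path_edges(leaf) for leaf in leaves]
    cov = []
    for p_i in paths:
        row = []
        for p_j in paths:
            shared = p_i.keys() & p_j.keys()  # edges common to both paths
            row.append(sum((p_i[e] for e in shared), 0.0))
        cov.append(row)
    return cov


COV = tree_covariance(LEAVES)
for row in COV:
    print(row)
```

In this toy tree, leaves `a` and `b` share the branch from the root to `u`, so they are positively correlated, while `c` hangs directly off the root and is uncorrelated with both.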

Bibliographic record

  • Author: Bravo, Hector Corrada
  • Author affiliation: The University of Wisconsin - Madison
  • Degree-granting institution: The University of Wisconsin - Madison
  • Subject: Statistics; Computer Science
  • Degree: Ph.D.
  • Year: 2008
  • Pages: 186 p.
  • Original format: PDF
  • Language of text: English
