Layout-Based Substitution Tree Indexing and Retrieval for Mathematical Expressions.


Abstract

We introduce a new system for layout-based indexing and retrieval of mathematical expressions using substitution trees. Substitution trees can efficiently store and find hierarchically structured data based on similarity. Previously, Kohlhase and Sucan applied substitution trees to indexing mathematical expressions in operator tree representation (Content MathML) and to query-by-expression retrieval. In this investigation, we use substitution trees to index mathematical expressions in symbol layout tree representation (LaTeX), grouping expressions by the similarity of their symbols, symbol layout, sub-expressions and size.

We describe our novel substitution tree indexing and retrieval algorithms and our contributions to the behavior of these algorithms, including: allowing substitution trees to index and retrieve layout-based mathematical expressions instead of predicates; introducing a bias in the insertion function that helps group expressions in the index based on similarity in baseline size; modifying the search function to find expressions that are not identical to a search query yet are still structurally similar to it; and ranking search results based on their similarity in symbols and symbol layout to the search query.

We provide an experiment testing our system against the term frequency-inverse document frequency (TF-IDF) keyword-based system of Zanibbi and Yuan and demonstrate that: in many cases, the two systems are comparable; our system excels at finding expressions identical to the search query and expressions containing relevant sub-expressions; and our system has some limitations due to the insertion bias and the presence of LaTeX formatting in expressions.

Future work includes: designing a different insertion bias that improves the quality of search results; modifying the behavior of the search and ranking functions; and extending the scope of the system so that it can index websites or non-LaTeX expressions (such as MathML or images).

Overall, we present a promising first attempt at layout-based substitution tree indexing and retrieval for mathematical expressions.
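
To make the idea of grouping expressions by shared structure more concrete, the following is a minimal sketch of the generalization step that a substitution tree relies on: two expressions are compared, the parts they share are kept, and the parts where they differ are replaced by a substitution variable. It is not the thesis's algorithm. The tree encoding (a `(symbol, {relation: subtree})` pair), the relation names `"super"` and `"next"`, the `?n` variable naming, and the function name `generalize` are all assumptions made for this illustration, and the sketch omits the insertion bias, search and ranking described in the abstract.

```python
# Hypothetical symbol layout tree encoding (an assumption for this sketch):
# a node is (symbol, children), where children maps a spatial relation
# such as "super" (superscript) or "next" (adjacent on the baseline)
# to another node.

def generalize(a, b, bindings):
    """Return a common pattern for layout trees a and b.

    Wherever the two trees differ, the differing subtrees are replaced by
    a fresh variable "?n", and (variable, subtree_a, subtree_b) is appended
    to bindings. Simplification: identical pairs of differing subtrees are
    not merged into one variable, as a full anti-unification would do.
    """
    if a == b:
        return a  # identical subtrees are shared as-is
    sym_a, kids_a = a
    sym_b, kids_b = b
    if sym_a != sym_b or kids_a.keys() != kids_b.keys():
        # Different symbol or different layout: abstract the whole subtree.
        name = f"?{len(bindings) + 1}"
        bindings.append((name, a, b))
        return (name, {})
    # Same symbol and same relations: recurse into each related subtree.
    return (sym_a, {rel: generalize(kids_a[rel], kids_b[rel], bindings)
                    for rel in kids_a})


# x^2 + 1 and x^3 + 1 written as layout trees
x_squared = ("x", {"super": ("2", {}), "next": ("+", {"next": ("1", {})})})
x_cubed = ("x", {"super": ("3", {}), "next": ("+", {"next": ("1", {})})})

bindings = []
pattern = generalize(x_squared, x_cubed, bindings)
print(pattern)   # shared pattern with a variable in the superscript position
print(bindings)  # what the variable stands for in each expression
```

In this example the two expressions share everything except the superscript, so the pattern keeps `x`, `+` and `1` and abstracts only the exponent. An index built on such patterns could store both expressions under one internal node, which is the intuition behind retrieving expressions that are similar but not identical to a query.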

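Bibliographic record

  • Author affiliation

    Rochester Institute of Technology

  • Degree-granting institution: Rochester Institute of Technology
  • Subject: Artificial Intelligence; Computer Science
  • Degree: M.S.
  • Year: 2011
  • Pages: 101 p.
  • Total pages: 101
  • Original format: PDF
  • Language: English
  • Chinese Library Classification: Public buildings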
