Source: IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics

On optimizing syntactic pattern recognition using tries and AI-based heuristic-search strategies

Abstract

This paper deals with the problem of estimating a transmitted string X* by processing the corresponding string Y, a noisy version of X*, using enhanced artificial-intelligence (AI) techniques. Y is assumed to contain substitution, insertion, and deletion (SID) errors. The best estimate X+ of X* is defined as the element of a dictionary H that minimizes the generalized Levenshtein distance (GLD) D(X,Y) between X and Y over all X∈H. The paper shows how to evaluate D(X,Y) for every X∈H simultaneously when the edit distances are general, the maximum number of errors is not given a priori, and H is stored as a trie. A new scheme called clustered beam search (CBS) is first introduced; it is a heuristic-based search approach that enhances the well-known beam-search (BS) techniques used in AI. The scheme is then applied to the approximate string-matching problem when the dictionary is stored as a trie. The new technique is compared, with respect to time and accuracy, with the benchmark depth-first search (DFS) trie-based technique using large and small dictionaries. The results demonstrate a marked improvement of up to 75% in the total number of operations needed on three benchmark dictionaries, while yielding an accuracy comparable to the optimal. Further experiments show the benefits of CBS over BS when the search is done on the trie. The results also demonstrate a marked improvement (more than 91%) for large dictionaries.
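
To make the setting concrete, the following is a minimal sketch of the baseline idea described in the abstract: storing the dictionary H as a trie and evaluating D(X,Y) for every X∈H in one depth-first traversal, so that words sharing a prefix also share dynamic-programming rows. It is not the paper's implementation and it does not implement CBS, whose details the abstract does not give. The unit costs and all names (`sub_cost`, `ins_cost`, `del_cost`, `next_row`, `gld`, `build_trie`, `best_match`) are assumptions made for illustration; general symbol-dependent costs could be substituted in the three cost functions.

```python
from math import inf
from typing import Dict, List, Optional, Tuple

# Hypothetical unit costs (an assumption); the abstract allows general edit distances.
def sub_cost(a: str, b: str) -> float:
    return 0.0 if a == b else 1.0

def ins_cost(b: str) -> float:
    return 1.0

def del_cost(a: str) -> float:
    return 1.0

def first_row(y: str) -> List[float]:
    """DP row for the empty prefix of X: only insertions of Y's symbols."""
    row = [0.0]
    for b in y:
        row.append(row[-1] + ins_cost(b))
    return row

def next_row(prev: List[float], a: str, y: str) -> List[float]:
    """Extend the DP table by one symbol `a` of X, given the previous row."""
    cur = [prev[0] + del_cost(a)]
    for j, b in enumerate(y, start=1):
        cur.append(min(prev[j - 1] + sub_cost(a, b),   # substitute a -> b
                       prev[j] + del_cost(a),          # delete a
                       cur[j - 1] + ins_cost(b)))      # insert b
    return cur

def gld(x: str, y: str) -> float:
    """D(X, Y) for a single pair, computed row by row."""
    row = first_row(y)
    for a in x:
        row = next_row(row, a, y)
    return row[-1]

class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_word = False

def build_trie(words: List[str]) -> TrieNode:
    root = TrieNode()
    for w in words:
        node = root
        for ch in w:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root

def best_match(root: TrieNode, y: str) -> Tuple[Optional[str], float]:
    """Exhaustive DFS over the trie; each node's row is computed once and
    reused by every dictionary word that passes through that node."""
    best: Tuple[Optional[str], float] = (None, inf)
    stack = [(root, "", first_row(y))]
    while stack:
        node, prefix, row = stack.pop()
        if node.is_word and row[-1] < best[1]:
            best = (prefix, row[-1])
        for ch, child in node.children.items():
            stack.append((child, prefix + ch, next_row(row, ch, y)))
    return best

words = ["pattern", "patent", "parent", "trie", "tried"]
root = build_trie(words)
print(best_match(root, "patern"))  # ('pattern', 1.0)
```

Because the traversal is exhaustive, it returns the same minimum as computing `gld` separately for each word. A beam-style search such as BS or CBS would instead keep only a limited set of promising nodes at each level, trading guaranteed optimality for fewer operations.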