Conference: Mathematical methods for information science and economics

Petri Nets Saliency Models of Multiple Biological Sequences


Abstract

In medicine, computational biology, pattern recognition, string editing and data compression, to name a few research areas, large amounts of data are extracted, pre-processed, selected and classified in order to reach a diagnosis. Classical visual analysis of such quantities of data is no longer possible, so computers are involved in the process, and automated systems that recognize biological features have been in use for several years. There is strong demand for such automated algorithms and devices, because improved video and biological computation techniques reduce the risk of the analyst missing or misreading information. Among heuristic approaches there are a number of methods for identifying important input features. These are regarded as saliency methods mainly because they can intuitively model and simulate the mechanisms and signals used in computational biology. Recently, the general problem of selecting a parsimonious set of salient features for computational biology has attracted a great deal of interest. Non-salient features may reduce diagnostic accuracy and may even make the problem NP-hard, considering that the number of training samples required grows exponentially with the number of features. To reduce the size of the extracted input feature samples, we focus on determining the longest common subsequence (LCS) of a set of multiple string sequences, an operation with a wide range of applications in the areas mentioned above. This presentation focuses on improving automated medical diagnosis through biological feature (BF) selection and classification, since biological features represent patterns of important information. Medical diagnosis can be improved if the pattern comprises most of the significant biological features.
In our study, common-sequence measures were employed to determine saliency across a wide range of applications in medicine, computational biology, string editing, pattern recognition and genetics. We assume that an important common-sequence saliency measure is the longest common subsequence (LCS) of a set of n sequences. To perform this hard task, we use a discrete-event formalism, namely Petri nets, and we propose an algorithm for reducing the size of the digraphs. An application to ECG signals will demonstrate that salient input features effectively aid the diagnosis process.
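To make the LCS measure concrete, the following is a minimal sketch of the textbook dynamic-programming recurrence extended to n sequences. It is not the Petri-net method proposed in the paper, whose details the abstract does not give; the function name `multi_lcs` is hypothetical and chosen only for illustration. The number of memoized states is the product of the sequence lengths, so the cost grows exponentially with the number of sequences, which illustrates why the abstract calls the task hard.

```python
from functools import lru_cache
from typing import Sequence


def multi_lcs(seqs: Sequence[str]) -> str:
    """Return one longest common subsequence of all strings in seqs.

    Illustrative baseline only (hypothetical helper, not the paper's algorithm).
    Recursion depth is bounded by the total input length, so keep inputs short.
    """
    if not seqs:
        return ""
    seqs = tuple(seqs)

    @lru_cache(maxsize=None)
    def solve(idx: tuple) -> str:
        # idx[k] is the length of the prefix of seqs[k] still under consideration.
        if any(i == 0 for i in idx):
            return ""
        last_symbols = {s[i - 1] for s, i in zip(seqs, idx)}
        if len(last_symbols) == 1:
            # Every prefix ends with the same symbol, so it can be matched.
            shorter = tuple(i - 1 for i in idx)
            return solve(shorter) + seqs[0][idx[0] - 1]
        # Otherwise drop the last symbol of one sequence at a time, keep the best.
        best = ""
        for k in range(len(idx)):
            cand = solve(idx[:k] + (idx[k] - 1,) + idx[k + 1:])
            if len(cand) > len(best):
                best = cand
        return best

    return solve(tuple(len(s) for s in seqs))


if __name__ == "__main__":
    print(multi_lcs(["abcde", "ace", "aXcYe"]))  # ace
```

For feature selection in the sense of the abstract, each string would encode one extracted feature sequence (for example, a symbolized ECG trace, which is an assumption about the encoding), and the returned subsequence would be the pattern shared by all of them.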
