HOMOLOGY RETRIEVAL SYSTEM, HOMOLOGY RETRIEVAL APPARATUS, AND HOMOLOGY RETRIEVAL METHOD

Abstract

A homology search compares a query sequence with a target sequence and retrieves similar locations in the target sequence with higher accuracy than conventional techniques. Sequence information for a query sequence and a genomic-scale target sequence is acquired, and each is converted into a compressed sequence in which every homopolymer region (two or more consecutive identical bases) is replaced with a single instance of that base. The compressed query sequence is compared with the compressed target sequence, and the search is narrowed to the compressed target subsequences that match the compressed query sequence. For each candidate sequence obtained in this way, the number of consecutive identical bases recorded before compression is compared with that of the query sequence at each corresponding base, and a degree of similarity indicating the homology of the candidate sequence to the query sequence is computed from the degree of match or mismatch in those counts. Ranking the candidates by this degree of similarity and selecting an arbitrary number of those with relatively high homology to the query sequence avoids the influence of the number of consecutive identical bases in a homopolymer region, so the homology retrieval can be performed accurately.
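The procedure in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `compress` and `rank_candidates`, the exact-match search over compressed sequences, and the score (the fraction of compressed positions whose run lengths agree) are all assumptions chosen for illustration. The abstract only states that similarity is derived from the degree of match or mismatch in run lengths.

```python
from itertools import groupby


def compress(seq):
    """Collapse each homopolymer run to one base; also return the run lengths."""
    bases, runs = [], []
    for base, group in groupby(seq):
        bases.append(base)
        runs.append(sum(1 for _ in group))
    return "".join(bases), runs


def rank_candidates(query, target, top_n=None):
    """Return (score, start) pairs, best first.

    score: assumed metric, the fraction of compressed positions whose
           run length in the candidate equals the run length in the query.
    start: offset of the candidate in the uncompressed target.
    """
    if not query:
        raise ValueError("query must not be empty")
    cq, q_runs = compress(query)
    ct, t_runs = compress(target)

    hits = []
    pos = ct.find(cq)
    while pos != -1:
        # Run lengths of the candidate, aligned base-by-base with the query.
        cand_runs = t_runs[pos:pos + len(cq)]
        matches = sum(1 for q, c in zip(q_runs, cand_runs) if q == c)
        start = sum(t_runs[:pos])  # map back to uncompressed coordinates
        hits.append((matches / len(cq), start))
        pos = ct.find(cq, pos + 1)  # allow overlapping candidates

    hits.sort(key=lambda h: (-h[0], h[1]))
    return hits if top_n is None else hits[:top_n]


if __name__ == "__main__":
    for score, start in rank_candidates("AAGTTC", "CCAGGTCAAGTTCGAAAGTC"):
        print(f"start={start} score={score:.2f}")
```

In this toy example, three regions of the target share the query's compressed form `AGTC`, and the run-length comparison ranks the exact copy at offset 7 first. A practical system would presumably use an indexed search rather than a linear scan for a genomic-scale target, and might weight mismatches by the size of the run-length difference.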

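Bibliographic details

  • Publication number: IN2009KN03416A

  • Patent type: (not listed)

  • Publication date: 2009-12-18

  • Original format: PDF

  • Applicant/patentee: (not listed)

  • Application number: IN3416/KOLNP/2009

  • Filing date: 2009-09-30

  • Classification: G06F19/00; G06F17/30

  • Country: IN

  • Date added to database: 2022-08-21 18:46:11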
