Conference: IEEE International Conference on Bioinformatics and Bioengineering

Detection of Fusion Genes from Human Breast Cancer Cell-line RNA-Seq Data Using Shifted Short Read Clustering

Abstract

Fusion genes are one of the mechanisms of tumorigenesis, and their identification from RNA-Seq data has attracted attention. Various methods for detecting fusion genes have been proposed, but their accuracy remains insufficient. One cause is the relatively short read length of RNA-Seq data. We therefore propose a method based on shifted short-read clustering (SSC) that, before mapping, identifies shifted reads of the same origin and extends them into representative sequences. We hypothesized that this would increase the percentage of uniquely mapped reads and improve the detection rate of fusion genes. To test these hypotheses, we applied the SSC method to RNA-Seq data from three breast cancer cell lines (BT-474, MCF-7, and SKBR-3). When only one base was shifted, the average read lengths of BT-474, MCF-7, and SKBR-3 were extended from 201 to 223 bases (111%), 201 to 214 bases (106%), and 201 to 213 bases (106%), respectively. We further evaluated the SSC method by comparing the results of a fusion gene detection tool, STAR-Fusion, on reads with and without SSC processing. The percentages of uniquely mapped reads for BT-474, MCF-7, and SKBR-3 improved from 88% to 93%, 88% to 94%, and 92% to 95%, respectively. Finally, the fusion gene detection rates for BT-474, MCF-7, and SKBR-3 increased from 48% to 57%, 49% to 53%, and 50% to 53%, respectively. These results suggest that the SSC method is effective not only for increasing the percentage of uniquely mapped reads but also for improving fusion gene detection.
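The abstract does not specify how shifted reads are matched or merged, so the following is only a minimal sketch of the general idea, not the authors' SSC implementation. It assumes exact matching with no sequencing errors: a read is treated as the shifted partner of another when it overlaps it with an offset of exactly `shift` bases, and chains of such reads are merged greedily into one longer representative sequence. The function name `extend_reads` and the greedy chaining strategy are hypothetical choices made for illustration.

```python
from collections import defaultdict


def extend_reads(reads, shift=1):
    """Greedily merge reads that overlap with an offset of `shift` bases.

    Illustrative assumption: exact matches only, no error tolerance.
    Returns one representative sequence per chain of shifted reads.
    """
    if shift < 1:
        raise ValueError("shift must be at least 1")

    unique = list(dict.fromkeys(reads))  # drop duplicates, keep input order

    # Index each read by its prefix, i.e. the read without its last `shift` bases.
    by_prefix = defaultdict(list)
    for read in unique:
        if len(read) > shift:
            by_prefix[read[:-shift]].append(read)

    # A read whose prefix equals another read's suffix continues that read,
    # so chains should preferably start from reads with no predecessor.
    suffixes = {read[shift:] for read in unique if len(read) > shift}
    starts = [r for r in unique if len(r) <= shift or r[:-shift] not in suffixes]

    used = set()
    representatives = []
    # The second pass over `unique` picks up reads left over in cycles.
    for start in starts + unique:
        if start in used:
            continue
        used.add(start)
        representative, current = start, start
        while True:
            candidates = by_prefix.get(current[shift:], [])
            successor = next((c for c in candidates if c not in used), None)
            if successor is None:
                break
            used.add(successor)
            representative += successor[-shift:]  # append only the new bases
            current = successor
        representatives.append(representative)
    return representatives


if __name__ == "__main__":
    toy_reads = ["ACGTAC", "CGTACG", "GTACGT", "TTTTGG"]
    print(extend_reads(toy_reads, shift=1))
```

In this toy input the first three reads are each shifted by one base relative to the previous one, so they collapse into a single sequence two bases longer than any input read, while the unrelated fourth read is returned unchanged. A practical implementation would also need to tolerate mismatches and handle paired-end reads, which this sketch does not attempt.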
