IEEE International Conference on Bioinformatics and Biomedicine

Ribosomal in-frame mis-translation of stop codons in multiple open reading frames of specific human long non-coding RNAs.

Abstract

One of the major discoveries of the early post-genomic era, as embodied by the gene catalogs of the FANTOM and ENCODE (Encyclopedia of DNA Elements) consortia, is that two-thirds of human genes do not encode known proteins. Those ~40,000 non-protein-coding (non-coding RNA, ncRNA) human genes (www.gencodegenes.org) remain poorly understood. Long non-coding RNA (lncRNA) genes are the most numerous category of human ncRNA genes. Hundreds of lncRNAs have recently discovered functions and are now understood to be fundamental regulators of gene expression (nuclear and cytoplasmic, epigenetic and post-transcriptional, positive and negative) in normal cellular functions and in a wide range of human diseases. However, the functions, if any, of the vast majority of lncRNAs remain obscure. Significantly, an unconventional role for their transcripts as unexpected de facto messenger RNAs has not been formally excluded. Ribosome profiling (Riboseq) predicts translational potential; nonetheless, without independent evidence of proteins matching lncRNA open reading frames (ORFs), ribosome binding does not prove translation. We were the first to mass-spectrometrically document translation of specific lncRNAs in human cells (https://genome.cshlp.org/content/22/9/1646.long). We have now performed a global search for lncRNA translation in human MCF7 breast cancer cells, integrating strand-specific RNAseq, Riboseq, and deep mass spectrometry of trypsin-digested <15 kDa fractions post-UHPLC (Orbitrap MS/MS) in biological quadruplicates by two independent core facilities. We excluded known-protein matches. UCSC Genome Browser-assisted manual annotation of imperfect alignments between tryptic-digest peptides and three-frame translations of lncRNAs initially revealed three peptides hypothetically explicable by “stop-to-nonstop” in-frame replacement of stop codons by amino acids in two ORFs of the lncRNA MMP24-AS1.
To search for this phenomenon genome-wide, we designed and implemented an unprecedented computational pipeline that matches tryptic-digest spectra to wildcard-instead-of-stop versions of repeat-masked, six-frame, whole-genome translations. Along with singleton stop-to-nonstop events affecting four other lncRNAs, we identified 24 additional peptides with stop-to-nonstop in-frame substitutions from multiple MMP24-AS1 lncRNA ORFs. Only UAG and UGA stop codons were affected; UAA was not. All MMP24-AS1-matching spectra met the same significance thresholds as high-confidence known-protein signatures. Targeted resequencing of MMP24-AS1 genomic DNA and cDNA from the same samples did not reveal any mutations, polymorphisms, or sequencing-detectable RNA editing. We have therefore discovered an apparent gene-specific violation of the genetic code. It highlights the importance of matching peptides to whole-genome, not known-genes-only, ORFs in mass-spectrometry workflows, and suggests a new mechanism enhancing the combinatorial complexity of the proteome.
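
The wildcard-instead-of-stop idea can be illustrated with a small sequence-level sketch. It is not the authors' pipeline, which matches MS/MS spectra against repeat-masked whole-genome translations; this toy version only compares peptide strings against a six-frame translation of a short DNA string, and every function name (`translate_with_wildcard`, `six_frames`, `find_readthrough`) and the wildcard symbol `X` are assumptions made for illustration. DNA uses T where the abstract writes U, so UAG, UGA and UAA appear as TAG, TGA and TAA.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate_with_wildcard(dna, wildcard="X"):
    """Translate one frame, writing a wildcard wherever a stop codon occurs.

    Returns (protein, stops); stops maps protein index -> stop codon.
    Codons with ambiguous bases (e.g. N) also become the wildcard but are
    not recorded as stops.
    """
    protein, stops = [], {}
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        aa = CODON_TABLE.get(codon, wildcard)
        if aa == "*":
            stops[len(protein)] = codon
            aa = wildcard
        protein.append(aa)
    return "".join(protein), stops


def six_frames(dna):
    """Yield (strand, offset, protein, stops) for all six reading frames."""
    dna = dna.upper()
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            protein, stops = translate_with_wildcard(seq[offset:])
            yield strand, offset, protein, stops


def find_readthrough(peptide, dna):
    """List frame matches that require reading a stop codon as an amino acid.

    Each hit is (strand, offset, protein_start, [(stop_codon, amino_acid)]).
    """
    hits = []
    n = len(peptide)
    for strand, offset, protein, stops in six_frames(dna):
        for start in range(len(protein) - n + 1):
            subs = []
            for j in range(n):
                pos = start + j
                if pos in stops:
                    # Wildcard position: any residue is accepted, record it.
                    subs.append((stops[pos], peptide[j]))
                elif protein[pos] != peptide[j]:
                    break
            else:
                if subs:  # keep only matches that span at least one stop
                    hits.append((strand, offset, start, subs))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence: ATG GCT TAG AAA CGT -> M A (stop) K R
    print(find_readthrough("MAQKR", "ATGGCTTAGAAACGT"))
```

Recording the stop codon alongside the substituted residue is what would allow a tally by codon type, such as the abstract's observation that UAG and UGA, but not UAA, were affected.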
