Journal: Plant and Cell Physiology

De Novo Assembly of Expressed Transcripts and Global Analysis of the Phalaenopsis aphrodite Transcriptome



Abstract

Being one of the largest families in the angiosperms, Orchidaceae display great biodiversity resulting from adaptation to diverse habitats. Genomic information on orchids is rather limited, despite their unique and interesting biological features, thus impeding advanced molecular research. Here we report a strategy to integrate sequence outputs of the moth orchid, Phalaenopsis aphrodite, from two high-throughput sequencing platforms, Roche 454 and Illumina/Solexa, in order to maximize assembly efficiency. Tissues collected for cDNA library preparation included a wide range of vegetative and reproductive tissues. We also designed an effective workflow for annotation and functional analysis. After assembly and trimming, 233,823 unique sequences were obtained. Among them, 42,590 contigs averaging 875 bp in length were annotated to protein-coding genes, of which 7,263 coding genes were found to be nearly full length. The sequence accuracy of the assembled contigs was validated to be as high as 99.9%. Genes with tissue-specific expression were also categorized by profiling analysis with RNA-Seq. Gene products targeted to specific subcellular localizations were identified by their annotations. We conclude that, with proper assembly to combine outputs of next-generation sequencing platforms, transcriptome information can be enriched in gene discovery, functional annotation and expression profiling of a non-model organism.
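
The abstract reports summary figures for the assembly, such as the number of sequences and the average contig length. As a rough illustration of how figures of this kind could be computed from a FASTA file of assembled contigs, here is a minimal Python sketch. It is not the authors' workflow: the function names are hypothetical, and the N50 value is an extra, commonly used assembly metric that the abstract does not report.

```python
from statistics import mean


def read_fasta_lengths(lines):
    """Yield (name, length) for each record in an iterable of FASTA lines."""
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, length
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name, length = (parts[0] if parts else ""), 0
        else:
            length += len(line)  # sequences may span several lines
    if name is not None:
        yield name, length


def summarize(lengths):
    """Return count, mean length, and N50 for a list of contig lengths."""
    lengths = sorted(lengths, reverse=True)
    if not lengths:
        return {"count": 0, "mean_bp": 0, "n50_bp": 0}
    total = sum(lengths)
    running, n50 = 0, 0
    for contig_len in lengths:
        running += contig_len
        # N50: length of the contig at which half of the total bases is reached.
        if running * 2 >= total:
            n50 = contig_len
            break
    return {"count": len(lengths), "mean_bp": mean(lengths), "n50_bp": n50}


if __name__ == "__main__":
    # Toy input for illustration only; real data would be read from a file.
    toy = [">c1 example", "ACG", "TAC", ">c2", "ACGT", ">c3", "AC"]
    print(summarize([n for _, n in read_fasta_lengths(toy)]))
```

With real data, the lines would come from an open file handle (for example `open("contigs.fasta")`, a placeholder file name) instead of the toy list.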
