Prediction of enhancer-promoter interactions via natural language processing

Wanwen Zeng; Mengmeng Wu; Rui Jiang

Abstract

Precise identification of three-dimensional genome organization, especially enhancer-promoter interactions (EPIs), is important for deciphering gene regulation, cell differentiation and disease mechanisms. Distinguishing true interactions from nearby non-interacting pairs remains challenging, because traditional experimental methods are limited by low resolution or low throughput. We propose a novel computational framework, EP2vec, to assay three-dimensional genomic interactions. We first extract sequence embedding features, defined as fixed-length vector representations learned from variable-length sequences using an unsupervised deep learning method from natural language processing. Then, we train a classifier to predict EPIs from the learned representations in a supervised way. Experimental results demonstrate that EP2vec obtains F1 scores ranging from 0.841 to 0.933 on different datasets, outperforming existing methods. We verify the robustness of the sequence embedding features by carrying out a sensitivity analysis. In addition, by adopting an attention mechanism to analyze the learned sequence embedding features, we identify motifs that represent cell line-specific information. Last, we show that even better performance, with F1 scores of 0.889 to 0.940, can be achieved by combining sequence embedding features with experimental features. EP2vec sheds light on feature extraction for DNA sequences of arbitrary length and provides a powerful approach for EPI identification.
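The two ends of the pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a DNA sequence is turned into "words" by sliding a window of length k over it, which is a common way to apply NLP embedding methods to sequences, and it computes the F1 score used as the evaluation metric. The function names `kmer_words` and `f1_score`, and the defaults `k=6` and `stride=1`, are hypothetical choices for illustration; the abstract does not state them. The embedding and classifier steps in between are omitted.

```python
from typing import List, Sequence


def kmer_words(seq: str, k: int = 6, stride: int = 1) -> List[str]:
    """Split a DNA sequence into overlapping k-mer 'words'.

    k and stride are illustrative assumptions, not values from the abstract.
    A sequence shorter than k yields an empty list.
    """
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = harmonic mean of precision and recall for binary labels (1 = interacting)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # A variable-length sequence becomes a variable-length "sentence" of k-mers;
    # an embedding model would then map each sentence to a fixed-length vector.
    print(kmer_words("ACGTAC", k=3))
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

In a full pipeline, the k-mer sentences for each enhancer and promoter would be passed to an unsupervised embedding model, and the resulting fixed-length vectors would be used to train the supervised classifier.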

Bibliographic details

Source: BMC Genomics, 2018, Issue 2
Authors: Wanwen Zeng; Mengmeng Wu; Rui Jiang
Original format: PDF
CLC classification: Medical genetics

