International Journal on Informatics Visualization (JOIV)

Efficient processing of GRU based on word embedding for text classification



Abstract

Text classification has become a pressing problem for large organizations that must manage vast amounts of online data, and it has been extensively applied in Natural Language Processing (NLP) tasks. Text classification helps users manage and exploit meaningful information that must be sorted into various categories for further use. To classify texts well, our research develops a deep learning approach that achieves better performance in text classification than other RNN approaches. The main challenge in text classification is improving classification accuracy: the sparsity of the data and the sensitivity of semantics to context often hinder classification performance. To overcome this weakness, in this paper we propose a unified structure to investigate the effects of word embedding and the Gated Recurrent Unit (GRU) on text classification over two benchmark datasets (Google snippets and TREC). The GRU is a well-known type of recurrent neural network (RNN) capable of processing sequential data through its recurrent architecture. Empirically, semantically related words tend to lie near each other in the embedding space. First, the words in each post are converted into vectors via a word embedding technique. Then, the word sequences of sentences are fed to the GRU to extract the contextual semantics between words. The experimental results show that the proposed GRU model can effectively learn word usage in the context of texts given the training data; the quantity and quality of the training data significantly affect performance. We compared the proposed approach with traditional recurrent approaches (RNN, MV-RNN, and LSTM); the proposed approach obtains better results on the two benchmark datasets in terms of accuracy and error rate.
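The embedding-then-GRU pipeline the abstract describes can be sketched in pure Python. This is a minimal illustration only: the `EMBEDDINGS` table, the deterministic weight initialization, and the `encode` helper are assumptions for demonstration, not the authors' implementation, and real systems would use trained embeddings (e.g. word2vec) and learned GRU parameters.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

class GRUCell:
    """Minimal GRU cell (Cho et al., 2014 formulation).

    Deterministic toy weights stand in for trained parameters.
    """
    def __init__(self, input_size, hidden_size, scale=0.1):
        def mat(rows, cols):
            # Small fixed values in {-scale, 0, scale}; purely illustrative.
            return [[scale * ((i + j) % 3 - 1) for j in range(cols)]
                    for i in range(rows)]
        self.Wz, self.Uz = mat(hidden_size, input_size), mat(hidden_size, hidden_size)
        self.Wr, self.Ur = mat(hidden_size, input_size), mat(hidden_size, hidden_size)
        self.Wh, self.Uh = mat(hidden_size, input_size), mat(hidden_size, hidden_size)
        self.hidden_size = hidden_size

    def step(self, x, h):
        # Update gate z_t and reset gate r_t.
        z = [sigmoid(a + b) for a, b in zip(matvec(self.Wz, x), matvec(self.Uz, h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.Wr, x), matvec(self.Ur, h))]
        # Candidate state uses the reset-gated previous hidden state.
        rh = [ri * hi for ri, hi in zip(r, h)]
        h_tilde = [math.tanh(a + b)
                   for a, b in zip(matvec(self.Wh, x), matvec(self.Uh, rh))]
        # New state interpolates between the old state and the candidate.
        return [(1 - zi) * hi + zi * hti for zi, hi, hti in zip(z, h, h_tilde)]

# Hypothetical 3-dimensional word embeddings for a toy vocabulary.
EMBEDDINGS = {
    "what": [0.1, -0.2, 0.3],
    "is":   [0.0,  0.4, -0.1],
    "nlp":  [0.5,  0.1,  0.2],
}

def encode(sentence, cell):
    """Run the word sequence through the GRU; the final hidden state
    summarizes the sentence and would feed a softmax classifier."""
    h = [0.0] * cell.hidden_size
    for word in sentence.lower().split():
        h = cell.step(EMBEDDINGS[word], h)
    return h

cell = GRUCell(input_size=3, hidden_size=4)
sentence_vector = encode("what is nlp", cell)
```

In the paper's setting, `sentence_vector` would be passed to a classification layer trained on the Google snippets or TREC labels; here it simply shows how the recurrent update carries contextual information across the word sequence.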
