Designing a Deep Neural Network Model for Finding Semantic Similarity Between Short Persian Texts Using a Parallel Corpus

Abstract

Text processing is one of the central problems in artificial intelligence and has received considerable attention in recent decades. Numerous methods and algorithms have been proposed for semantic textual similarity, one of the sub-branches of text processing. Because of the special features of the Persian language and its non-standard writing system, finding semantic similarity is even more challenging in Persian. Producing a proper corpus for training a semantic similarity model is also of great importance. The main purpose of this study is to propose a method for measuring the semantic similarity between short Persian texts. To do so, we first build an appropriate corpus and then propose an efficient approach based on neural networks. The proposed method involves three steps. The first step is data collection and construction of a parallel corpus. In the second step, pre-processing, the data is normalized. Finally, semantic similarity recognition is performed by the neural network using vector representations of the words. The model is built on the resulting corpus, which consists of movie and TV show subtitles and contains 35,266 sentence pairs. The F-measure of the proposed approach on PAN2016 is 75.98% with 4 tags and 98.87% with 2 tags. On the parallel corpus with 2 tags, the model achieves an F-measure of 98.86%.
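
The abstract does not say what the normalization in the pre-processing step consists of. As a rough illustration only, the sketch below shows one common kind of Persian text normalization: mapping Arabic code points that are often typed in place of Persian letters, removing diacritics and tatweel, and collapsing whitespace. The function name `normalize_persian` and the specific rules are assumptions made for this example, not the authors' procedure.

```python
import re

# Assumed character map (not taken from the paper): Arabic code points
# that frequently appear in Persian text in place of the Persian letters.
_CHAR_MAP = str.maketrans({
    "\u064A": "\u06CC",  # Arabic yeh -> Persian yeh
    "\u0649": "\u06CC",  # alef maksura -> Persian yeh
    "\u0643": "\u06A9",  # Arabic kaf -> Persian kaf (keheh)
    "\u0640": None,      # tatweel (kashida) is dropped
})

# Arabic short-vowel and related marks, fathatan (U+064B) through sukun (U+0652).
_DIACRITICS = re.compile("[\u064B-\u0652]")
_WHITESPACE = re.compile(r"\s+")


def normalize_persian(text: str) -> str:
    """Return a normalized copy of `text` (hypothetical helper for illustration)."""
    text = text.translate(_CHAR_MAP)       # unify letter variants, drop tatweel
    text = _DIACRITICS.sub("", text)       # strip diacritics
    return _WHITESPACE.sub(" ", text).strip()  # collapse runs of whitespace


if __name__ == "__main__":
    raw = "\u0643\u062A\u0627\u0628  \u0639\u0644\u064A"
    print(normalize_persian(raw))
```

Normalizing both sentences of a pair with the same rules before looking up word vectors keeps spelling variants of one word from being treated as different tokens. A real pipeline would likely also need a policy for the zero-width non-joiner (U+200C), which this sketch leaves untouched.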

Bibliographic details

Source
International Conference on Web Research, 2021, pp. 91-96 (6 pages)
Authors
Zahra Sadat Hosseini Moghadam Emami; Shohreh Tabatabayiseifi; Mohammad Izadi; Mohammad Tavakoli
Original format
PDF
Keywords
Deep learning; Training; TV; Text recognition; Semantics; Neural networks; Writing

Similar literature

1. A multi-label text classification method via dynamic semantic representation model and deep neural network [J]. Applied Intelligence: The International Journal of Artificial Intelligence, Neural Networks, and Complex Problem-Solving Technologies. 2020, No. 8
2. Using Part-of-Speech Tags as Deep-Syntax Indicators in Determining Short-Text Semantic Similarity [J]. Dragan Bojić, Vuk Batanović. Computer Science and Information Systems. 2015, No. 1
3. Multi-corpus-Based Model for Measuring the Semantic Relatedness in Short Texts (SRST) [J]. El-Deeb Reem, Al-Zoghby Aya M., Elmougy Samir. Arabian Journal for Science and Engineering. 2018, No. 12
4. Pairwise Word Interaction Modeling with Deep Neural Networks for Semantic Similarity Measurement [C]. Hua He, Jimmy Lin. Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016
5. Developing a Cybersecurity Text Corpus and its Application for Augmenting Semantic Text Similarity [D]. Chavan, Manish Padmakar. 2014
6. Detection of medical text semantic similarity based on convolutional neural network [O]. Tao Zheng, Yimei Gao, Fei Wang. 2019
7. Pairwise Word Interaction Modeling with Deep Neural Networks for Semantic Similarity Measurement [O]. Hua He, Jimmy Lin. 2016
