Research on Automatic Patent Query Expansion Based on Word Embedding

刘梦兰; 刘斌; 彭智勇


Abstract

Patent retrieval differs substantially from general text retrieval. A patent document comprises distinct parts, including the claims, the abstract, and the full text, so retrieval algorithms designed for ordinary text cannot simply be applied to patents. Patent retrieval typically suffers from low recall, for two reasons. First, patent text is highly specialized and uses complex terminology, so the keywords a user enters often fail to capture the search intent, which leads to unsatisfactory results. Second, patent drafters deliberately coin distinctive vocabulary, which keeps relevant patents from being retrieved. Many existing methods aim to improve recall, but many problems remain open and retrieval effectiveness still needs improvement. We propose an automatic patent query expansion method based on word embeddings. On top of the word embeddings, a keyword network for the patent domain is constructed, and a dense subgraph discovery algorithm is then used to find the set of expansion terms, which improves their effectiveness. Extensive experiments on the CLEF-IP 2012 dataset show that the proposed algorithm obtains expansion terms flexibly and effectively and further improves the recall of patent retrieval.
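The abstract names three steps (word embeddings, a keyword network, dense subgraph discovery) without giving details. The following is a minimal sketch of that pipeline under stated assumptions, not the authors' implementation: edges are assumed to link words whose cosine similarity reaches a threshold, the dense subgraph is found with the standard greedy peeling heuristic (repeatedly removing the lowest-degree node), and the toy vectors, the threshold of 0.75, and all function names are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_keyword_graph(words, vectors, threshold):
    """Assumed edge rule: link two words if their similarity >= threshold."""
    graph = {w: set() for w in words}
    for a, b in combinations(sorted(words), 2):
        if cosine(vectors[a], vectors[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def densest_subgraph(graph):
    """Greedy peeling: drop the lowest-degree node each round and keep the
    node set with the highest edges/nodes ratio (ties favour smaller sets)."""
    adj = {n: set(nb) for n, nb in graph.items()}
    edges = sum(len(nb) for nb in adj.values()) // 2
    best_nodes = set(adj)
    best_density = edges / len(adj) if adj else 0.0
    while len(adj) > 1:
        node = min(adj, key=lambda n: (len(adj[n]), n))
        edges -= len(adj[node])
        for neighbour in adj.pop(node):
            adj[neighbour].discard(node)
        density = edges / len(adj)
        if density >= best_density:
            best_nodes, best_density = set(adj), density
    return best_nodes


def expand_query(query_terms, vectors, threshold=0.75):
    """Return expansion terms: members of the dense subgraph built around
    the query terms, excluding the query terms themselves."""
    query = [t for t in query_terms if t in vectors]
    # Candidates: vocabulary words close to at least one query term.
    candidates = {
        w for w in vectors
        if any(cosine(vectors[w], vectors[q]) >= threshold for q in query)
    }
    dense = densest_subgraph(build_keyword_graph(candidates, vectors, threshold))
    return sorted(dense - set(query))


# Hypothetical toy embeddings; real ones would be trained on patent text.
VECTORS = {
    "engine": (1.0, 0.0, 0.0),
    "motor": (0.8, 0.6, 0.0),
    "powerplant": (0.8, 0.5, 0.1),
    "piston": (0.8, -0.6, 0.0),
    "banana": (0.0, 0.0, 1.0),
}

print(expand_query(["engine"], VECTORS))
```

In this toy vocabulary, "piston" is close to the query term but to none of the other candidates, so the peeling step discards it. This illustrates why a density criterion can be stricter than a plain nearest-neighbour lookup.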

Bibliographic details

Source
Computer Engineering & Science (《计算机工程与科学》), 2017, No. 12, pp. 2297-2305, 9 pages
Authors
刘梦兰; 刘斌; 彭智勇
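Author affiliations

State Key Laboratory of Software Engineering, Wuhan University, Wuhan, Hubei 430072;

School of Computer Science, Wuhan University, Wuhan, Hubei 430072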
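Original format: PDF
Language of text: Chinese
Chinese Library Classification: Information processing
Keywords
patent retrieval; query expansion; word embedding; deep learning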
