Source: 《情报学报》 (Chinese-language journal)

Text Information Mining for Scientific Research Literature Based on Hierarchical Segmentation

         

Abstract

This paper proposes a hierarchical segmentation method for text processing. Drawing on the well-structured information in scientific literature, it builds a hierarchical information model of each document and gives a dedicated information extraction method for each type of node in the model. For the content of the body-text node, a range similarity (间隔相似度) measure is proposed for text segmentation. Topic keywords are extracted from the resulting segments, and the extracted keywords are then used for similarity analysis and text mining across scientific documents. Experimental results show that the weighted similarity of the hierarchical information model is suitable for discovering research hot spots, and that hierarchical segmentation supports both comparative analysis of different nodes of scientific documents and mining of research content.
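The abstract does not define the range similarity measure, so the following is only a minimal sketch of the general idea of similarity-based text segmentation, not the authors' method. It assumes a bag-of-words cosine similarity between the sentence windows on either side of each gap, and it places a boundary wherever that similarity drops below a threshold. The names `bag_of_words`, `cosine`, and `segment`, and the parameters `window` and `threshold`, are all hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(sentences):
    """Count lowercase word tokens across a list of sentences."""
    counts = Counter()
    for s in sentences:
        counts.update(re.findall(r"[a-z]+", s.lower()))
    return counts


def cosine(a, b):
    """Cosine similarity between two Counter term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def segment(sentences, window=2, threshold=0.1):
    """Split sentences into segments at gaps with low cross-gap similarity.

    Assumption: this windowed cosine test stands in for the paper's
    range similarity, whose exact definition is not given in the abstract.
    """
    boundaries = []
    for gap in range(1, len(sentences)):
        left = bag_of_words(sentences[max(0, gap - window):gap])
        right = bag_of_words(sentences[gap:gap + window])
        if cosine(left, right) < threshold:
            boundaries.append(gap)

    segments, start = [], 0
    for b in boundaries + [len(sentences)]:
        segments.append(sentences[start:b])
        start = b
    return segments


sentences = [
    "neural networks learn features",
    "deep neural networks need data",
    "protein folding uses energy",
    "protein structure affects folding",
]
for seg in segment(sentences):
    print(seg)
```

On this toy input the only gap where the two windows share no vocabulary is between the second and third sentences, so the sketch yields two segments. Chinese text would need a word segmenter in place of the regular-expression tokenizer used here.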
