Journal: 《计算机工程与科学》 (Computer Engineering & Science)

A Classification Algorithm for Massive Unstructured Complaint Data Based on Text Similarity Matrix Operations

Abstract

With the rapid development of the Internet and information technology, the volume of unstructured data is growing exponentially, and the popularity of Web 2.0 online communities has accelerated this growth further. How to manage and organize large-scale unstructured data effectively, so that end users can access information easily, has therefore become an urgent and important research topic. This paper models unstructured data as text and compares text similarity, and on that basis studies and discusses classification algorithms for large-scale unstructured data. The algorithm is applied to a China Mobile user complaint data classification system. After the system was deployed, the efficiency of processing complaint data improved substantially, which supports the effectiveness of the proposed classification algorithm and system architecture.
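The abstract names text modeling and text similarity as the basis of the classifier but does not spell out the procedure. The following is only an illustrative sketch of the general idea, not the paper's algorithm: it assumes a bag-of-words term-frequency model, cosine similarity, and nearest-neighbor label assignment, none of which the abstract confirms. The function names (`tf_vector`, `similarity_matrix`, `classify`) and the sample complaints are hypothetical. Whitespace tokenization is used for brevity; Chinese complaint text would first need a word segmenter.

```python
import math
from collections import Counter


def tf_vector(text):
    """Model a text as a term-frequency vector (assumed bag-of-words model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def similarity_matrix(queries, references):
    """Rows are query texts, columns are reference texts."""
    q_vecs = [tf_vector(t) for t in queries]
    r_vecs = [tf_vector(t) for t in references]
    return [[cosine(q, r) for r in r_vecs] for q in q_vecs]


def classify(queries, labeled):
    """Give each query the label of its most similar labeled text."""
    texts = [text for text, _ in labeled]
    labels = [label for _, label in labeled]
    sim = similarity_matrix(queries, texts)
    return [labels[max(range(len(row)), key=row.__getitem__)] for row in sim]


# Hypothetical labeled complaints and new, unlabeled ones.
labeled = [
    ("bill charged twice this month", "billing"),
    ("no signal at home network down", "network"),
]
new_complaints = ["charged twice on my bill", "network signal keeps dropping"]
predicted = classify(new_complaints, labeled)
```

In this sketch each row of the similarity matrix compares one incoming complaint against every labeled complaint, so classification reduces to a row-wise maximum. A production system at the scale the abstract describes would likely need sparse matrix operations and term weighting rather than nested Python lists.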
