Source: Journal of Beijing University of Technology (《北京工业大学学报》)

Stochastic Neighbor Embedding Algorithm Based on the Fisher Information Metric

Abstract

To improve the accuracy of text classification, a stochastic neighbor embedding algorithm based on the Fisher information metric (FIMSNE) is proposed. First, the word-frequency vectors of the texts are treated as sample points of probability density functions on a statistical manifold, and the distances between these points are computed with the Fisher information metric. Then, from the viewpoint of information geometry, t-distributed stochastic neighbor embedding (t-SNE) is modified to obtain the new algorithm. Two-dimensional embedding and classification experiments on a real text dataset show that FIMSNE outperforms t-SNE, Fisher information nonparametric embedding (FINE), and principal component analysis (PCA) overall.
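The distance step described in the abstract can be illustrated with a minimal sketch. It assumes that each word-frequency vector is normalized into a multinomial distribution, for which the Fisher information (geodesic) distance has the well-known closed form d(p, q) = 2 arccos(Σ_i sqrt(p_i q_i)). The abstract does not say which form the authors actually use, and the function names `fisher_distance` and `pairwise_fisher` are hypothetical.

```python
import numpy as np

def fisher_distance(p_counts, q_counts):
    """Fisher information distance between two multinomials (assumed closed form)."""
    p = np.asarray(p_counts, dtype=float)
    q = np.asarray(q_counts, dtype=float)
    # Normalize raw word counts into probability vectors.
    p = p / p.sum()
    q = q / q.sum()
    # Bhattacharyya coefficient; clip guards against round-off just above 1.
    bc = np.clip(np.sum(np.sqrt(p * q)), 0.0, 1.0)
    return 2.0 * np.arccos(bc)

def pairwise_fisher(counts):
    """Pairwise distance matrix for the rows of a document-term count matrix."""
    X = np.asarray(counts, dtype=float)
    # Square roots of the row-normalized counts lie on the unit sphere.
    R = np.sqrt(X / X.sum(axis=1, keepdims=True))
    bc = np.clip(R @ R.T, 0.0, 1.0)
    D = 2.0 * np.arccos(bc)
    # Remove round-off so each document is at distance 0 from itself.
    np.fill_diagonal(D, 0.0)
    return D

if __name__ == "__main__":
    # Toy word counts for three documents (illustrative data only).
    docs = [[3, 1, 0], [0, 1, 3], [3, 1, 0]]
    print(pairwise_fisher(docs))
```

A matrix produced this way could be passed to a t-SNE implementation that accepts precomputed distances. That would only approximate the idea in the abstract and is not the authors' FIMSNE.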
