A study of parallel techniques for dimensionality reduction and its impact on the quality of text processing algorithms

Abstract

The presented algorithms employ the Vector Space Model (VSM) and its enhancements, such as TFIDF (Term Frequency Inverse Document Frequency) combined with Singular Value Decomposition (SVD). TFIDF was applied to emphasize the important features of documents, and SVD was used to reduce the analysis space. A series of experiments was then conducted, which revealed important properties of the algorithms and their accuracy. The accuracy of the algorithms was estimated in terms of their ability to match the human classification of the subject. For unsupervised algorithms, entropy was used as the quality evaluation measure. The combination of VSM, TFIDF, and SVD proved to be the best-performing unsupervised algorithm, with an entropy of 0.16.
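
As a rough illustration of two ingredients named in the abstract, the sketch below computes TFIDF weights and a cluster entropy score in plain Python. The exact weighting formula, the normalization of entropy by the logarithm of the number of classes, and the function names `tfidf` and `clustering_entropy` are assumptions made for this example and are not taken from the paper. The SVD step is left out to keep the example free of external dependencies.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: relative term frequency times log(N / document frequency).
    """
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        tf = Counter(doc)
        weighted.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weighted


def clustering_entropy(clusters, labels):
    """Size-weighted entropy of human labels inside each cluster.

    Assumed normalization: divide by log(number of classes), so 0 means
    every cluster is pure and 1 means labels are spread evenly.
    """
    classes = set(labels)
    if len(classes) < 2:
        return 0.0
    n = len(labels)
    by_cluster = {}
    for cluster_id, label in zip(clusters, labels):
        by_cluster.setdefault(cluster_id, []).append(label)
    total = 0.0
    for members in by_cluster.values():
        counts = Counter(members)
        size = len(members)
        h = -sum((k / size) * math.log(k / size) for k in counts.values())
        total += (size / n) * h
    return total / math.log(len(classes))


if __name__ == "__main__":
    docs = [["svd", "reduces", "space"], ["tfidf", "weights", "terms"], ["svd", "tfidf"]]
    print(tfidf(docs)[2])
    # Cluster assignments versus hypothetical human labels for four documents.
    print(clustering_entropy([0, 0, 1, 1], ["a", "a", "b", "a"]))
```

Under this convention a lower score means the clusters agree more closely with the human classification, which matches the way the abstract reports entropy as a quality measure.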

Bibliographic record

  • Source
    Pomiary Automatyka Kontrola | 2015, No. 7 | pp. 352-354 | 3 pages
  • Author affiliations

    AGH UNIVERSITY OF SCIENCE AND TECHNOLOGY, 30 Mickiewicza Ave., 30-059 Krakow, Poland ACC CYFRONET AGH, 11 Nawojki St., 30-950 Krakow, Poland;

    AGH UNIVERSITY OF SCIENCE AND TECHNOLOGY, 30 Mickiewicza Ave., 30-059 Krakow, Poland ACC CYFRONET AGH, 11 Nawojki St., 30-950 Krakow, Poland;

    AGH UNIVERSITY OF SCIENCE AND TECHNOLOGY, 30 Mickiewicza Ave., 30-059 Krakow, Poland ACC CYFRONET AGH, 11 Nawojki St., 30-950 Krakow, Poland;

    AGH UNIVERSITY OF SCIENCE AND TECHNOLOGY, 30 Mickiewicza Ave., 30-059 Krakow, Poland ACC CYFRONET AGH, 11 Nawojki St., 30-950 Krakow, Poland;

  • Indexing information
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    Singular Value Decomposition; Vector Space Model; TFIDF;