Hidden Space and Segmented Labelling in Text Visualization

Abstract

Text visualization can interpret large documents using various linguistic units and glyphs. Each unit has its own advantages in intuitiveness and precision, and can be visualized at different levels of space efficiency. For example, a word-frequency histogram is an intuitive glyph but not a precise one, whereas a word embedding can be optimized globally but is not intuitive. Previous studies have applied many linguistic units and glyphs in implicit combinations, but they lack an approach for aligning those units explicitly so as to sustain interpretability and predictability. On the other hand, the ever-growing number of feature reduction and selection methods calls for a framework to compare and interpret hidden spaces with regard to large-volume documents. To align and visualize linguistic units intensively, we propose a visualization method that interprets a document through its distribution over a continuous space. In addition, the accuracy of segmented labelling is compared with that of the min-max and entropy methods. The results show that: 1) our visualization is flexible and efficient in exhibiting large-volume documents; 2) in terms of feature selection accuracy, segmented labelling has a comparable advantage across various parameters of the hidden space.
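
The abstract names a word-frequency histogram as a glyph, and min-max and entropy as comparison methods, without defining them. As a rough illustration only, the sketch below computes the generic textbook versions of these quantities: token counts, min-max rescaling of those counts, and the Shannon entropy of the count distribution. The function names and the whitespace tokenizer are assumptions made for this example; the paper's own segmented labelling method and its exact baselines are not reproduced here.

```python
import math
from collections import Counter


def word_frequencies(text):
    """Count lowercase whitespace-separated tokens (a deliberately naive tokenizer)."""
    return Counter(text.lower().split())


def min_max_scale(values):
    """Rescale values linearly to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def shannon_entropy(counts):
    """Shannon entropy, in bits, of the distribution implied by raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


if __name__ == "__main__":
    # Hypothetical toy document, not data from the paper.
    doc = "the cat sat on the mat the end"
    freqs = word_frequencies(doc)
    words = sorted(freqs)
    counts = [freqs[w] for w in words]
    # Print a text histogram alongside the min-max scaled frequency.
    for word, count, scaled in zip(words, counts, min_max_scale(counts)):
        print(f"{word:<4} {'#' * count:<4} scaled={scaled:.2f}")
    print(f"entropy = {shannon_entropy(counts):.3f} bits")
```

The histogram is easy to read at a glance but discards word order and context, which matches the abstract's remark that such a glyph is intuitive but not precise.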
