Hidden Space and Segmented Labelling in Text Visualization

Abstract

Text visualization can interpret large documents using various linguistic units and glyphs. Each unit has its own advantages in intuitiveness and precision, and can be visualized with different space efficiencies. For example, a histogram of word frequency is an intuitive glyph but not a precise one, while a word embedding can be optimized globally but is not intuitive. Previous studies have applied many linguistic units and glyphs in implicit combinations, but lack an approach that aligns those units explicitly to sustain interpretability and predictability. On the other hand, the ever-growing number of feature reduction and selection methods requires a framework to compare and interpret hidden spaces for large-volume documents. To align and visualize linguistic units intensively, we propose a visualization method that interprets a document through its distribution over a continuous space. The accuracy of segmented labelling is also compared with that of the min-max and entropy methods. The results show that: 1) our visualization is flexible and efficient in exhibiting large-volume documents; 2) in terms of feature selection accuracy, segmented labelling has a comparable advantage across various parameters of the hidden space.
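
The word-frequency histogram that the abstract mentions as an intuitive glyph can be illustrated with a minimal Python sketch. The function names (`word_frequencies`, `text_histogram`, `term_entropy`) and the tokenization rule are assumptions made for illustration only. The entropy score shown is the generic Shannon entropy of a term's spread over documents; it is not claimed to be the entropy method, the min-max method, or the segmented labelling evaluated in the paper.

```python
from collections import Counter
import math
import re


def word_frequencies(text):
    """Count lowercase word tokens (assumed tokenization: runs of letters and apostrophes)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def text_histogram(freqs, top=5, width=20):
    """Render the most frequent words as rows of '#' bars scaled to the top count."""
    most_common = freqs.most_common(top)
    if not most_common:
        return []
    peak = most_common[0][1]
    return [
        f"{word:<12}{'#' * max(1, round(width * count / peak))} {count}"
        for word, count in most_common
    ]


def term_entropy(term, documents):
    """Shannon entropy (in bits) of how a term's occurrences spread over documents.

    Illustrative scoring only: 0.0 means the term is concentrated in one
    document (or absent); higher values mean it is spread more evenly.
    """
    counts = [word_frequencies(doc)[term] for doc in documents]
    total = sum(counts)
    if total == 0:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    return sum(-p * math.log2(p) for p in probs)


if __name__ == "__main__":
    docs = ["the cat sat on the mat", "the dog barked", "a cat and a cat"]
    for row in text_histogram(word_frequencies(" ".join(docs))):
        print(row)
    print(term_entropy("cat", docs))
```

The histogram is easy to read but discards word order and context, which is the kind of intuition-versus-precision trade-off the abstract describes.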

Bibliographic record

Source
International Academic Conference on Frontiers in Social Sciences and Management Innovation | 2020 | 451p | 12 pages
Authors
Xiaoguang ZHU; Xin CAI; Peiyao NIE
Original format: PDF
Chinese Library Classification: C93-53
Keywords
Hidden space; Segmented labelling; Feature reduction; Text visualization; Model interpretation
