Journal: Data & Knowledge Engineering

A novel representation in three-dimensions for high dimensional data sets

Abstract

Data representation is an important topic in the field of data engineering. In this paper, we focus on the representation of high dimensional data sets. We present a construction method for the set-valued mapping in 3-C representation and propose a novel representation algorithm based on the K-means clustering method. The main contribution is to obtain the cluster centers of these high dimensional data sets and to derive the corresponding coordinates in 3-C space by projecting along each center's direction. To verify the effectiveness of the proposed method, three sets of experiments were conducted. The first uses ten data sets from UCI. The second uses web images from Corel5k. The last uses the syllabus data set, which consists of text documents from the MIT OpenCourseWare project. The results show that the similarity among data points or attributes is displayed clearly, and they demonstrate the feasibility and scalability of the proposed algorithm. The results on the web images and the syllabus data set are particularly good. As a result, the proposed representation algorithm in three-dimensional space should have a significant influence on data classification and dimensionality reduction.
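
The idea outlined in the abstract (cluster centers from K-means, then coordinates obtained by projecting along each center's direction) can be illustrated with a rough sketch. This is not the paper's algorithm: the abstract does not give the set-valued mapping, the number of clusters, or the exact projection. The sketch assumes three clusters and takes each 3-D coordinate to be the scalar projection of a point onto the unit vector of one cluster center. The function names `kmeans`, `project_to_3d`, and `represent_3d` are hypothetical.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-means; returns k cluster centers as lists of floats."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
            groups[nearest].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        new_centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def project_to_3d(points, centers):
    """Map each point to its scalar projections onto the unit center directions.

    Assumption: one output coordinate per center, so three centers give 3-D output.
    """
    units = []
    for c in centers:
        norm = math.sqrt(sum(x * x for x in c)) or 1.0  # guard against a zero center
        units.append([x / norm for x in c])
    return [tuple(sum(a * b for a, b in zip(p, u)) for u in units) for p in points]


def represent_3d(points, seed=0):
    """Assumed pipeline: K-means with k=3, then projection along the three centers."""
    centers = kmeans(points, k=3, seed=seed)
    return project_to_3d(points, centers)
```

Calling `represent_3d(data)` on a list of equal-length numeric lists returns one 3-tuple per input point, which can then be passed to any 3-D scatter plot. Choosing `k=3` ties the number of clusters to the number of display axes, which is a simplification made here for illustration only.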