
HDGSOMr: a high dimensional growing self-organizing map using randomness for efficient Web and text mining


Abstract

Mining text data from the Web has become a necessity because of the volume of data available there. Although search engines are the popular way to find information on the Web, feature map techniques remain widely used for analyzing the content of large collections of Web pages. One problem with processing large collections of Web text using feature map techniques is the time taken to cluster them. This paper presents HDGSOMr, an algorithm based on a growing variant of the self-organizing map. The algorithm incorporates randomness into the self-organizing process to produce higher-quality clusters within a small number of epochs while using smaller neighborhood sizes, which significantly reduces overall processing time. Details of the HDGSOMr algorithm are presented, along with results from processing large collections of text data that demonstrate its efficiency.
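The abstract does not say how randomness or map growth are implemented in HDGSOMr, so the sketch below does not attempt to reproduce the algorithm. It only illustrates the baseline the abstract builds on: a single update step of a standard self-organizing map on a fixed grid, with an explicit neighborhood radius. All names (`make_grid`, `best_matching_unit`, `som_update`) and parameter values are assumptions chosen for illustration.

```python
import math
import random


def make_grid(rows, cols, dim, seed=0):
    """Build a rows x cols map with randomly initialized weight vectors (illustrative helper)."""
    rng = random.Random(seed)
    return {(r, c): [rng.random() for _ in range(dim)]
            for r in range(rows) for c in range(cols)}


def best_matching_unit(weights, x):
    """Return the grid position whose weight vector is closest to x (squared Euclidean distance)."""
    return min(weights,
               key=lambda pos: sum((w - v) ** 2 for w, v in zip(weights[pos], x)))


def som_update(weights, x, learning_rate=0.5, radius=1.0):
    """Apply one standard SOM update for input x, in place, and return the winning position.

    This is the textbook SOM rule, not HDGSOMr: no growth and no extra randomness.
    """
    winner = best_matching_unit(weights, x)
    for pos, w in weights.items():
        grid_dist_sq = (pos[0] - winner[0]) ** 2 + (pos[1] - winner[1]) ** 2
        if grid_dist_sq > radius ** 2:
            continue  # outside the neighborhood: this node is left unchanged
        # Gaussian neighborhood function: 1.0 at the winner, decaying with grid distance
        influence = math.exp(-grid_dist_sq / (2 * radius ** 2))
        weights[pos] = [wi + learning_rate * influence * (xi - wi)
                        for wi, xi in zip(w, x)]
    return winner


# Usage: one update on a small map with a 3-dimensional input vector
weights = make_grid(2, 2, dim=3)
winner = som_update(weights, [1.0, 0.0, 0.0], learning_rate=0.5, radius=1.0)
```

In this sketch the cost of each update grows with the number of nodes that fall inside `radius`, which is one way to see why the abstract's smaller neighborhood sizes and smaller epoch counts would reduce processing time.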
