Journal: Journal of computer sciences

FAST REAL TIME ANALYSIS OF WEB SERVER MASSIVE LOG FILES USING AN IMPROVED WEB MINING ARCHITECTURE



Abstract

The web plays a vital role in locating information and in finding the reasons for organizing a system. As the number of web sites has grown, the volume of web log files produced by web searching has grown with it. Our task is to reduce these log files and to classify the results that best serve the analysis. To overcome the problem of excessive data in web mining, this study proposes path extraction using a Euclidean-distance-based algorithm combined with a sequential pattern clustering mining algorithm. First, we construct the Relational Information System from the original data sets. Second, we cluster the data sets with the Sequential Pattern Clustering Method to produce the Core of the Information System. The web mining core data is the most important and necessary information, the part of the original Information System that cannot be reduced away. It therefore gives the same effect in data analysis as the original data sets and can be used to construct classification models. Third, we apply the Sequential Pattern Clustering Method with the help of Path Extraction. The experiment shows that the proposed algorithm achieves high efficiency and avoids excessive data in follow-up data processing.
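The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea of grouping navigation paths by Euclidean distance. Everything in it is an assumption made for illustration: the `(visitor, page)` record format, the visit-count encoding of a path, the greedy threshold clustering, and all function names. In particular, the count encoding ignores visit order, so this is not the paper's sequential pattern clustering method.

```python
import math
from collections import defaultdict


def extract_paths(records):
    """Group (visitor, page) records, assumed time-ordered, into one path per visitor."""
    paths = defaultdict(list)
    for visitor, page in records:
        paths[visitor].append(page)
    return dict(paths)


def to_vector(path, pages):
    """Encode a path as visit counts over a fixed page ordering (order of visits is lost)."""
    return [path.count(p) for p in pages]


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_paths(paths, threshold):
    """Greedy clustering: a path joins the first cluster whose seed vector
    lies within `threshold`; otherwise it starts a new cluster."""
    pages = sorted({p for path in paths.values() for p in path})
    clusters = []  # each entry: (seed_vector, [visitor ids])
    for visitor, path in paths.items():
        vec = to_vector(path, pages)
        for seed, members in clusters:
            if euclidean(seed, vec) <= threshold:
                members.append(visitor)
                break
        else:
            clusters.append((vec, [visitor]))
    return [members for _, members in clusters]


# Hypothetical, hand-written log records (not data from the paper).
records = [
    ("u1", "/home"), ("u1", "/products"), ("u1", "/cart"),
    ("u2", "/home"), ("u2", "/products"), ("u2", "/cart"),
    ("u3", "/blog"), ("u3", "/blog"), ("u3", "/about"),
    ("u4", "/home"), ("u4", "/products"),
]
paths = extract_paths(records)
clusters = cluster_paths(paths, threshold=1.0)
print(clusters)
```

In this sketch each cluster could be represented by its seed path alone in later processing, which is one simple way to picture how clustering cuts down the volume of log data. The threshold value is arbitrary and would need tuning on real logs.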
