Journal: Expert Systems with Applications

Trendlets: A novel probabilistic representational structures for clustering the time series data


Abstract

Time series data are sequences of values recorded systematically over a period, and are mostly used for prediction, clustering, and analysis. The two essential features of time series data are trend and seasonality. Preprocessing the data is necessary for prediction tasks: in most cases, the trend and the seasonality are removed before regression algorithms are applied, and the accuracy of such algorithms depends on the functions used for that removal. Clustering unlabeled time series data in the presence of trend and seasonality is challenging. In this paper, we propose a probabilistic representational learning method for grouping time series data. We introduce five terms in our clustering method, namely trendlets, uplets, downlets, equalets, and the trendlet string. These elements are the representational building blocks of the proposed method. Experiments are performed on renewable energy data from the electricity supply system of continental Europe, covering the demand and inflow of renewable energy from 2012 to 2014, and on the UCR-2018 time series archive, which contains 128 datasets. We compared the proposed representational method with various clustering algorithms using the silhouette score. Mini-batch k-means and agglomerative hierarchical clustering algorithms show better performance in terms of quality, logical accordance with the data, and time taken for clustering. (C) 2019 Elsevier Ltd. All rights reserved.
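
The abstract names uplets, downlets, equalets, and a trendlet string but does not define them. As a rough illustration only, the sketch below assumes one plausible reading: each step between consecutive values is labelled as a rise, a fall, or no change, the labels are joined into a string, and the relative frequency of each label serves as a simple probabilistic summary. The function names `to_trend_string` and `symbol_probabilities`, the symbols `U`/`D`/`E`, and the `tol` parameter are hypothetical and are not taken from the paper, so this should not be read as the authors' algorithm.

```python
from collections import Counter
from typing import Dict, List, Sequence


def to_trend_string(values: Sequence[float], tol: float = 0.0) -> str:
    """Label each consecutive step as 'U' (rise), 'D' (fall) or 'E' (flat).

    Assumption: a step counts as flat when |difference| <= tol.
    """
    symbols: List[str] = []
    for prev, curr in zip(values, values[1:]):
        diff = curr - prev
        if diff > tol:
            symbols.append("U")
        elif diff < -tol:
            symbols.append("D")
        else:
            symbols.append("E")
    return "".join(symbols)


def symbol_probabilities(trend_string: str) -> Dict[str, float]:
    """Return the relative frequency of each symbol in the string."""
    counts = Counter(trend_string)
    total = len(trend_string)
    return {s: (counts[s] / total if total else 0.0) for s in "UDE"}


series = [1.0, 2.0, 2.0, 1.0, 3.0]
encoded = to_trend_string(series)
probs = symbol_probabilities(encoded)
print(encoded, probs)
```

Under this assumed encoding, two series with similar shapes yield similar frequency vectors regardless of their absolute level, which is one way a symbolic representation could make series with trend comparable before a clustering step.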
