Indian Journal of Science and Technology

An Approach to Perform Uncertainty Analysis on a Spatial Dataset using Clustering and Distance based Outlier Detection Technique


Abstract

Background: In past years, many methods have been implemented for maintaining and supervising uncertain data, which can arise when data are collected in new ways that produce missing values and erroneous data. The main aim of this work is to help the end user obtain correct information about spatial data. Method: Outlier behaviour in data is a result of uncertainty. The challenge in spatial data sets is to cluster uncertain objects; hence, unsupervised clustering can be used to deal with this type of data. This paper examines the difficulty of outlier detection with uncertain data. Findings: To improve performance and quality, a Voronoi diagram is used, which partitions the objects into cells and helps to show the exact location of each object. An integral part is the pre-processing step of removing uncertainty to avoid wrong interpretation. Furthermore, the CLARA (Clustering LARge Applications) algorithm is applied to produce high-quality clusters. It also has a built-in outlier detection function and is suitable for large data sets. The algorithm uses the Mahalanobis distance to calculate the distance between a cluster and its members, in order to remove outliers and reduce uncertainty for feasible and supporting inputs. This procedure can be a valid provision for use in geo-database creation. Improvement: The methodology can be enhanced by designing the procedure to develop a Decision Support System (DSS) for spatial database creation.
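
The abstract does not give the details of the distance-based outlier step, so the following is only a minimal sketch of the general idea: compute the Mahalanobis distance of each member of one cluster to the cluster mean and flag members whose distance is unusually large. The function names, the 2-D restriction, the 0.99 coverage level, the chi-square cutoff and the sample coordinates are all assumptions made for illustration and are not taken from the paper.

```python
import math


def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each 2-D point to the cluster mean."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least 3 points to estimate a 2-D covariance")
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Entries of the sample covariance matrix (divisor n - 1).
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular or degenerate")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = v^T S^-1 v, using the closed-form inverse of a 2x2 matrix.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def flag_outliers(points, coverage=0.99):
    """Indices of points whose squared distance exceeds a chi-square cutoff.

    Assumption: for roughly Gaussian 2-D data, d^2 follows a chi-square
    distribution with 2 degrees of freedom, whose quantile at the given
    coverage is -2 * ln(1 - coverage).
    """
    threshold = -2.0 * math.log(1.0 - coverage)
    return [i for i, d2 in enumerate(mahalanobis_sq_2d(points)) if d2 > threshold]


# Hypothetical cluster: a 5 x 4 grid of locations plus one distant point.
cluster = [(float(x), float(y)) for x in range(5) for y in range(4)]
cluster.append((50.0, 50.0))
outliers = flag_outliers(cluster)
print(outliers)
```

Because the mean and covariance here are estimated from all members, including any outliers, a very small cluster can mask its own outliers; a robust estimate of location and scatter would be one way to reduce that effect.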