Published in Information Sciences: An International Journal

Interval data driven construction of shadowed sets with application to linguistic word modelling


Abstract

Interval data collected from different survey respondents for a single linguistic word can reflect both the intra- and inter-uncertainties of the word. This study shows how to construct shadowed set models for linguistic words from such surveyed interval data. First, corresponding to the fuzzy sets commonly used for linguistic words, four kinds of shadowed sets are introduced according to their shapes: the normal, the left-shoulder, the right-shoulder, and the non-cored shadowed sets. A data-driven approach that uses different statistics to determine the shapes and parameters of the shadowed set models is then proposed. The approach comprises two methods: the tolerance limit method, which depends on the mean and standard deviation of the interval data remaining after pre-processing, and the percentile method, which relies on the percentiles of the remaining interval data. To evaluate modelling performance, three novel indices are presented that measure the uncertainty-capture capability and accuracy of the constructed shadowed set models. Finally, the proposed approach is applied to two real-world problems: the modelling of 32 words in a codebook, and the modelling of thermal feeling words. The proposed methods are compared with other interval data driven methods, e.g. the enhanced interval approach and the fuzzy statistic method. The results show that the proposed percentile method performs better in both applications. The approach can also be applied to other linguistic word modelling applications where it is reasonable to adopt shadowed sets as the words' models. © 2018 Elsevier Inc. All rights reserved.
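
To make the idea of a percentile-driven shadowed set more concrete, the following is a minimal Python sketch, not the procedure from the paper. It assumes a "normal" shadowed set can be described by four points a <= b <= c <= d, with a core [b, c] (membership 1), shadows [a, b) and (c, d] (membership unknown, anywhere in [0, 1]), and membership 0 elsewhere. The names `NormalShadowedSet`, `percentile`, and `build_normal_shadowed_set`, as well as the choice of the 25th and 75th percentiles of the interval endpoints, are hypothetical illustrations; the abstract does not state which percentiles are used, and the pre-processing step is omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalShadowedSet:
    """Hypothetical model: core [b, c], shadows [a, b) and (c, d]."""
    a: float
    b: float
    c: float
    d: float

    def membership(self, x: float):
        """Return 1.0 in the core, the pair (0.0, 1.0) in a shadow, else 0.0."""
        if self.b <= x <= self.c:
            return 1.0
        if self.a <= x <= self.d:
            return (0.0, 1.0)  # shadow: membership is only known to lie in [0, 1]
        return 0.0


def percentile(values, p):
    """p-th percentile (0..100), linearly interpolated between sorted values."""
    s = sorted(values)
    if not s:
        raise ValueError("values must not be empty")
    k = (len(s) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)


def build_normal_shadowed_set(intervals, low=25.0, high=75.0):
    """Assumed construction: endpoint percentiles give the four parameters.

    `intervals` is a list of (left, right) pairs, one per surveyed person.
    The 25/75 defaults are illustrative assumptions, not values from the paper.
    """
    lefts = [left for left, _ in intervals]
    rights = [right for _, right in intervals]
    a, b = percentile(lefts, low), percentile(lefts, high)
    c, d = percentile(rights, low), percentile(rights, high)
    if b > c:
        # No common core exists; this sketch handles only the normal shape.
        raise ValueError("no core: data do not fit a normal shadowed set")
    return NormalShadowedSet(a, b, c, d)
```

In this sketch, the spread of the left and right endpoints across respondents determines the width of the two shadows, which is one simple way to picture how disagreement between respondents could translate into the uncertain region of the model.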
