Method for feature selection and for evaluating features identified as significant for classifying data


Abstract

A group of features that has been identified as “significant” in being able to separate data into classes is evaluated using a support vector machine which separates the dataset into classes one feature at a time. After separation, an extremal margin value is assigned to each feature based on the distance between the lowest feature value in the first class and the highest feature value in the second class. Separately, extremal margin values are calculated for a normal distribution within a large number of randomly drawn example sets for the two classes to determine the number of examples within the normal distribution that would have a specified extremal margin value. Using p-values calculated for the normal distribution, a desired p-value is selected. The specified extremal margin value corresponding to the selected p-value is compared to the calculated extremal margin values for the group of features. The features in the group that have a calculated extremal margin value less than the specified margin value are labeled as falsely significant.
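The evaluation described in the abstract can be sketched in a few lines of Python with NumPy. This is an illustrative sketch only, not the patented implementation. It makes several assumptions: the per-feature separation is reduced to the gap between the two classes along that feature, each feature is standardized so it can be compared with a standard normal null distribution, and labels are encoded as 0 and 1. All function and parameter names (`extremal_margin`, `null_margins`, `margin_threshold`, `flag_falsely_significant`, `n_draws`) are hypothetical.

```python
import numpy as np


def extremal_margin(pos, neg):
    """Gap between the two classes along one feature.

    Positive when the classes are fully separated, negative when they overlap.
    Both orderings are checked, since either class may lie above the other.
    """
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    return max(pos.min() - neg.max(), neg.min() - pos.max())


def null_margins(n_pos, n_neg, n_draws=10000, seed=0):
    """Extremal margins of randomly drawn standard-normal example sets."""
    rng = np.random.default_rng(seed)
    pos = rng.standard_normal((n_draws, n_pos))
    neg = rng.standard_normal((n_draws, n_neg))
    return np.maximum(pos.min(axis=1) - neg.max(axis=1),
                      neg.min(axis=1) - pos.max(axis=1))


def margin_threshold(null, p_value):
    """Margin that a fraction `p_value` of the random draws reaches or exceeds."""
    return float(np.quantile(null, 1.0 - p_value))


def flag_falsely_significant(X, y, p_value=0.05, n_draws=10000, seed=0):
    """Return (margins, threshold, flags); flags mark margins below the threshold.

    Assumption: no feature is constant, and standardizing each feature makes
    its margin comparable to the standard-normal null distribution.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    pos, neg = Xs[y == 1], Xs[y == 0]
    margins = np.maximum(pos.min(axis=0) - neg.max(axis=0),
                         neg.min(axis=0) - pos.max(axis=0))
    null = null_margins(len(pos), len(neg), n_draws, seed)
    threshold = margin_threshold(null, p_value)
    return margins, threshold, margins < threshold
```

Calling `flag_falsely_significant(X, y, p_value=0.05)` on a samples-by-features matrix returns one margin per feature, the margin threshold corresponding to the chosen p-value, and a boolean mask of the features that would be labeled falsely significant under these assumptions.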
