GA Based Optimal Keyword Extraction in an Automatic Chinese Web Document Classification System

Conference: ISPA 2007 international workshops, SSDSN, UPWN, WISH, SGC, ParDMCom, HiPCoMB, and IST-AWSN; August 29-31, 2007; Niagara Falls (CA)



Abstract

The main steps in designing an automatic document classification system are feature extraction and classification. This paper proposes a method to improve feature extraction. In this method, a genetic algorithm (GA) is applied to determine the threshold values of four criteria used to extract the representative keywords for each class. The purpose of these four threshold values is to extract as few representative keywords as possible. The keyword extraction method is combined with two classification algorithms, the vector space model (VSM) and the support vector machine (SVM), to examine the performance of the proposed classification system under various extraction conditions.
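The idea of tuning four thresholds with a GA can be illustrated with a minimal sketch. The abstract does not name the four criteria or the fitness function, so everything below is an assumption made for illustration: each keyword carries four hypothetical scores in [0, 1], a keyword is kept only if every score reaches its threshold, and the assumed fitness rewards document coverage while penalizing the number of kept keywords. The function names (`select_keywords`, `fitness`, `evolve`) and the toy data are likewise hypothetical and not taken from the paper.

```python
import random

def select_keywords(scores, thresholds):
    """Keep keywords whose four criterion scores all reach the thresholds."""
    return [kw for kw, s in scores.items()
            if all(v >= t for v, t in zip(s, thresholds))]

def fitness(thresholds, scores, docs, penalty=0.5):
    """Assumed fitness: share of documents containing a kept keyword,
    minus a penalty proportional to how many keywords are kept."""
    kept = set(select_keywords(scores, thresholds))
    covered = sum(1 for d in docs if kept & d) / len(docs)
    return covered - penalty * len(kept) / len(scores)

def evolve(scores, docs, pop_size=20, generations=30,
           mutation_rate=0.2, seed=0):
    """Search for four thresholds in [0, 1] with a simple elitist GA."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(4)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: fitness(t, scores, docs), reverse=True)
        parents = pop[:pop_size // 2]      # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, 3)        # one-point crossover
            child = a[:cut] + b[cut:]
            # Gaussian mutation per gene, clamped back into [0, 1]
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda t: fitness(t, scores, docs))

# Toy data (hypothetical): four criterion scores per candidate keyword,
# and each document represented as the set of keywords it contains.
scores = {
    "stock":  (0.9, 0.8, 0.9, 0.7),
    "market": (0.8, 0.9, 0.7, 0.8),
    "the":    (0.1, 0.2, 0.1, 0.1),
    "price":  (0.6, 0.5, 0.6, 0.4),
}
docs = [{"stock", "the"}, {"market", "the", "price"}, {"stock", "market"}]

best = evolve(scores, docs)
keywords = select_keywords(scores, best)
print("thresholds:", [round(t, 2) for t in best])
print("keywords:", keywords)
```

In a full system the kept keywords would become the feature set passed to a VSM or SVM classifier, and the fitness would more plausibly be based on classification accuracy; the coverage proxy here only keeps the sketch self-contained.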
