
Web Document Classification Using Changing Training Data Set


Abstract

Machine learning methods are generally employed to acquire the knowledge needed for automated document classification. They can be used when a large, pre-sampled training set is available and the domain does not change rapidly. In practice, however, a complete training data set is hard to obtain, and the classification knowledge keeps changing as circumstances change. This is known as the maintenance problem, or the knowledge acquisition bottleneck. Multiple Classification Ripple-Down Rules (MCRDR), an incremental knowledge acquisition method, was introduced to resolve this problem and has been applied in several commercial expert systems and a document classification system. Evaluation results for several domains show that our MCRDR-based document classification method can be applied successfully to real-world document classification tasks.
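To make the incremental idea concrete, the following is a minimal sketch of an MCRDR-style rule tree, not the authors' implementation. It assumes that a rule fires when all of its keywords occur in the document, that every satisfied path is followed so a document can receive several labels, and that a rule with no label acts as a stopping rule. The names `Rule`, `classify`, and `add_rule` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Rule:
    """One node of a simplified MCRDR rule tree (illustrative only)."""
    keywords: frozenset           # rule fires when all keywords occur in the document
    label: Optional[str]          # class label; None marks a stopping rule
    children: list = field(default_factory=list)

    def fires(self, words):
        return self.keywords <= words


def classify(root, text):
    """Follow every satisfied path and collect the label of the last rule on each."""
    words = set(text.lower().split())
    labels = set()

    def visit(rule):
        fired = [child for child in rule.children if child.fires(words)]
        if not fired:
            # No refinement applies, so this rule's conclusion stands.
            if rule.label is not None:
                labels.add(rule.label)
            return
        for child in fired:
            visit(child)

    visit(root)
    return labels


def add_rule(parent, keywords, label):
    """Incremental knowledge acquisition: attach a new rule under an existing one."""
    rule = Rule(frozenset(keywords), label)
    parent.children.append(rule)
    return rule


# The root has no conditions, so it is satisfied by every document.
root = Rule(frozenset(), None)
sports = add_rule(root, {"football"}, "Sports")
finance = add_rule(root, {"market"}, "Finance")

# Later corrections are added as refinements, without retraining anything.
add_rule(sports, {"fantasy"}, "Games")   # refinement: overrides "Sports" on this path
add_rule(finance, {"flea"}, None)        # stopping rule: suppresses "Finance"

print(classify(root, "Football club enters the stock market"))
print(classify(root, "fantasy football league"))
print(classify(root, "flea market opens"))
```

In this sketch, a misclassified document is handled by adding one rule beneath the rule that produced the wrong label, which is how the knowledge base can follow a changing domain without a complete training set. A real system would also store the case that prompted each rule so that new rules can be validated against earlier ones; that step is omitted here.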
