Web Document Classification Using Changing Training Data Set

Abstract

Machine learning methods are generally employed to acquire the knowledge for automated document classification. They can be used if a large pre-sampled training set is available and the domain does not change rapidly. However, it is not easy to obtain a complete training data set in the real world. Furthermore, classification knowledge changes continually across situations. This is known as the maintenance problem, or the knowledge acquisition bottleneck problem. Multiple Classification Ripple-Down Rules (MCRDR), an incremental knowledge acquisition method, was introduced to resolve this problem and has been applied in several commercial expert systems and in a document classification system. Evaluation results for several domains show that our MCRDR-based document classification method can be successfully applied to real-world document classification tasks.
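
The abstract names MCRDR but does not describe how it works. The following is a minimal sketch of the general MCRDR idea, not the authors' implementation: rules form a tree, a document is tested against every child of each satisfied rule, and the conclusion comes from the last satisfied rule on each path, so one document can receive several classes. New knowledge is added incrementally by attaching an exception rule under the rule it refines. The names `Rule` and `infer`, the keyword-based conditions, and the sample categories are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class Rule:
    """One node of a hypothetical MCRDR rule tree."""
    condition: Callable[[Set[str]], bool]   # test applied to a document's keywords
    conclusion: Optional[str]               # None means "no class" (root or stopping rule)
    children: List["Rule"] = field(default_factory=list)


def infer(node: Rule, case: Set[str]) -> Set[str]:
    """Collect conclusions from the last satisfied rule on every path under node."""
    conclusions: Set[str] = set()
    refined = False
    for child in node.children:
        if child.condition(case):
            refined = True                  # a satisfied child overrides this node
            conclusions |= infer(child, case)
    if not refined and node.conclusion is not None:
        conclusions.add(node.conclusion)
    return conclusions


# Illustrative knowledge base: the root always fires and carries no class.
root = Rule(lambda case: True, None)
sports = Rule(lambda case: "football" in case, "Sports")
finance = Rule(lambda case: "stock" in case, "Finance")
root.children = [sports, finance]

doc = {"football", "fantasy", "stock"}
before = infer(root, doc)

# Incremental maintenance: an expert corrects the "Sports" conclusion for this
# kind of document by adding an exception rule under the rule that fired.
sports.children.append(Rule(lambda case: "fantasy" in case, "Games"))
after = infer(root, doc)

print(sorted(before), sorted(after))
```

In this sketch no retraining takes place when the knowledge changes: only the one new exception rule is added, which is the property the abstract contrasts with methods that need a complete training set up front.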

Bibliographic details

Source: International Conference on Computational Science and Its Applications (ICCSA 2006), pt. 5; 8-11 May 2006; Glasgow (GB) | 2006 | pp. 565-574 | 10 pages
Conference location: Glasgow (GB)
Authors: Gilcheol Park; Seoksoo Kim
Author affiliation: Dept. of Multimedia Engineering, Hannam University, Daejeon, South Korea
Original format: PDF
Language of text: English
Chinese Library Classification: computing technology, computer technology

