Conference: International Conference on Artificial Intelligence: Methodology, Systems, and Applications
Training Datasets Collection and Evaluation of Feature Selection Methods for Web Content Filtering

Abstract

This paper focuses on the main aspects of developing a high-quality system for dynamic content filtering: the collection of meaningful training data and the choice of feature selection techniques. Because the Web changes rapidly, the classifier needs to be re-trained regularly. Training data collection is treated as a special case of focused crawling, and a simple, easy-to-tune technique is proposed, implemented, and tested. The proposed feature selection technique aims to minimize the size of the feature set without loss of accuracy and to take the interlinked nature of the Web into account. This is essential for making a content filtering solution fast and unobtrusive for end users, especially when filtering runs on resource-constrained hardware. An evaluation and comparison of various classifiers and techniques is provided.