Conference: 2008 IEEE International Conference on Systems, Man and Cybernetics (SMC)

Genre identification of Chinese finance text using machine learning method


Abstract

Document genre is one of the most distinguishing features in information retrieval, and it can bring order to search results. Genre classification is concerned not with the topic of a document but with its genre. In this paper, we examine the effectiveness of machine learning techniques for genre classification of Chinese texts that share a single topic, namely finance. Based on the likelihood ratio test, we present a new method for selecting feature terms; it clearly improves performance and outperforms other methods even when up to 80% of the terms are removed. In experiments with an SVM classifier on real-world corpora, we find that this method selects features more effectively and that the likelihood ratio is a reliable measure for selecting informative features.
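
The abstract does not give the exact form of the likelihood-ratio score, so the following is only a minimal sketch of one common variant: the G² log-likelihood ratio statistic computed on a 2x2 document-frequency table (term present/absent versus genre/other genres). The function names `g2` and `select_terms`, the max-over-genres aggregation, and the default `keep_fraction=0.2` (chosen to mirror the "80% of terms removed" setting) are all assumptions for illustration, not the authors' method. The SVM training step is not shown.

```python
import math
from collections import Counter


def g2(k11, k12, k21, k22):
    """G^2 log-likelihood ratio statistic for a 2x2 contingency table.

    k11: in-genre docs containing the term, k12: other docs containing it,
    k21: in-genre docs without the term,   k22: other docs without it.
    """
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    total = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # the 0 * log(0) terms are taken as 0
                expected = rows[i] * cols[j] / n
                total += o * math.log(o / expected)
    return 2.0 * total


def select_terms(docs, labels, keep_fraction=0.2):
    """Rank terms by their best G^2 score over genres; keep the top fraction.

    docs is a list of token lists (already word-segmented), labels the
    matching genre names. Scoring by max over genres is an assumption.
    """
    genres = sorted(set(labels))
    genre_sizes = Counter(labels)
    n_docs = len(docs)
    df = Counter()        # term -> number of docs containing it
    df_genre = Counter()  # (term, genre) -> number of docs containing it
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            df[term] += 1
            df_genre[(term, label)] += 1

    scores = {}
    for term in df:
        best = 0.0
        for genre in genres:
            k11 = df_genre[(term, genre)]
            k12 = df[term] - k11
            k21 = genre_sizes[genre] - k11
            k22 = n_docs - k11 - k12 - k21
            best = max(best, g2(k11, k12, k21, k22))
        scores[term] = best

    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep], scores
```

The retained terms would then serve as the vocabulary for whatever document representation is fed to the classifier; terms whose presence is independent of genre score near zero and are dropped first.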
