Journal: International journal of data mining, modelling and management

A lexical-semantics-based method for multi label text categorisation using word net



Abstract

Text categorisation is an emerging area of text mining. Because text documents are unstructured, they yield a very large number of features. This paper proposes MC-LSW, an algorithm for multi-label categorisation of text documents that draws on the lexical units (tokens) and the semantics of a language and uses WordNet to extract the semantic information of tokens. It aims to minimise the number of tokens used to categorise text documents. The algorithm is implemented and tested on five text datasets and compared with existing multi-label categorisation algorithms. MC-LSW gives more efficient and promising results than the existing methods in terms of space and time complexity; it also improves accuracy and precision and reduces Hamming loss.
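The abstract names accuracy, precision and Hamming loss as evaluation measures but does not define them. The following is a minimal sketch of how these measures are commonly computed for multi-label output, assuming the usual example-based definitions (the paper may use different variants). The function names, the label set and the toy predictions are hypothetical and are not taken from the paper.

```python
from typing import List, Set


def hamming_loss(true: List[Set[str]], pred: List[Set[str]], labels: Set[str]) -> float:
    """Fraction of (document, label) pairs whose membership is predicted wrongly."""
    # The symmetric difference counts labels that are missing or spurious.
    wrong = sum(len(t ^ p) for t, p in zip(true, pred))
    return wrong / (len(true) * len(labels))


def example_accuracy(true: List[Set[str]], pred: List[Set[str]]) -> float:
    """Mean Jaccard overlap between the true and predicted label sets."""
    scores = [len(t & p) / len(t | p) if (t | p) else 1.0 for t, p in zip(true, pred)]
    return sum(scores) / len(scores)


def example_precision(true: List[Set[str]], pred: List[Set[str]]) -> float:
    """Mean share of predicted labels that are correct (0 when nothing is predicted)."""
    scores = [len(t & p) / len(p) if p else 0.0 for t, p in zip(true, pred)]
    return sum(scores) / len(scores)


# Hypothetical toy data: three documents, three possible labels.
LABELS = {"sports", "politics", "tech"}
y_true = [{"sports"}, {"politics", "tech"}, {"tech"}]
y_pred = [{"sports"}, {"politics"}, {"tech", "sports"}]

print(hamming_loss(y_true, y_pred, LABELS))   # 2 wrong pairs out of 9
print(example_accuracy(y_true, y_pred))       # mean of 1, 1/2, 1/2
print(example_precision(y_true, y_pred))      # mean of 1, 1, 1/2
```

Lower Hamming loss is better, whereas higher accuracy and precision are better, which is why the abstract reports the former as reduced and the latter as improved.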
