Conference: International Speech Communication Association

Rank-Predicted Pseudo-Greedy Approach to Efficient Text Selection From Large-Scale Corpus For Maximum Coverage of Target Units


Abstract

Efficiently selecting a minimum amount of text from a large-scale text corpus to achieve maximum coverage of certain units is an important problem in spoken language processing. In this paper, this text selection problem is first formulated as a maximum coverage problem with a knapsack constraint (MCK). An efficient rank-predicted pseudo-greedy approach is then proposed to solve it. Experiments on a Chinese text selection task are conducted to verify the efficiency of the proposed approach. The results show that our approach significantly improves text selection speed without sacrificing coverage score, compared with the traditional greedy approach.
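To make the problem setting concrete, the following is a minimal sketch of a traditional greedy baseline for maximum coverage under a knapsack (budget) constraint. It is not the paper's rank-predicted pseudo-greedy method, which the abstract does not describe in detail. The function name `greedy_select`, the use of sentence length as cost, unweighted unit counts as the coverage score, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def greedy_select(
    sentences: Dict[str, Set[str]],
    costs: Dict[str, int],
    budget: int,
) -> Tuple[List[str], Set[str]]:
    """Baseline greedy selection under a total-cost budget (hypothetical helper).

    At each step, pick the affordable sentence with the highest ratio of
    newly covered units to cost. Costs are assumed to be positive.
    """
    chosen: List[str] = []
    covered: Set[str] = set()
    remaining = budget
    candidates = set(sentences)

    while candidates:
        best_id, best_ratio = None, 0.0
        # Sorted iteration keeps tie-breaking deterministic.
        for sid in sorted(candidates):
            if costs[sid] > remaining:
                continue  # does not fit in the remaining budget
            gain = len(sentences[sid] - covered)  # units not yet covered
            ratio = gain / costs[sid]
            if ratio > best_ratio:
                best_id, best_ratio = sid, ratio
        if best_id is None:
            break  # no affordable sentence adds new coverage
        chosen.append(best_id)
        covered |= sentences[best_id]
        remaining -= costs[best_id]
        candidates.remove(best_id)
    return chosen, covered


# Toy data (assumed): each sentence maps to the set of target units it contains.
sentences = {
    "s1": {"a", "b", "c"},
    "s2": {"c", "d"},
    "s3": {"a"},
    "s4": {"d", "e", "f", "g"},
}
costs = {"s1": 3, "s2": 2, "s3": 1, "s4": 2}
chosen, covered = greedy_select(sentences, costs, budget=6)
```

Each iteration of this baseline rescans every remaining candidate to recompute its gain, which becomes expensive on a large corpus. That per-step cost is the kind of overhead the abstract's speed claim is measured against.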
