Conference: 2017 International Electronics Symposium on Knowledge Creation and Intelligent Computing

Indonesian news auto summarization in infrastructure development topic using 5W+1H consideration

Abstract

At an average reading speed of 200-500 words per minute, a person needs at least 2 to 3 minutes to read and understand a single news article in online media. An online outlet can publish many updates within a few minutes, so reading every article in full is time-consuming. Reading a summary that represents the main idea of each article saves time. This study uses the 5W + 1H elements to generate news summaries because these elements are essential to a news story. A single news article is retrieved from an online media page by a scanning and grabbing process, then sanitized, segmented into sentences, and tokenized into words. Each sentence receives a multi-label classification, based on previously built training data, according to whether it contains any 5W + 1H element (What, Who, Where, When, Why, and/or How) or none. Sentences that contain a 5W + 1H element are selected as summary sentences. Evaluation of the resulting summaries shows an average precision of 91%, recall of 67%, and F-measure of 76%.
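
The pipeline described in the abstract (sanitize, segment, tokenize, label each sentence, keep the labelled ones, then score the result) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which classifier or training data were used, so a hypothetical keyword lookup (`CUES`) stands in for the trained multi-label classifier, English cue words are used for readability even though the study targets Indonesian news, and all function names are invented for this example.

```python
import re

# ASSUMPTION: hypothetical cue words standing in for the trained multi-label
# classifier. The paper's actual model and training data are not given.
CUES = {
    "what": {"project", "construction", "road"},
    "who": {"minister", "president", "governor"},
    "where": {"jakarta", "surabaya", "province"},
    "when": {"monday", "yesterday", "today"},
    "why": {"because", "due"},
    "how": {"through", "using"},
}

def sanitize(html):
    """Strip HTML tags and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", text).strip()

def split_sentences(text):
    """Naive segmentation: split after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Lowercase alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())

def classify(sentence):
    """Return the set of 5W+1H labels whose cue words occur in the sentence."""
    tokens = set(tokenize(sentence))
    return {label for label, cues in CUES.items() if tokens & cues}

def summarize(html):
    """Keep every sentence that carries at least one 5W+1H label."""
    return [s for s in split_sentences(sanitize(html)) if classify(s)]

def precision_recall_f(selected, reference):
    """Sentence-level precision, recall, and F-measure against a reference summary."""
    selected, reference = set(selected), set(reference)
    hits = len(selected & reference)
    p = hits / len(selected) if selected else 0.0
    r = hits / len(reference) if reference else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

if __name__ == "__main__":
    page = ("<p>The minister opened the toll road project in Jakarta on Monday.</p>"
            "<p>Traffic was calm.</p>"
            "<p>It was delayed because of rain.</p>")
    for sentence in summarize(page):
        print(sorted(classify(sentence)), sentence)
```

A multi-label scheme fits here because one sentence can answer several questions at once; the first example sentence would carry What, Who, Where, and When together. The reported scores are averages over the test articles, so the F-measure need not equal the value computed directly from the average precision and recall.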
