East Indonesia Conference on Computer and Information Technology

Indonesian Abstractive Summarization using Pre-trained Model


Abstract

Automatic text summarization systems are increasingly needed to cope with the information explosion caused by the growth of the internet. Since Indonesian is still considered an under-resourced language, we take advantage of pre-trained language models to perform abstractive summarization. This paper investigates BERT's performance on Indonesian articles by comparing several BERT pre-trained models and evaluating the results with ROUGE scores. Our experiments show that an English pre-trained model can produce a good summary of Indonesian text, but an Indonesian pre-trained model is more effective. The default training setup, using only the abstractive objective, outperforms two-stage fine-tuning, in which an extractive model must be trained in advance. We also found many meaningless words in the generated summaries. These findings are the result of a preliminary study toward improving Indonesian abstractive summarization models.
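The abstract reports evaluation with ROUGE scores. As a minimal illustrative sketch (not the authors' evaluation code, which is not shown here), ROUGE-N recall counts the n-grams of the reference summary that also appear in the candidate summary; the example sentences below are invented for demonstration:

```python
from collections import Counter

def ngrams(tokens, n):
    """Return a multiset (Counter) of n-grams from a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(candidate, reference, n=1):
    """ROUGE-N recall: overlapping n-grams / total n-grams in the reference."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())  # clipped n-gram overlap
    total = sum(ref.values())
    return overlap / total if total else 0.0

# Hypothetical reference and system summaries (Indonesian word order differs)
reference = "model pretrained bahasa indonesia menghasilkan ringkasan"
candidate = "model pretrained menghasilkan ringkasan bahasa indonesia"

print(rouge_n(candidate, reference, n=1))  # 1.0: every unigram is covered
print(rouge_n(candidate, reference, n=2))  # 0.6: only 3 of 5 bigrams match
```

Note that ROUGE-1 is insensitive to word order, which is why bigram and longer variants (ROUGE-2, ROUGE-L) are usually reported alongside it; published comparisons typically use a standard implementation rather than a hand-rolled one like this.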

