Conference: Annual Meeting of the Association for Computational Linguistics

Scaling Up Open Tagging from Tens to Thousands: Comprehension Empowered Attribute Value Extraction from Product Title


Abstract

Supplementing product information by extracting attribute values from titles is a crucial task in the e-commerce domain. Previous studies treat each attribute only as an entity type and build one set of NER tags (e.g., BIO) for each of them, leading to a scalability issue that makes them unsuitable for the large attribute systems of real-world e-commerce. In this work, we propose a novel approach that supports value extraction scaling up to thousands of attributes without losing performance: (1) we regard the attribute as a query and adopt only one global set of BIO tags for all attributes, reducing the burden of attribute tag or model explosion; (2) we explicitly model the semantic representations of the attribute and the title, and develop an attention mechanism to capture the interactive semantic relations between them, so that our framework is attribute-comprehensive. We conduct extensive experiments on real-life datasets. The results show that our model not only outperforms existing state-of-the-art NER tagging models, but is also robust and generates promising results for up to 8,906 attributes.
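To make the scalability argument in point (1) concrete, the following is a minimal sketch, not the authors' implementation. It assumes the conventional scheme needs a B and an I tag per attribute plus one shared O tag, and it uses a hypothetical title with hand-written tag sequences in place of model predictions. The helper names (`per_attribute_tag_count`, `decode_bio`) are illustrative only.

```python
from typing import Dict, List


def per_attribute_tag_count(num_attributes: int) -> int:
    """Tags needed when every attribute is its own entity type.

    Assumption: one B-<attr> and one I-<attr> tag per attribute,
    plus a single shared O tag.
    """
    return 2 * num_attributes + 1


# With the attribute supplied as a query, one tag set serves every attribute.
GLOBAL_TAGS = ("B", "I", "O")


def decode_bio(tokens: List[str], tags: List[str]) -> List[str]:
    """Collect the token spans marked B/I as extracted attribute values."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    values: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new span starts; close any span that is still open.
            if current:
                values.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            # "O", or a stray "I" with no open span: close the open span.
            if current:
                values.append(" ".join(current))
            current = []
    if current:
        values.append(" ".join(current))
    return values


# Hypothetical example: the same title is tagged once per attribute query.
title = ["Nike", "men", "running", "shoes", "dark", "blue"]
tags_by_attribute: Dict[str, List[str]] = {
    "brand": ["B", "O", "O", "O", "O", "O"],
    "color": ["O", "O", "O", "O", "B", "I"],
}
extracted = {
    attr: decode_bio(title, tags) for attr, tags in tags_by_attribute.items()
}
```

Under these assumptions, `per_attribute_tag_count(8906)` grows linearly with the attribute inventory, while `GLOBAL_TAGS` stays at three tags regardless of how many attributes are queried. The attention mechanism from point (2), which conditions the tagging on the attribute, is not modeled here.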
