Sequential routing framework: Fully capsule network-based speech recognition

Abstract

Capsule networks (CapsNets) have recently attracted attention as a novel neural architecture. This paper presents the sequential routing framework, which we believe is the first method to adapt a CapsNet-only structure to sequence-to-sequence recognition. Input sequences are capsulized and then sliced according to a window size. Each slice is classified to a label at the corresponding time step through iterative routing mechanisms. Afterwards, losses are computed by connectionist temporal classification (CTC). Because learnable weights are shared across the slices during routing, the required number of parameters is controlled by the window size regardless of the sequence length. We additionally propose a sequential dynamic routing algorithm to replace traditional dynamic routing. The proposed technique can minimize the decoding speed degradation caused by the routing iterations, since it can operate in a non-iterative manner without loss of accuracy. On the Wall Street Journal corpus, the method achieves a word error rate of 16.9%, which is 1.1% lower than that of bidirectional long short-term memory-based CTC networks. On the TIMIT corpus, it attains a phone error rate of 17.5%, which is 0.7% lower than that of convolutional neural network-based CTC networks (Zhang et al., 2016).
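The slicing and weight-sharing idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes overlapping, zero-padded windows centered on each frame, uses the standard dynamic routing of Sabour et al. (2017) rather than the proposed sequential dynamic routing, and omits the CTC loss. All names (`slice_windows`, `route_slice`, `frame_scores`) and all shapes are hypothetical.

```python
import numpy as np

def squash(s, eps=1e-8):
    """Squashing nonlinearity: keeps direction, maps vector norm into [0, 1)."""
    sq = np.sum(s * s, axis=-1, keepdims=True)
    return (sq / (1.0 + sq)) * s / np.sqrt(sq + eps)

def slice_windows(caps, window):
    """Cut a (T, n_in, d_in) capsule sequence into T windows of `window` frames.
    Assumption: windows overlap and are zero-padded at the sequence edges."""
    T = caps.shape[0]
    half = window // 2
    padded = np.pad(caps, ((half, window - 1 - half), (0, 0), (0, 0)))
    return np.stack([padded[t:t + window] for t in range(T)])

def route_slice(u, W, iterations=3):
    """Standard dynamic routing (routing by agreement) for one slice.
    u: (N, d_in) input capsules; W: (N, n_out, d_in, d_out) weights."""
    u_hat = np.einsum('nd,nkde->nke', u, W)      # predictions, (N, n_out, d_out)
    b = np.zeros(u_hat.shape[:2])                # routing logits, (N, n_out)
    for _ in range(iterations):
        c = np.exp(b - b.max(axis=1, keepdims=True))
        c /= c.sum(axis=1, keepdims=True)        # softmax over output capsules
        v = squash(np.einsum('nk,nke->ke', c, u_hat))
        b = b + np.einsum('nke,ke->nk', u_hat, v)  # agreement update
    return v                                     # (n_out, d_out)

def frame_scores(caps, W, window):
    """Per-frame label scores: the norm of each output capsule for each slice.
    The same W is reused for every slice, so W.size does not depend on T."""
    slices = slice_windows(caps, window)         # (T, window, n_in, d_in)
    T, _, _, d_in = slices.shape
    flat = slices.reshape(T, -1, d_in)           # (T, window * n_in, d_in)
    return np.stack([np.linalg.norm(route_slice(u, W), axis=-1) for u in flat])

# Hypothetical sizes: 4 input capsules of dim 8 per frame, 10 labels, window 5.
rng = np.random.default_rng(0)
window, n_in, d_in, n_out, d_out = 5, 4, 8, 10, 16
W = 0.1 * rng.normal(size=(window * n_in, n_out, d_in, d_out))
scores = frame_scores(rng.normal(size=(50, n_in, d_in)), W, window)
print(scores.shape, W.size)
```

In this sketch the weight tensor has `window * n_in * n_out * d_in * d_out` entries, so a longer input produces more slices but no additional parameters. In a full system, the per-frame scores would be passed to a CTC loss.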

Bibliographic details

  • Source
    Computer Speech and Language | 2021, Issue 11 | 101228.1-101228.18 | 18 pages
  • Author affiliations

    Biomedical Knowledge Engineering Laboratory Seoul National University Seoul 08826 Republic of Korea Samsung Research 56 Seongchon-gil Seocho-gu Seoul 06765 Republic of Korea;

    Biomedical Knowledge Engineering Laboratory Seoul National University Seoul 08826 Republic of Korea;

    Samsung Research 56 Seongchon-gil Seocho-gu Seoul 06765 Republic of Korea;

    Samsung Research 56 Seongchon-gil Seocho-gu Seoul 06765 Republic of Korea;

    Samsung Research 56 Seongchon-gil Seocho-gu Seoul 06765 Republic of Korea;

    Samsung Research 56 Seongchon-gil Seocho-gu Seoul 06765 Republic of Korea;

    Biomedical Knowledge Engineering Laboratory Seoul National University Seoul 08826 Republic of Korea;

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA);
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Capsule network; Automatic speech recognition; Sequence-to-sequence; Connectionist temporal classification;
