Sequential routing framework: Fully capsule network-based speech recognition

Kyungmin Lee; Hyunwhan Joe; Hyeontaek Lim; Kwangyoun Kim; Sungsoo Kim; Chang Woo Han; Hong-Gee Kim

Abstract

Capsule networks (CapsNets) have recently gained attention as a novel neural architecture. This paper presents the sequential routing framework, which we believe is the first method to adapt a CapsNet-only structure to sequence-to-sequence recognition. Input sequences are capsulized and then sliced by a window size. Each slice is classified to a label at the corresponding time step through iterative routing mechanisms. Afterwards, losses are computed by connectionist temporal classification (CTC). During routing, the required number of parameters can be controlled by the window size regardless of the length of the sequences, because learnable weights are shared across the slices. We additionally propose a sequential dynamic routing algorithm to replace traditional dynamic routing. The proposed technique can minimize the decoding speed degradation caused by the routing iterations, since it can operate in a non-iterative manner without loss of accuracy. The method achieves a word error rate of 16.9% on the Wall Street Journal corpus, 1.1% lower than bidirectional long short-term memory-based CTC networks. On the TIMIT corpus, it attains a phone error rate of 17.5%, 0.7% lower than convolutional neural network-based CTC networks (Zhang et al., 2016).
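The slicing and weight-sharing idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes centred, zero-padded windows (one slice per time step) and uses the standard iterative dynamic routing of Sabour et al. (2017) rather than the proposed sequential dynamic routing. All function names (`slice_windows`, `route_slice`, `make_weights`, `label_scores`) and all dimensions are hypothetical, and the CTC loss is omitted.

```python
import math
import random

def squash(v):
    # Squashing non-linearity: keeps direction, maps the norm into [0, 1).
    sq = sum(x * x for x in v)
    if sq == 0.0:
        return [0.0] * len(v)
    scale = math.sqrt(sq) / (1.0 + sq)
    return [scale * x for x in v]

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def slice_windows(capsules, window):
    # One centred slice per time step; zero padding at the edges is an assumption.
    dim = len(capsules[0])
    pad = [[0.0] * dim for _ in range(window // 2)]
    padded = pad + capsules + pad
    return [padded[t:t + window] for t in range(len(capsules))]

def route_slice(slice_caps, weights, iterations=3):
    # weights[w][j] is a (d_out x d_in) matrix for window position w and label j.
    n_pos, n_labels = len(slice_caps), len(weights[0])
    u_hat = [[matvec(weights[w][j], slice_caps[w]) for j in range(n_labels)]
             for w in range(n_pos)]
    logits = [[0.0] * n_labels for _ in range(n_pos)]
    for _ in range(iterations):
        coupling = [softmax(row) for row in logits]
        outputs = []
        for j in range(n_labels):
            d_out = len(u_hat[0][j])
            s = [sum(coupling[w][j] * u_hat[w][j][k] for w in range(n_pos))
                 for k in range(d_out)]
            outputs.append(squash(s))
        # Raise the routing logit where a prediction agrees with the output capsule.
        for w in range(n_pos):
            for j in range(n_labels):
                logits[w][j] += sum(a * b for a, b in zip(u_hat[w][j], outputs[j]))
    # The norm of each output capsule serves as the score of its label.
    return [math.sqrt(sum(x * x for x in v)) for v in outputs]

def make_weights(window, n_labels, d_in, d_out, seed=0):
    rng = random.Random(seed)
    return [[[[rng.gauss(0.0, 0.1) for _ in range(d_in)] for _ in range(d_out)]
             for _ in range(n_labels)] for _ in range(window)]

def count_parameters(weights):
    return sum(len(row) for pos in weights for mat in pos for row in mat)

def label_scores(capsules, weights, iterations=3):
    # The same weights are reused for every slice, whatever the sequence length.
    return [route_slice(s, weights, iterations)
            for s in slice_windows(capsules, len(weights))]

rng = random.Random(1)
weights = make_weights(window=3, n_labels=4, d_in=2, d_out=3)
short_seq = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(5)]
long_seq = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(50)]
scores_short = label_scores(short_seq, weights)
scores_long = label_scores(long_seq, weights)
```

Both sequences are scored with the same weight tensor, so the parameter count in this sketch depends only on the window size, the number of labels, and the capsule dimensions, not on the number of time steps. In a full system the per-step label scores would feed a CTC loss.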

Bibliographic details

Source
Computer Speech and Language | 2021, Issue 11 | 101228.1-101228.18 | 18 pages
Authors
Kyungmin Lee; Hyunwhan Joe; Hyeontaek Lim; Kwangyoun Kim; Sungsoo Kim; Chang Woo Han; Hong-Gee Kim
Author affiliations

Biomedical Knowledge Engineering Laboratory Seoul National University Seoul 08826 Republic of Korea Samsung Research 56 Seongchon-gil Seocho-gu Seoul 06765 Republic of Korea;

Biomedical Knowledge Engineering Laboratory Seoul National University Seoul 08826 Republic of Korea;

Samsung Research 56 Seongchon-gil Seocho-gu Seoul 06765 Republic of Korea;

Samsung Research 56 Seongchon-gil Seocho-gu Seoul 06765 Republic of Korea;

Samsung Research 56 Seongchon-gil Seocho-gu Seoul 06765 Republic of Korea;

Samsung Research 56 Seongchon-gil Seocho-gu Seoul 06765 Republic of Korea;

Biomedical Knowledge Engineering Laboratory Seoul National University Seoul 08826 Republic of Korea;

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords
Capsule network; Automatic speech recognition; Sequence-to-sequence; Connectionist temporal classification;


