Development of Part of Speech Tagger for Assamese Using HMM

Surjya Kanta Daimary; Vishal Goyal; Madhumita Barbora; Umrinderpal Singh

Abstract

This article presents a part-of-speech (POS) tagger for Assamese based on a Hidden Markov Model (HMM). Over the years, many language processing tasks have been addressed for Western and South Asian languages, but very little work has been done for Assamese. With this in view, a POS tagger for Assamese using a stochastic approach is being developed. Assamese is a free word-order, highly agglutinative, and morphologically rich language, so a POS tagger with good accuracy will help in the development of other NLP tasks for Assamese. For this work, an annotated corpus of 271,890 words with a BIS tagset consisting of 38 tag labels is used. The model is trained on 256,690 words, and the remaining words are used for testing. The system obtained an accuracy of 89.21% and is compared with other existing stochastic models.
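As an illustration of the general technique named in the abstract, the following is a minimal sketch of a bigram HMM tagger with Viterbi decoding. The abstract does not describe the authors' implementation, so everything here is an assumption: the function names (`train_hmm`, `viterbi`, `accuracy`), the add-one smoothing, the `<s>` start pseudo-tag, and the toy English sentences, which merely stand in for the BIS-annotated Assamese corpus. In the reported setup, training would use 256,690 words and testing the remaining 15,200 (271,890 minus 256,690).

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # hypothetical sentence-start pseudo-tag (an assumption)


def train_hmm(tagged_sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    transitions = defaultdict(Counter)
    emissions = defaultdict(Counter)
    vocab = set()
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            transitions[prev][tag] += 1
            emissions[tag][word] += 1
            vocab.add(word)
            prev = tag
    return transitions, emissions, sorted(emissions), vocab


def log_prob(counter, key, num_outcomes):
    """Add-one smoothed log probability (smoothing choice is an assumption)."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + num_outcomes))


def viterbi(words, model):
    """Return the most probable tag sequence for `words` under the model."""
    transitions, emissions, tags, vocab = model
    if not words:
        return []
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra slot reserves mass for unknown words
    # best[i][tag] = (log score of best path ending in tag at i, previous tag)
    best = [{
        tag: (log_prob(transitions[START], tag, n_tags)
              + log_prob(emissions[tag], words[0], n_words), None)
        for tag in tags
    }]
    for i in range(1, len(words)):
        column = {}
        for tag in tags:
            def path_score(prev, tag=tag, i=i):
                return best[i - 1][prev][0] + log_prob(transitions[prev], tag, n_tags)
            prev_tag = max(tags, key=path_score)
            emit = log_prob(emissions[tag], words[i], n_words)
            column[tag] = (path_score(prev_tag) + emit, prev_tag)
        best.append(column)
    # Follow back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]


def accuracy(model, tagged_sentences):
    """Token-level accuracy: correctly tagged tokens / total tokens."""
    correct = total = 0
    for sentence in tagged_sentences:
        words = [w for w, _ in sentence]
        gold = [t for _, t in sentence]
        predicted = viterbi(words, model)
        correct += sum(p == g for p, g in zip(predicted, gold))
        total += len(gold)
    return correct / total if total else 0.0


# Toy data for illustration only; not taken from the paper's corpus.
toy_train = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
toy_model = train_hmm(toy_train)
print(viterbi(["the", "cat", "barks"], toy_model))
```

Working in log space avoids numerical underflow on long sentences, and treating all unseen words as a single smoothed class is only one of several possible ways to handle the unknown words that a morphologically rich language produces.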

Bibliographic details

Source
International Journal of Synthetic Emotions | 2018, Issue 1 | pp. 23-32 | 10 pages
Authors
Surjya Kanta Daimary; Vishal Goyal; Madhumita Barbora; Umrinderpal Singh
Author affiliations

Department of Computer Science, Punjabi University, Patiala, India;

Department of Computer Science, Punjabi University, Patiala, India;

Department of English and Foreign Languages, Tezpur University, Tezpur, India;

Department of Computer Science, Punjabi University, Patiala, India;

Original format: PDF
Language: English
Keywords
Assamese; BIS Tagset; Hidden Markov Model; Natural Language Processing; Part of Speech Tagger;
