The Use of the Python-Based Natural Language Toolkit (NLTK) in Corpus Research

Liu Xu (刘旭)

Abstract

Corpus-based research in China currently relies mainly on tools such as AntConc and PowerGREP; few studies use the NLTK package for Python to process and analyze data. Because of limitations in their design, these programs cannot flexibly support different research methods. Using NLTK gives the data a uniform standard, avoids the trouble of converting between the formats of various text-processing tools, and makes up for the weaknesses of tools such as Range in syntactic analysis, graph plotting, and regular-expression search. Focusing on the main stages of corpus research, namely tokenization, lemmatization, and text retrieval and statistics, this paper briefly introduces the use of the Python-based NLTK natural language processing package in corpus research. It then takes Jane Austen's novel Emma from the Gutenberg corpus as an example to show how the package can be used to process corpus data.
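The stages named in the abstract (tokenization, word-form normalization, frequency statistics, and retrieval) can be illustrated with a minimal sketch. It is not the paper's own code. It assumes NLTK is installed, uses a one-sentence stand-in for the novel so that no corpus download is needed, and substitutes Porter stemming for lemmatization, because `WordNetLemmatizer` requires the separate WordNet data.

```python
from nltk import FreqDist
from nltk.stem import PorterStemmer
from nltk.text import ConcordanceIndex
from nltk.tokenize import wordpunct_tokenize

# Stand-in sample: the opening sentence of Emma. The full novel could instead be
# loaded with nltk.corpus.gutenberg.words("austen-emma.txt") after running
# nltk.download("gutenberg").
sample = (
    "Emma Woodhouse, handsome, clever, and rich, with a comfortable home "
    "and happy disposition, seemed to unite some of the best blessings "
    "of existence; and had lived nearly twenty-one years in the world "
    "with very little to distress or vex her."
)

# 1. Tokenization: a regex-based split into word and punctuation tokens.
tokens = wordpunct_tokenize(sample)
words = [t.lower() for t in tokens if t.isalpha()]

# 2. Word-form normalization: Porter stemming, used here as a stand-in
#    for lemmatization (an assumption of this sketch, not the paper's method).
stemmer = PorterStemmer()
stems = [stemmer.stem(w) for w in words]

# 3. Statistics: frequency distribution over the lowercased word tokens.
freq = FreqDist(words)

# 4. Retrieval: positions of a search word in the token sequence,
#    matched case-insensitively through the key function.
index = ConcordanceIndex(tokens, key=lambda s: s.lower())
offsets = index.offsets("and")

print(freq.most_common(3))
print(offsets)
```

Because every step works on the same Python list of tokens, the output of one stage feeds the next without any format conversion, which is the uniformity the abstract points to.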

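Bibliographic details

Source
Journal of Kunming Metallurgy College (《昆明冶金高等专科学校学报》) | 2015, No. 5 | pp. 65-69, 93 | 6 pages
Author
Liu Xu (刘旭)
Affiliation
School of Foreign Languages, Yunnan Normal University; Kunming, Yunnan 650500
Original format: PDF
Language of text: Chinese (chi)
CLC classification: Text information processing
Keywords
Python; NLTK toolkit; corpus research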

