Distributed Distributional Similarities of Google Books over the Centuries


Abstract

This paper introduces a distributional thesaurus and sense clusters computed on the complete Google Syntactic N-grams, which are extracted from Google Books, a very large corpus of digitized books published between 1520 and 2008. We show that a thesaurus computed on such a large text basis yields much better results than one computed on smaller corpora such as Wikipedia. We also provide distributional thesauri for equal-sized time slices of the corpus. While distributional thesauri can be used as lexical resources in NLP tasks, comparing word similarities over time can reveal sense changes of terms across decades or centuries and can serve as a resource for diachronic lexicography. The thesauri and clusters are available for download.

Bibliographic details

Source
9th International Conference on Language Resources and Evaluation | 2014 | pp. 1513-1517 | 5 pages
Authors
Martin Riedl; Richard Steuer; Chris Biemann
Original format
PDF
Keywords
Distributional Thesaurus; Semantics; Large-Scale Distributional Methods; Word Similarity; Lexical Resources

Similar literature

1. Google Books, Libraries, and Self-Respect: Information Justice beyond Distributions [J]. Hoffmann Anna Lauren. The Library Quarterly: A Journal of Investigation and Discussion in the Field of Library Science. 2016, No. 1
2. Can the Impact of Non-Western Academic Books be Measured? An Investigation of Google Books and Google Scholar for Malaysia [J]. A. Abrizah, Mike Thelwall. Journal of the American Society for Information Science and Technology. 2014, No. 12
3. Will Google Books Library Project End Copyright?: Millions of magazines hidden in Google Books Library Project endanger U.S. copyright [J]. Barbara Kevles. AALL Spectrum. 2013, No. 7
4. Integrating Google Apps and Google Chromebooks into the Core Curriculum: A Phenomenological Study of the Lived Experience of Public School Teachers [D]. Bartolo, Paula. 2017
5. Catalogue of Medical Books in Manchester University Library 1480-1700 A Catalogue of Incunabula and Sixteenth Century Books in the National Library of Medicine; First Supplement and A Catalogue of Books before 1700 in the Moody Medical Library [O]. Richard J. Wolfe. 1972
6. Similarity in school textbooks on natural sciences for the primary school level: an analysis of teaching and apprenticeship of botany in the last century in Portugal (1900-2000) [O]. Guimarães Fernando. 2009
