Towards a stratified learning approach to predict future citation counts

Abstract

In this paper, we study the problem of predicting the future citation count of a scientific article at a given time interval after its publication. To this end, we gather a dataset of more than 1.5 million scientific papers from the computer science domain and analyze it exhaustively. The analysis shows that the citation counts of articles over the years follow a diverse set of patterns; on closer inspection, we identify six broad categories of citation patterns. This observation motivates us to adopt a stratified learning approach to the prediction task. We propose a two-stage prediction model: in the first stage, the model maps a query paper to one of the six categories; in the second stage, a regression module is run only on the subpopulation corresponding to that category to predict the future citation count of the query paper. Experimental results show that categorizing this large dataset during the training phase leads to a remarkable improvement (around 50%) over the well-known baseline system.
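The two-stage idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which classifier, regressor, or features the authors use, so everything here is an assumption for illustration: a nearest-centroid classifier stands in for stage one, a per-category least-squares line stands in for stage two, the function names are hypothetical, and the toy data uses two invented categories ("A" and "B") rather than the six reported in the paper.

```python
from collections import defaultdict
from statistics import fmean


def fit_centroids(features, categories):
    """Stage 1 (training): one mean feature vector per category."""
    groups = defaultdict(list)
    for x, c in zip(features, categories):
        groups[c].append(x)
    return {c: [fmean(col) for col in zip(*rows)] for c, rows in groups.items()}


def predict_category(centroids, x):
    """Stage 1 (query): assign the paper to the nearest centroid."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(centroids, key=lambda c: sq_dist(centroids[c]))


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx


def fit_stratified(features, categories, early_counts, future_counts):
    """Train the classifier on all papers, then one regressor per category."""
    centroids = fit_centroids(features, categories)
    per_cat = defaultdict(lambda: ([], []))
    for c, e, f in zip(categories, early_counts, future_counts):
        per_cat[c][0].append(e)
        per_cat[c][1].append(f)
    regressors = {c: fit_line(xs, ys) for c, (xs, ys) in per_cat.items()}
    return centroids, regressors


def predict_future(model, x, early_count):
    """Route the query paper to a category, then use only that category's regressor."""
    centroids, regressors = model
    category = predict_category(centroids, x)
    slope, intercept = regressors[category]
    return category, slope * early_count + intercept


# Invented toy data (not from the paper): two feature dimensions, two categories.
features = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.2], [1.0, 0.9], [0.9, 1.0], [1.1, 1.1]]
categories = ["A", "A", "A", "B", "B", "B"]
early_counts = [1, 2, 3, 1, 2, 3]        # citations observed shortly after publication
future_counts = [2, 4, 6, 11, 12, 13]    # citations at the prediction horizon

model = fit_stratified(features, categories, early_counts, future_counts)
print(predict_future(model, [0.05, 0.1], 4))
print(predict_future(model, [0.95, 1.0], 4))
```

The point of the design is that two papers with the same early citation count can receive different predictions, because each category's regressor is fitted only on papers that share that citation pattern.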

Bibliographic record

Source
2014 IEEE/ACM Joint Conference on Digital Libraries | 2014 | pp. 351-360 | 10 pages
Conference location: London (GB)
Authors
Chakraborty T.; Kumar S.; Goyal P.; Ganguly N.; Mukherjee A.
Author affiliation

Dept. of Comput. Sci. Eng., Indian Inst. of Technol., Kharagpur, Kharagpur, India

Original format: PDF
Text language: English
Keywords
citation analysis; publishing; query processing; regression analysis; baseline system; citation patterns; computer science domain; future citation count prediction; huge dataset; prediction task; publication; query paper; regression module; scientific article; scientific papers; stratified learning approach; two-stage prediction model; Abstracts; Accuracy; Computer science; Predictive models; Productivity; Support vector machines; Training;

Similar literature

1. Can we predict citation counts of environmental modelling papers? Fourteen bibliographic and categorical variables predict less than 30% of the variability in citation counts [J]. Robson Barbara J., Mousques Aurelie. Environmental Modelling & Software. 2016, issue JANa
2. Predicting citation counts based on deep neural network learning techniques [J]. Abrishami Ali, Aliakbary Sadegh. Journal of Informetrics. 2019, issue 2
3. Towards a stratified learning approach to predict future citation counts [C]. Chakraborty T., Kumar S., Goyal P. IEEE/ACM Joint Conference on Digital Libraries. 2014
4. Predict the Risk of Cardiovascular Diseases in the Future Using Deep Learning [D]. Jin, Ruitao. 2018
5. Can altmetrics predict future citation counts in critical care medicine publications? [O]. Daniel J Lehane, Colin S Black. 2021
6. Can altmetrics predict future citation counts in critical care medicine publications? [O]. Daniel J Lehane, Colin S Black. 2020
