Recency and quality-based ranking question in CQAs: A Stack Overflow case study

Leandro Amancio; Carina F. Dorneles; Daniel H. Dalip

Abstract

Recency ranking in Community-based Question Answering (CQA) refers to placing recent answers in the top positions of a list. Here, being recent does not depend on how new an answer's creation or editing date is, but on how current its content is. A good ranking should also consider the quality of the answers, since a current but low-quality answer may be useless. Similarly, a high-quality answer that presents adequate text and references but contains obsolete information may be of no value. Combining these two aspects (recency and quality) is crucial, because users usually hope for current solutions and need fast, easy access (top items in the ranking) to the best answers in order to solve their problems quickly. CQAs usually provide voting mechanisms so that users can indicate the best-quality answers. However, voting does not account for the recency of the answers, and it is a slow and subjective process that does not keep up with the dynamism of new content. Therefore, we propose an automatic approach that considers the answer's recency, in addition to its quality, when generating the ranking. We use textual and non-textual features that indicate the quality and recency of the answers, extracted from the users' answers in the CQA environment as a whole. In our approach, quality is used to classify the answers as good or poor using a threshold value, generating two sets of answers: high quality and low quality. Both sets are then sorted by recency. Finally, the sets are concatenated to produce the final ranking, so that the best and most current answers are in the top positions. To verify the effectiveness of our proposal, we performed a case study on the Stack Overflow CQA with a set of experiments using different combinations of features and different learning-to-rank algorithms.
Our main contributions are: (1) an approach to ranking the answers in a question dataset based on the recency and quality of each answer; (2) a thorough evaluation of 9 learning-to-rank algorithms, showing that Coordinate Ascent and LambdaMART perform best in this task; (3) a feature analysis showing that features related to the age of the response help improve ranking performance when recency and quality are both taken into account. Furthermore, as far as we know, this is the first work that considers the recency of an answer in this task.
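
The split, sort, and concatenate procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Answer` class, the `quality` and `recency` scores, the default threshold of 0.5, and the use of `>=` at the boundary are all assumptions made for illustration. In the paper these scores would come from models learned over textual and non-textual features; here they are simply given.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    answer_id: int
    quality: float  # hypothetical quality score, higher is better
    recency: float  # hypothetical recency score, higher means more current content


def rank_answers(answers, quality_threshold=0.5):
    """Split answers by a quality threshold, sort each set by recency,
    then concatenate with the high-quality set first."""
    # Assumed boundary rule: scores equal to the threshold count as high quality.
    high = [a for a in answers if a.quality >= quality_threshold]
    low = [a for a in answers if a.quality < quality_threshold]

    # Most current answers first within each set.
    high.sort(key=lambda a: a.recency, reverse=True)
    low.sort(key=lambda a: a.recency, reverse=True)

    return high + low


# Made-up example data, for illustration only.
answers = [
    Answer(1, quality=0.9, recency=0.2),
    Answer(2, quality=0.3, recency=0.95),
    Answer(3, quality=0.7, recency=0.8),
    Answer(4, quality=0.4, recency=0.1),
]
ranked_ids = [a.answer_id for a in rank_answers(answers)]
print(ranked_ids)  # [3, 1, 2, 4]
```

In this example, answer 2 has the most current content but falls below the assumed threshold, so it is placed after both high-quality answers. Within the high-quality set, answer 3 precedes answer 1 because its content is more current.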


Bibliographic details

Source
Information Processing & Management | 2021, Issue 4 | 102552.1-102552.18 | 18 pages
Authors
Leandro Amancio; Carina F. Dorneles; Daniel H. Dalip;
Affiliations

Departamento de Informatica e Estatistica - Universidade Federal de Santa Catarina Florianopolis Brazil;

Departamento de Informatica e Estatistica - Universidade Federal de Santa Catarina Florianopolis Brazil;

Computing Department - Centro Federal de Educacao Tecnologica de Minas Gerais Belo Horizonte Minas Gerais Brazil;

Original format: PDF
Language: English
Keywords
Community-based question answering; CQA ranking; Recency ranking; Quality ranking; Learning to rank; Recency features; Quality features; Textual features; Non-textual features;

