Main contribution: proposes GAP (Graded Average Precision), a ranking metric that extends MAP so it can be used in graded relevance domains.
A few points to note in understanding GAP (a concrete sketch follows these lists):

- GAP generalizes AP to multi-graded relevance and inherits AP's desirable characteristics: a probabilistic interpretation, an approximation of the area under a graded precision-recall curve, and a justification in terms of a simple but plausible user model.
- In the user model, each user has a relevance threshold: with probability g_i a user treats only documents of grade i or higher as relevant. With a single relevance grade, GAP reduces to ordinary AP.

A few takeaways:

- GAP can reliably be used as an objective metric in learning to rank: optimizing for GAP with SoftRank and LambdaRank produces better-performing ranking functions than tuning for AP or NDCG, even when AP or NDCG is the test metric.
- The paper also evaluates GAP in terms of informativeness and discriminative power, the usual criteria for comparing effectiveness metrics.
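To make the threshold user model concrete, below is a minimal Python sketch of GAP. Under my reading of the paper's definition, GAP is the ratio of the expected AP numerator to the expected number of relevant documents when the user's grade threshold is drawn with probabilities g_i; the function name `gap`, the helper `p_rel`, and the data layout are illustrative choices, not the authors' code.

```python
def gap(ranked_grades, all_grades, g):
    """Graded Average Precision (illustrative sketch, not the authors' code).

    ranked_grades: relevance grades of the ranked list, top first (0 = non-relevant).
    all_grades:    grades of every judged document for the query (denominator).
    g:             dict mapping grade i >= 1 to the probability g_i that a user's
                   relevance threshold is exactly grade i (probabilities sum to 1).
    """
    # P(a document of grade r clears the sampled threshold) = sum of g_i for i <= r.
    def p_rel(r):
        return sum(p for i, p in g.items() if i <= r)

    # Numerator: expected AP mass. For every pair of ranks m <= n, add the
    # probability that both documents clear the threshold, weighted by 1/n.
    num = 0.0
    for n, grade_n in enumerate(ranked_grades, start=1):
        for grade_m in ranked_grades[:n]:
            num += p_rel(min(grade_m, grade_n)) / n

    # Denominator: expected number of relevant documents under the threshold.
    den = sum(p_rel(r) for r in all_grades)
    return num / den if den else 0.0
```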
| Field | Content |
| --- | --- |
| Title | Extending Average Precision to Graded Relevance Judgments |
| Authors | Stephen E. Robertson; Evangelos Kanoulas; Emine Yilmaz |
| Year | 2010 |
| Keywords | information retrieval, effectiveness metrics, average precision, graded relevance, learning to rank, GAP |
| Abstract | Evaluation metrics play a critical role both in the context of comparative evaluation of the performance of retrieval systems and in the context of learning-to-rank (LTR) as objective functions to be optimized. Many different evaluation metrics have been proposed in the IR literature, with average precision (AP) being the dominant one due to a number of desirable properties it possesses. However, most of these measures, including average precision, do not incorporate graded relevance. In this work, we propose a new measure of retrieval effectiveness, the Graded Average Precision (GAP). GAP generalizes average precision to the case of multi-graded relevance and inherits all the desirable characteristics of AP: it has a nice probabilistic interpretation, it approximates the area under a graded precision-recall curve, and it can be justified in terms of a simple but moderately plausible user model. We then evaluate GAP in terms of its informativeness and discriminative power. Finally, we show that GAP can reliably be used as an objective metric in learning to rank by illustrating that optimizing for GAP using SoftRank and LambdaRank leads to better performing ranking functions than the ones constructed by algorithms tuned to optimize for AP or NDCG, even when using AP or NDCG as the test metrics. |
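As a quick sanity check on the sketch above (with hypothetical data): when there is a single relevance grade and g = {1: 1.0}, GAP should reproduce ordinary binary AP.

```python
# Hypothetical data: binary relevance, single grade, threshold probability 1.
ranked = [1, 0, 1, 0, 1]        # grades of the ranked list, top first
judged = [1, 1, 1, 0, 0, 0]     # grades of all judged documents for the query
print(gap(ranked, judged, {1: 1.0}))  # binary AP = (1/1 + 2/3 + 3/5) / 3 ≈ 0.756

# Graded case: grades 1 and 2; a user's threshold is grade 2 with probability 0.7.
print(gap([2, 0, 1], [2, 1, 1, 0], {1: 0.3, 2: 0.7}))
```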