Authors | Tie-Yan Liu and Hang Li | ||||||||||
Publication year | 2010 | Created | 2017-05-10 | ||||||||
Keywords | LETOR; L2R; IR | ||||||||||
Abstract | Mainly introduces the application of L2R in the IR field, in particular how it differs from the application of L2R in RecSys. Information Retrieval with Learning to Rank (problem setting) |
Authors | Steffen Rendle, Christoph Freudenthaler, Zeno Gantner and Lars Schmidt-Thieme | ||||||||||
Publication year | 2009 | Created | 2017-05-08 | ||||||||
Keywords | BPR; pairwise; suppressing overfitting makes ranking possible | ||||||||||
Abstract | Item recommendation is the task of predicting a personalized ranking on a set of items (e.g. websites, movies, products). In this paper, we investigate the most common scenario with implicit feedback (e.g. clicks, purchases). There are many methods for item recommendation from implicit feedback like matrix factorization (MF) or adaptive k-nearest-neighbor (kNN). Even though these methods are designed for the item prediction task of personalized ranking, none of them is directly optimized for ranking. In this paper we present a generic optimization criterion BPR-Opt for personalized ranking that is the maximum posterior estimator derived from a Bayesian analysis of the problem. We also provide a generic learning algorithm for optimizing models with respect to BPR-Opt. The learning method is based on stochastic gradient descent with bootstrap sampling. We show how to apply our method to two state-of-the-art recommender models: matrix factorization and adaptive kNN. Our experiments indicate that for the task of personalized ranking our optimization method outperforms the standard learning techniques for MF and kNN. The results show the importance of optimizing models for the right criterion. |
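The learning procedure summarized in the abstract above (stochastic gradient descent over bootstrap-sampled triples, applied to matrix factorization) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `train_bpr_mf`, the hyperparameter defaults, the single shared regularization constant, and the uniform sampling of negatives are all assumptions made for the example.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def train_bpr_mf(interactions, n_users, n_items, k=8, lr=0.05, reg=0.01,
                 n_steps=20000, seed=0):
    """Fit user/item factors by SGD on ln sigma(x_ui - x_uj) with L2 penalty.

    interactions: dict mapping user id -> set of item ids with implicit feedback.
    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    # Only users with at least one observed and one unobserved item yield triples.
    pos = {u: sorted(items) for u, items in interactions.items()
           if 0 < len(items) < n_items}
    users = sorted(pos)
    for _ in range(n_steps):
        # Draw a random triple (u, i, j): i observed for u, j not observed.
        u = rng.choice(users)
        i = rng.choice(pos[u])
        j = rng.randrange(n_items)
        while j in interactions[u]:
            j = rng.randrange(n_items)
        x_uij = sum(P[u][f] * (Q[i][f] - Q[j][f]) for f in range(k))
        g = sigmoid(-x_uij)  # derivative of ln sigma(x) with respect to x
        for f in range(k):
            pu, qi, qj = P[u][f], Q[i][f], Q[j][f]
            P[u][f] += lr * (g * (qi - qj) - reg * pu)
            Q[i][f] += lr * (g * pu - reg * qi)
            Q[j][f] += lr * (-g * pu - reg * qj)
    return P, Q
```

After training, the score of item `i` for user `u` is the dot product of `P[u]` and `Q[i]`, and a personalized ranking is obtained by sorting items by that score.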
Authors | Jerome H. Friedman | ||||||||||
Publication year | 2001 | Created | 2017-04-23 | ||||||||
Keywords | 4000 citations; gradient boosting | ||||||||||
Abstract | Function estimation/approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest-descent minimization. A general gradient descent “boosting” paradigm is developed for additive expansions based on any fitting criterion. Specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression, and multiclass logistic likelihood for classification. Special enhancements are derived for the particular case where the individual additive components are regression trees, and tools for interpreting such “TreeBoost” models are presented. Gradient boosting of regression trees produces competitive, highly robust, interpretable procedures for both regression and classification, especially appropriate for mining less than clean data. Connections between this approach and the boosting methods of Freund and Schapire and Friedman, Hastie and Tibshirani are discussed. |
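To make the stagewise additive expansion in the abstract above concrete, here is a small sketch of gradient boosting for the least-squares case only, where the negative gradient is simply the residual. It assumes a single numeric feature, depth-one regression trees (stumps) as the additive components, and a fixed shrinkage factor; the function names and defaults are hypothetical and chosen for illustration.

```python
def fit_stump(xs, targets):
    """Best single-split regression stump on one feature under squared error."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    mean_all = sum(targets) / len(targets)
    # Start from the constant predictor (threshold +inf sends everything left).
    best = (sum((t - mean_all) ** 2 for t in targets),
            float("inf"), mean_all, mean_all)
    for cut in range(1, len(order)):
        left, right = order[:cut], order[cut:]
        if xs[left[-1]] == xs[right[0]]:
            continue  # cannot split between identical feature values
        ml = sum(targets[i] for i in left) / len(left)
        mr = sum(targets[i] for i in right) / len(right)
        sse = (sum((targets[i] - ml) ** 2 for i in left)
               + sum((targets[i] - mr) ** 2 for i in right))
        if sse < best[0]:
            best = (sse, (xs[left[-1]] + xs[right[0]]) / 2.0, ml, mr)
    return best[1:]  # (threshold, left_value, right_value)


def stump_predict(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def gradient_boost(xs, ys, n_rounds=100, shrinkage=0.1):
    """Least-squares boosting: each round fits a stump to the current residuals."""
    f0 = sum(ys) / len(ys)  # initial constant model
    preds = [f0] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For the loss 0.5 * (y - F)^2 the negative gradient is y - F.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + shrinkage * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
    return f0, shrinkage, stumps


def boost_predict(model, x):
    f0, shrinkage, stumps = model
    return f0 + shrinkage * sum(stump_predict(s, x) for s in stumps)
```

Other losses named in the abstract (least absolute deviation, Huber-M) would change how the pseudo-residuals and leaf values are computed; they are not covered by this sketch.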
Authors | Thomas G Dietterich | ||||||||||
Publication year | 2000 | Created | 2017-04-23 | ||||||||
Keywords | Ensemble; 3000 citations; ensemble learning; boosting; bootstrap; bagging (bootstrap aggregating); gradient boosting; Adaboost sampling method | ||||||||||
Abstract | Ensemble methods are learning algorithms that construct a set of classifiers and then classify new data points by taking a (weighted) vote of their predictions. The original ensemble method is Bayesian averaging, but more recent algorithms include error-correcting output coding, Bagging, and boosting. This paper reviews these methods and explains why ensembles can often perform better than any single classifier. Some previous studies comparing ensemble methods are reviewed, and some new experiments are presented to uncover the reasons that Adaboost does not overfit rapidly. |
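As a small illustration of the "construct a set of classifiers, then vote" idea from the abstract above, the sketch below implements bagging with an unweighted majority vote. The base learner (a one-feature threshold classifier), the binary 0/1 labels, and all function names are assumptions made for this example; the abstract itself covers a broader family, including weighted votes.

```python
import random
from collections import Counter


def train_stump(points):
    """Threshold classifier on one feature minimizing training errors.

    points: list of (x, label) pairs with labels 0 or 1.
    Returns (threshold, label_above); x <= threshold gets the other label.
    """
    xs = sorted({x for x, _ in points})
    candidates = [xs[0] - 1.0] + [(a + b) / 2.0 for a, b in zip(xs, xs[1:])]
    best = None
    for t in candidates:
        for label_above in (0, 1):
            errors = sum((label_above if x > t else 1 - label_above) != y
                         for x, y in points)
            if best is None or errors < best[0]:
                best = (errors, t, label_above)
    return best[1], best[2]


def classify_stump(stump, x):
    threshold, label_above = stump
    return label_above if x > threshold else 1 - label_above


def bagging(points, n_models=25, seed=0):
    """Train each model on a bootstrap resample (same size, with replacement)."""
    rng = random.Random(seed)
    return [train_stump([rng.choice(points) for _ in points])
            for _ in range(n_models)]


def vote(models, x):
    """Unweighted majority vote over the ensemble."""
    return Counter(classify_stump(m, x) for m in models).most_common(1)[0][0]
```

Using an odd number of models avoids ties in the binary case.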
Authors | Qiang Liu, Shu Wu, Liang Wang | ||||||||||
Publication year | 2015 | Created | 2017-04-18 | ||||||||
Keywords | MF; Tensor; nlp | ||||||||||
Abstract | With rapid growth of information on the Internet, recommender systems become fundamental for helping users alleviate the problem of information overload. Since contextual information can be used as a significant factor in modeling user behavior, various context-aware recommendation methods are proposed. However, the state-of-the-art context modeling methods treat contexts as other dimensions similar to the dimensions of users and items, and cannot capture the special semantic operation of contexts. On the other hand, some works on multi-domain relation prediction can be used for the context-aware recommendation, but they have problems in generating recommendation under a large amount of contextual information. In this work, we propose Contextual Operating Tensor (COT) model, which represents the common semantic effects of contexts as a contextual operating tensor and represents a context as a latent vector. Then, to model the semantic operation of a context combination, we generate contextual operating matrix from the contextual operating tensor and latent vectors of contexts. Thus latent vectors of users and items can be operated by the contextual operating matrices. Experimental results show that the proposed COT model yields significant improvements over the competitive compared methods on three typical datasets, i.e., Food, Adom and Movielens-1M datasets. |
Authors | Paul Mangiameli | ||||||||||
Publication year | 2004 | Created | 2017-04-14 | ||||||||
Keywords | Model selection; Medical diagnosis; Neural networks; Bootstrap aggregating models; Diverse ensembles; Baseline ensembles; Bagging models | ||||||||||
Abstract | In this paper, we examine the model selection decision for a medical diagnostic decision support system (MDSS). Our purpose in doing this is to understand how model selection affects the accuracy of the decision support system. We explore two related research questions: (1) Do ensembles of models, acting as a single decision maker, perform more accurately than single models; and (2) How does model diversity affect the accuracy of the ensembles? Specifically, we compare 23 single models and bootstrap aggregating (i.e., bagging) models for their predictive abilities across five diverse medical data sets. We are able to reach important conclusions about our research objectives. Ensembles are more accurate than single models in their predictive ability. The best ensemble model achieves an error level significantly lower than the error of the best single model for four of the five medical applications analyzed. The magnitude of the error reduction ranges from 6.4% to 17.5%. Also, when designing an ensemble for an MDSS, the decision to diversify the model selection should be guided by the relationship between model instability and generalization error for the population of models under consideration. |
Authors | Xiangnan He; Hanwang Zhang | ||||||||||
Publication year | 2016 | Created | 2017-04-12 | ||||||||
Keywords | Matrix Factorization, Implicit Feedback, Item Recommendation, Online Learning, ALS, Coordinate Descent | ||||||||||
Abstract | This paper contributes improvements on both the effectiveness and efficiency of Matrix Factorization (MF) methods for implicit feedback. We highlight two critical issues of existing works. First, due to the large space of unobserved feedback, most existing works resort to assign a uniform weight to the missing data to reduce computational complexity. However, such a uniform assumption is invalid in real-world settings. Second, most methods are also designed in an offline setting and fail to keep up with the dynamic nature of online data. We address the above two issues in learning MF models from implicit feedback. We first propose to weight the missing data based on item popularity, which is more effective and flexible than the uniform-weight assumption. However, such a non-uniform weighting poses efficiency challenge in learning the model. To address this, we specifically design a new learning algorithm based on the element-wise Alternating Least Squares (eALS) technique, for efficiently optimizing a MF model with variably-weighted missing data. We exploit this efficiency to then seamlessly devise an incremental update strategy that instantly refreshes a MF model given new feedback. Through comprehensive experiments on two public datasets in both offline and online protocols, we show that our eALS method consistently outperforms state-of-the-art implicit MF methods. Our implementation is available at https://github.com/hexiangnan/sigir16-eals. |
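The abstract above says only that missing data is weighted by item popularity; it does not give the formula. One plausible form, shown here purely as an assumption, makes each item's missing-data weight proportional to its interaction count raised to a smoothing exponent, scaled so the weights sum to an overall budget. The names `popularity_weights`, `c0`, and `alpha` are hypothetical.

```python
def popularity_weights(item_counts, c0=1.0, alpha=0.5):
    """Per-item confidence weights for unobserved entries.

    item_counts: number of observed interactions per item.
    c0: total weight budget shared by all items (assumed parameter).
    alpha: exponent controlling how strongly popularity matters (assumed);
           alpha = 0 recovers a uniform weighting.
    """
    powered = [count ** alpha for count in item_counts]
    total = sum(powered)
    if total == 0:
        raise ValueError("at least one item needs a nonzero count")
    return [c0 * p / total for p in powered]
```

With counts `[100, 25, 1]`, `c0=2.0` and `alpha=0.5`, the most popular item receives the largest share of the budget and the weights add up to `c0`.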
Authors | Xia Ning and George Karypis | ||||||||||
Publication year | 2011 | Created | 2017-04-12 | ||||||||
Keywords | Top-N Recommender Systems, Sparse Linear Methods, l1-norm Regularization | ||||||||||
Abstract | This paper focuses on developing effective and efficient algorithms for top-N recommender systems. A novel Sparse LInear Method (SLIM) is proposed, which generates top-N recommendations by aggregating from user purchase/rating profiles. A sparse aggregation coefficient matrix W is learned from SLIM by solving an l1-norm and l2-norm regularized optimization problem. W is demonstrated to produce high-quality recommendations and its sparsity allows SLIM to generate recommendations very fast. A comprehensive set of experiments is conducted by comparing the SLIM method and other state-of-the-art top-N recommendation methods. The experiments show that SLIM achieves significant improvements both in run time performance and recommendation quality over the best existing methods. |
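The following sketch shows one way the sparse aggregation matrix W described above could be learned and used. It solves, column by column, a least-squares problem with l1 and l2 penalties by coordinate descent. The non-negativity and zero-diagonal constraints, the solver choice, the penalty defaults, and the function names are assumptions for this illustration and are not taken from the abstract.

```python
def slim_fit(A, l1=0.1, l2=0.1, n_sweeps=50):
    """Learn an item-item matrix W so that A is approximated by A times W.

    A: user-item matrix as a list of rows (0/1 purchases or ratings).
    Assumed constraints: W >= 0 and diag(W) = 0; l2 must be positive.
    """
    n_users, n_items = len(A), len(A[0])
    cols = [[A[u][i] for u in range(n_users)] for i in range(n_items)]
    sq = [sum(v * v for v in c) for c in cols]
    W = [[0.0] * n_items for _ in range(n_items)]
    for j in range(n_items):
        resid = list(cols[j])  # a_j minus A w_j, with w_j starting at zero
        for _ in range(n_sweeps):
            for i in range(n_items):
                if i == j:
                    continue  # zero diagonal: an item may not explain itself
                old = W[i][j]
                rho = sum(a * r for a, r in zip(cols[i], resid)) + sq[i] * old
                # Soft-threshold by the l1 penalty, clip at zero, shrink by l2.
                new = max(0.0, rho - l1) / (sq[i] + l2)
                if new != old:
                    delta = new - old
                    for u in range(n_users):
                        resid[u] -= cols[i][u] * delta
                    W[i][j] = new
    return W


def slim_recommend(A, W, user, n):
    """Top-n unseen items for a user, scored by aggregating the user's profile."""
    n_items = len(W)
    scores = {}
    for j in range(n_items):
        if A[user][j] == 0:
            scores[j] = sum(A[user][i] * W[i][j] for i in range(n_items))
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

Because most entries of W are driven exactly to zero by the l1 penalty, scoring only has to touch the few nonzero coefficients in each column, which is the source of the fast recommendation time mentioned in the abstract.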
Authors | George Karypis; Xia Ning | ||||||||||
Publication year | 2012 | Created | 2017-04-12 | ||||||||
Keywords | detailed experiments; RecSys; metrics; regularization norm; MF; linear model; implicit feedback | ||||||||||
Abstract | The increasing amount of side information associated with the items in E-commerce applications has provided a very rich source of information that, once properly exploited and incorporated, can significantly improve the performance of the conventional recommender systems. This paper focuses on developing effective algorithms that utilize item side information for top-N recommender systems. A set of sparse linear methods with side information (SSLIM) is proposed, which involve a regularized optimization process to learn a sparse aggregation coefficient matrix based on both user-item purchase profiles and item side information. This aggregation coefficient matrix is used within an item-based recommendation framework to generate recommendations for the users. Our experimental results demonstrate that SSLIM outperforms other methods in effectively utilizing side information and achieving performance improvement. |
Authors | Evangelia Christakopoulou and George Karypis | ||||||||||
Publication year | 2016 | Created | 2017-04-12 | ||||||||
Keywords | RecSys 2016 best paper; SLIM; global-local; GLSLIM | ||||||||||
Abstract | Item-based approaches based on SLIM (Sparse LInear Methods) have demonstrated very good performance for top-N recommendation; however they only estimate a single model for all the users. This work is based on the intuition that not all users behave in the same way – instead there exist subsets of like-minded users. By using different item-item models for these user subsets, we can capture differences in their preferences and this can lead to improved performance for top-N recommendations. In this work, we extend SLIM by combining global and local SLIM models. We present a method that computes the prediction scores as a user-specific combination of the predictions derived by a global and local item-item models. We present an approach in which the global model, the local models, their user-specific combination, and the assignment of users to the local models are jointly optimized to improve the top-N recommendation performance. Our experiments show that the proposed method improves upon the standard SLIM model and outperforms competing top-N recommendation approaches. |
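The "user-specific combination" of global and local predictions described above can be illustrated with a simple scoring function. The convex-combination form with a single weight `g_u` between 0 and 1 is an assumption for this sketch, as are the function and parameter names; the joint optimization of the models, weights, and user assignments is not shown.

```python
def glslim_score(user_row, S_global, S_local, g_u, item):
    """Blend global and local item-item predictions for one user and item.

    user_row: the user's purchase/rating vector.
    S_global: item-item matrix shared by all users.
    S_local: item-item matrix of the user subset this user is assigned to.
    g_u: user-specific weight in [0, 1] (assumed form); 1 means global only.
    """
    global_part = sum(r * S_global[l][item] for l, r in enumerate(user_row))
    local_part = sum(r * S_local[l][item] for l, r in enumerate(user_row))
    return g_u * global_part + (1.0 - g_u) * local_part
```

Setting `g_u` to 1 for every user reduces this to a single global item-item model, which matches the standard SLIM setting the abstract contrasts against.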