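Core idea of the paper: first obtain the latent factor matrices (vectors) for users and items via matrix factorization, then build content-based feature vectors for the items from their side information, and finally learn a mapping between the latent factor vectors and the content-based vectors, which is what makes the latent factors interpretable.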
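Procedure:
(1) Run standard matrix factorization to obtain U and V.
(2) Build the content-based item feature matrix A.
(3) Learn the mapping functions between V and A, V_if = F_f(A_i) (note that each latent factor has its own mapping function). The error for item i is ERROR = sum_f (V_if - F_f(A_i))^2, where f indexes latent factors and i indexes items.
(4) Recommendations can then be made from the mapping functions F_f and U alone (V is not used at recommendation time, which is why this is called a shadow model).
(5) Two kinds of error are compared: the error between the ratings predicted by the shadow model and the ratings predicted by the original model, and the error between the V predicted by F_f and the original V.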
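The five steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the paper's implementation: the ratings and the feature matrix A are synthetic, truncated SVD stands in for whatever matrix factorization is actually used, and each F_f is assumed to be a linear least-squares map (the note does not say which model class the authors fit).

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, n_factors, n_features = 50, 40, 3, 6

# Step 2 (assumed data): interpretable item features A, shape (items, features).
A = rng.normal(size=(n_items, n_features))

# Synthetic ratings whose item factors depend partly on A, so a mapping exists.
true_V = A @ rng.normal(size=(n_features, n_factors))
true_V += 0.1 * rng.normal(size=(n_items, n_factors))
true_U = rng.normal(size=(n_users, n_factors))
R = true_U @ true_V.T

# Step 1: factorize R ~ U V^T (truncated SVD as a stand-in for any MF method).
P, s, Qt = np.linalg.svd(R, full_matrices=False)
U = P[:, :n_factors] * s[:n_factors]   # user factors, shape (users, factors)
V = Qt[:n_factors].T                   # item factors, shape (items, factors)

# Step 3: fit one map F_f per latent factor f. With a 2-D target, lstsq solves
# each column independently, so column f of W holds the coefficients of F_f.
A1 = np.hstack([A, np.ones((n_items, 1))])   # append an intercept column
W, *_ = np.linalg.lstsq(A1, V, rcond=None)
V_hat = A1 @ W                               # V predicted from features only

# Step 4: the shadow model predicts ratings from U and F_f(A), never from V.
R_model = U @ V.T
R_shadow = U @ V_hat.T

# Step 5: the two errors mentioned in the note.
factor_mse = np.mean((V - V_hat) ** 2)            # F_f(A) versus original V
shadow_mse = np.mean((R_model - R_shadow) ** 2)   # shadow versus original model
print(f"factor MSE = {factor_mse:.4f}, shadow-vs-model MSE = {shadow_mse:.4f}")
```

A nonlinear regressor could be substituted for the least-squares fit without changing the rest of the pipeline, since only V_hat is passed on to the shadow model.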
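On using QII to determine feature importance:
The importance of a feature for the final prediction can also be estimated experimentally: hold the other features fixed, randomly sample a value for the feature in question (from its marginal distribution?), and compute the expected difference between the output on the new feature vector and the output on the original one. That expectation characterizes the importance of the feature.
The drawback of this approach is that it assumes the features are independent of one another.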
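The sampling procedure described above can be sketched as follows. This is a simplified illustration rather than the exact QII definition: the function name `feature_influence`, the toy `predict` model, and the use of the mean absolute difference as the measure are all assumptions made for the example. Replacement values are drawn by resampling the feature's own column, which approximates sampling from its marginal distribution.

```python
import numpy as np

def feature_influence(predict, X, feature, n_draws=200, seed=0):
    """Estimate how much the output changes when one feature is resampled.

    All other features stay fixed; the chosen column is replaced by values
    drawn (with replacement) from that column's empirical marginal.
    """
    rng = np.random.default_rng(seed)
    base = predict(X)
    total = 0.0
    for _ in range(n_draws):
        X_new = X.copy()
        X_new[:, feature] = rng.choice(X[:, feature], size=len(X))
        total += np.mean(np.abs(predict(X_new) - base))
    return total / n_draws

# Hypothetical model: feature 0 matters most, feature 2 is ignored entirely.
def predict(X):
    return 2.0 * X[:, 0] + 0.5 * X[:, 1]

X = np.random.default_rng(1).normal(size=(500, 3))
scores = [feature_influence(predict, X, f) for f in range(3)]
print(scores)
```

Because each column is resampled independently of the others, correlated features can yield feature vectors that never occur in the data, which is the independence assumption noted above.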
Also:
The paper studies feature importance through simulation experiments (an approach worth borrowing).
Title: Latent Factor Interpretations for Collaborative Filtering
Authors: Anupam Datta, Sophia Kovaleva, Piotr Mardziel, Shayak Sen
Year: 2018
Keywords: latent factor interpretation; interpretation; shadow model; Quantitative Input Influence; QII; interpreting; simulation experiment; simulation; mapping
Abstract:
Many machine learning systems utilize latent factors as internal representations for making predictions. Since these latent factors are largely uninterpreted, however, predictions made using them are opaque. Collaborative filtering via matrix factorization is a prime example of such an algorithm that uses uninterpreted latent features, and yet has seen widespread adoption for many recommendation tasks. We present Latent Factor Interpretation (LFI), a method for interpreting models by leveraging interpretations of latent factors in terms of human-understandable features. The interpretation of latent factors can then replace the uninterpreted latent factors, resulting in a new model that expresses predictions in terms of interpretable features. This new model can then be interpreted using recently developed model explanation techniques. In this paper, we develop LFI for collaborative filtering based recommender systems. We illustrate the use of LFI interpretations on the MovieLens dataset, integrating auxiliary features from IMDB and DB tropes, and show that latent factors can be predicted with sufficient accuracy for replicating the predictions of the true model.