Core idea of the paper: use spectral clustering to group users (user community detection), then recommend to each user based on the other users in the same cluster together with the user's own history.
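Some details:
(1) Individual + group knowledge, combined by linear weighting.
(2) Spectral clustering can be cast, approximately, as an optimization problem whose solution yields the eigenvectors (ref: spectral clustering). Because it has this learning formulation, the objective can be modified to incorporate additional learnable terms or constraints.
(3) Building on the property in (2) and on the distance between domains computed with a Grassmann manifold measure, user clustering can be carried out jointly across several domains (cross-domain).
(4) The result learned in (3) is then used to fuse or adjust the relationships among user communities, both within each domain and across domains.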
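Point (1) can be illustrated with a minimal sketch of linear weighting. The function names, the weight `alpha`, and the score vectors are hypothetical and only illustrate the idea; the paper's actual scoring model may differ.

```python
import numpy as np

def blend_scores(individual, group, alpha=0.7):
    """Linearly combine a user's own category scores with their community's scores.

    alpha is a hypothetical mixing weight: 1.0 uses only individual knowledge,
    0.0 uses only group knowledge.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    individual = np.asarray(individual, dtype=float)
    group = np.asarray(group, dtype=float)
    return alpha * individual + (1.0 - alpha) * group

def top_k(scores, k=2):
    """Return the indices of the k highest-scoring categories."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    return order[:k].tolist()

# Example: three venue categories, scores from the user's own history
# and from the user's community (made-up numbers).
own = [0.2, 0.5, 0.3]
community = [0.6, 0.1, 0.3]
print(top_k(blend_scores(own, community, alpha=0.5), k=2))
```

In practice `alpha` would be tuned on held-out data; a larger group weight tends to diversify the recommendations at the cost of personalization.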
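Main takeaways worth revisiting:
(1) Using the Grassmann manifold to measure the relatedness between domains
(2) Spectral clustering formulated within a learning framework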
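To make takeaway (1) concrete, here is a minimal sketch of one common distance on the Grassmann manifold, the projection distance between two k-dimensional subspaces. Treating each domain as the subspace spanned by its spectral embedding is an assumption made for illustration, and whether the paper uses exactly this metric is not confirmed by these notes.

```python
import numpy as np

def orthonormal_basis(A):
    """Return an orthonormal basis (n x k) for the column space of A via QR."""
    q, _ = np.linalg.qr(np.asarray(A, dtype=float))
    return q

def projection_distance(U1, U2):
    """Projection distance between the subspaces spanned by U1 and U2.

    Both inputs are assumed to be n x k matrices with orthonormal columns.
    d^2 = k - ||U1^T U2||_F^2, which is 0 for identical subspaces and
    k for mutually orthogonal ones.
    """
    k = U1.shape[1]
    overlap = np.linalg.norm(U1.T @ U2, "fro") ** 2
    return float(np.sqrt(max(k - overlap, 0.0)))

# Example with two random 3-dimensional subspaces of R^10.
rng = np.random.default_rng(0)
U_a = orthonormal_basis(rng.normal(size=(10, 3)))
U_b = orthonormal_basis(rng.normal(size=(10, 3)))
print(projection_distance(U_a, U_b))
```

The distance depends only on the subspaces, not on the particular bases chosen, which is why it is suitable for comparing eigenvector embeddings that are defined only up to rotation.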
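Spectral clustering is a graph-based clustering method. It can recover clusters of arbitrary shape in the sample space, and its relaxed optimization problem is solved to a global optimum by eigendecomposition. The basic idea is to eigendecompose the similarity matrix of the samples and then cluster using the resulting eigenvectors. Once the similarity matrix has been built, the size of the eigenproblem therefore depends only on the number of samples, not on the dimensionality of their features.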
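As a rough sketch of this procedure, the code below builds the symmetric normalized Laplacian from a similarity matrix, takes the eigenvectors with the smallest eigenvalues (which minimize tr(UᵀLU) subject to UᵀU = I), and splits the graph in two by the sign of the second eigenvector. The two-way sign split stands in for the usual k-means step to keep the example short, and the toy graph is made up.

```python
import numpy as np

def spectral_embedding(W, k):
    """Return the k eigenvectors of the normalized Laplacian with smallest eigenvalues.

    W is a symmetric, non-negative similarity matrix; every node is assumed
    to have at least one positive-weight edge (no zero degrees).
    """
    W = np.asarray(W, dtype=float)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    # L_sym = I - D^{-1/2} W D^{-1/2}
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues come back in ascending order
    return vecs[:, :k]

def bipartition(W):
    """Split the graph into two clusters by the sign of the second eigenvector."""
    fiedler = spectral_embedding(W, 2)[:, 1]
    return (fiedler > 0).astype(int)

# Toy graph: two triangles joined by one weak edge between nodes 2 and 3.
W = np.zeros((6, 6))
for group in ([0, 1, 2], [3, 4, 5]):
    for i in group:
        for j in group:
            if i != j:
                W[i, j] = 1.0
W[2, 3] = W[3, 2] = 0.01
print(bipartition(W))
```

Which triangle receives label 0 or 1 is arbitrary, because an eigenvector's sign is not determined; only the grouping is meaningful. For more than two clusters, the rows of the embedding would typically be passed to k-means.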
Title | Cross-Domain Recommendation via Clustering on Multi-Layer Graphs
Authors | Aleksandr Farseev*, Ivan Samborskii*,**, Andrey Filchenkov**, Tat-Seng Chua*
Year | 2017
Keywords | Grassmann manifold; group knowledge; spectral clustering; SIGIR
Abstract |
Venue category recommendation is an essential application for the tourism and advertisement industries, wherein it may suggest attractive localities within close proximity to users’ current location. Considering that many adults use more than three social networks simultaneously, it is reasonable to leverage on this rapidly growing multi-source social media data to boost venue recommendation performance. Another approach to achieve higher recommendation results is to utilize group knowledge, which is able to diversify recommendation output. Taking into account these two aspects, we introduce a novel cross-network collaborative recommendation framework C³R, which utilizes both individual and group knowledge, while being trained on data from multiple social media sources. Group knowledge is derived based on new cross-source user community detection approach, which utilizes both inter-source relationship and the ability of sources to complement each other. To fully utilize multi-source multi-view data, we process user-generated content by employing state-of-the-art text, image, and location processing techniques. Our experimental results demonstrate the superiority of our multi-source framework over state-of-the-art baselines and different data source combinations. In addition, we suggest a new approach for automatic construction of inter-network relationship graph based on the data, which eliminates the necessity of having pre-defined domain knowledge.