第十讲-Recommender Systems
第十讲-Recommender Systems
程 健 研究员
jcheng@nlpr.ia.ac.cn
中国科学院自动化研究所
模式识别国家重点实验室
1
Index
➢ What is recommender system?
➢ Traditional Methods
➢ RS Systems
➢ Conclusion
2
Index
➢ What is recommender system?
➢ Traditional Methods
➢ RS Systems
➢ Conclusion
3
Everything is recommendation
Everything is personalized
京东商城
Everything is personalized
彼之砒霜 吾之蜜糖 6
Everything is personalized
• 视频网站
YouTube 土豆 Hulu 奇艺视频 等
• 电子商务网站
淘宝,亚马逊等
• 社交网站
Facebook 人人网 Twitter 微博 等
• ……
The Long Tail
⚫ Given:
➢ User model (e.g. ratings, preferences, demographics, situational context)
➢ Items (with or without description of item characteristics)
⚫ Calculate:
Rating Prediction
➢ Relevance score used for ranking
⚫ Target:
Top-N
➢ Rating Prediction & Top-N Recommendation
⚫ But:
➢ Remember that relevance might be context dependent
➢ Characteristics of the list itself might be important (diversity)
10
Performance Evaluation
⚫ Measures for rating prediction
11
Performance Evaluation
⚫ Measures for top-N recommendation
• NDCG(Normalized Discounted Cumulative Gain)
定义不唯一
Ideal DCG
• F1 Score
12
Index
➢ What is recommender system?
➢ Traditional Methods
➢ RS Systems
➢ Conclusion
13
A glance of Paradigms for RS
Personal Recommendation
Recommendation list
14
A glance of Paradigms for RS
Content-based: "Show me more of the
same what I've liked" Recommendation list
Item Attributes
15
A glance of Paradigms for RS
Collaborative: "Tell me what's popular Recommendation list
among my peers"
Recommender System
Community Data
16
A glance of Paradigms for RS
Hybrid: combinations of various inputs
Recommendation list
User profile &
contextual information Recommender System
Community Data
Item Attributes 17
Content-Based Recommendation
⚫ Recommendations based on content of items rather than on
other users’ opinions/interactions
⚫ Goal: recommend items similar to those the user liked
⚫ Common for recommending text-based products (web
pages, news messages)
⚫ Items to recommend are “described” by their associated
features (e.g. keywords)
⚫ User Model structured in a “similar” way as the content:
features/keywords more likely to occur in the preferred
documents (lazy approach)
⚫ The user model can be a classifier based on whatever
technique (Neural Networks, Naïve Bayes...)
18
Content-Based Recommendation
⚫ Content representation and item similarities
Express item features as:
➢ TF-IDF
➢ N-Gram
➢ LDA
➢ Word2Vec
19
Content-Based Recommendation
⚫ Pros:
➢ No need for data on other users: No cold-start or sparsity
➢ Able to recommend to users with unique tastes
➢ Able to recommend new and unpopular items
➢ Can provide explanations by listing content-features
⚫ Cons:
➢ Requires content that can be encoded as meaningful features
(difficult in some domains/catalogs)
➢ Users represented as learnable function of content features
➢ Difficult to implement serendipity
➢ Easy to overfit (e.g. for a user with few data points)
20
Collaborative Filtering
⚫ List of m Users and a list of n Items
22
Collaborative Filtering
⚫ memory-based CF
➢ User-based CF
➢ Item-based CF
⚫ model-based CF
➢ First develop a model of user
➢ Type of model:
1. Probabilistic (e.g. Bayesian Network)
2. Clustering
3. Rule-based approaches (e.g. Association
Rules)
4. Classification/Regression
5. … 23
User-based CF
The basic steps:
1. Identify set of ratings for the target/active user
2. Identify set of users most similar to the target/active user
according to a similarity function (neighborhood formation)
3. Identify the products these similar users liked
4. Generate a prediction
5. Based on this predicted rating recommend a set of top N
products
24
User-based CF
⚫ A collection of user and a collection of products
25
User-based CF Example
26
User-based CF Example
27
User-based CF Example
28
User-based CF Example
29
Item-based CF Example
The basic steps:
1. Look into the items the target user has rated
2. Compute how similar they are to the target item
3. Select k most similar items
4. Compute Prediction by taking weighted average on the
target user’s ratings on the most similar items
30
Item Similarity Computation
⚫ Similarity: find users who have rated items and apply a similarity
function to their ratings
⚫ Cosine-based Similarity (difference in rating scale between users
is not taken into account)
31
Item Similarity Computation
⚫ Alternative similarity metric
32
Model-based CF
Motivated by Netflix Prize (launched in Oct. 2006)
⚫ Task:
High quality recommendations
for cinematch (RMSE=0.9525)
⚫ Dataset:
users: 480,000
movies: 17,770
rates ratio <1%
33
Model-based CF
Motivated by Netflix Prize (launched in Oct. 2006)
⚫ Measure:
34
Model-based CF
35
Model-based CF
2009 Netflix Prize Results
⚫ Top 2 single algorithms:
SVD/MF - Prize RMSE: 0.8914
RBM - Prize RMSE: 0.8990
36
Matrix Factorization
⚫ Basic idea
item factor
5 3
4 1
5 1 𝑃 factor
𝑄𝑇
user
4 4 2
user factor matrix item factor matrix
5 3
4 1
5 1 𝑃 factor
𝑄𝑇
user
4 4 2
user factor matrix item factor matrix
5 3
4 1
5 1 𝑃 factor
𝑄𝑇
user
4 4 2
user factor matrix item factor matrix
5 3
4 1
5 1 𝑃 factor
𝑄𝑇
user
4 4 2
user factor matrix item factor matrix
40
Non-negative
Matrix Factorization
⚫ ‘Orthogonal NMF’ == ‘Kernel K-Means Clustering’
Orthogonal NMF
Proof
1. Kernel K-means clustering tries to minimize
41
Non-negative
Matrix Factorization
⚫ ‘Orthogonal NMF’ == ‘Kernel K-Means Clustering’
2. We write the NMF formulation as
Further transform to
42
SVD for Rating Prediction
However,
➢ Some items are significantly higher rated…
➢ Some users rate substantially lower…
➢ All Ratings are high…
Thus,
➢ Add item offset…
➢ Add user offset…
➢ Add global offset…
⚫ Baseline (bias) (user & item deviation from
average)
⚫ Predict rating as
43
SVD for Rating Prediction
⚫ In order to prevent over-fitted problem, we add some regularized
terms, such as:
Where
are three item factor vectors
items rated by user u
items for which the user has given implicit preference
Koren Y, Bell R, Volinsky C. Matrix factorization techniques for recommender systems. Computer, 2009.
44
SVD for Rating Prediction
Further
studies
Koren Y, Bell R, Volinsky C. Matrix factorization techniques for recommender systems. Computer, 2009.
45
Restricted Boltzmann Machines
⚫ Each unit is a state that can be active or not active
⚫ Each input to a unit is associated to a weight
⚫ The transfer function calculates a score for every unit
based on the weighted sum of inputs
⚫ Score is passed to the activation function that calculates
the probability of the unit to be active
One pass
46
RBM for CF
⚫ Each visible unit = an item
⚫ Num of hidden units is a parameter
⚫ In training phase, for each user:
⚫ In prediction phase:
47
Salakhutdinov R, Mnih A, Hinton G. Restricted Boltzmann machines for collaborative filtering. ICML, 2007
Probabilistic Matrix Factorization
⚫ From the view of probability to predict ratings, we assume
factorized vectors of users and items are in line with the
Gaussian distribution, user’s preference for items is a
combination of the probability of a series of problems, such as
where
48
Salakhutdinov R, Mnih A. Probabilistic matrix factorization. NIPS, 2008.
Probabilistic Matrix Factorization
⚫ By adding regularized terms, the formulation can be shown as:
Ning X, Karypis G. Slim: Sparse linear methods for top-n recommender systems. ICDM 2011
53
Clustering Based CF
⚫ Goal: cluster users and compute per-cluster “typical”
preferences
⚫ Users receive recommendations computed at the cluster
level
𝐶1 𝐶2
𝐶3 …
54
LSH for clustering
⚫ Method for grouping similar items in highly dimensional spaces
⚫ Find a hashing function s.t. similar items are grouped in the same
buckets
⚫ Main application is Nearest-neighbors
➢ Hashing function is found iteratively by concatenating random
hashing functions
➢ Addresses one of NN main concerns: performance
55
Classifiers for CF
⚫ Classifiers are general computational models trained
using positive and negative examples
⚫ They may take in inputs:
➢ Vector of item features (action / adventure)
➢ Preferences of customers (like action / adventure)…
➢ Relations among item
⚫ E.g. Logistic Regression, Bayesian Networks, Support Vector
Machines, Decision Trees, etc...
⚫ Pros:
➢ Versatile
➢ Can be combined with other methods to improve accuracy
⚫ Cons:
➢ Need a relevant training set
➢ May overfit
56
Limitations of CF
57
Hybrid Approaches
⚫ Content–based recommendation with Bayesian classifier
⚫ Collaborative is standard using Pearson correlation:
⚫ Collaboration via content uses the content-based user profiles
58
Hybridization Methods
59
Social Recommendations
⚫ A social recommender system recommends items that are
“popular” in the social proximity of the user
⚫ Social proximity = trust (can also be topic-specific)
⚫ Given two individuals - the source (node A) and sink (node C) -
derive how much the source should trust the sink.
⚫ Algorithm: Advogato, Appleseed, MoleTrust, TidalTrust
60
Social Recommendations
61
Social Recommendations
62
Social Recommendations
63
Social Recommendations
64
Factorization Machines
⚫ FM becomes as:
➢ Traditional Methods
➢ RS Systems
➢ Conclusion
69
Deep Learning based RS
MLP: easily model the nonlinear interactions between users and items;
CNN: are capable of extracting local and global representations from
heterogeneous data sources such as textual and visual information;
RNN: enable the recommender systems to model the temporal
dynamics and sequential evolution of content information
70
Deep Learning based Recommender System: A Survey and New Perspectives. ACM Computing Surveys 2019
MLP based RS
where
Video recommendation
➢ Traditional Methods
➢ RS Systems
➢ Conclusion
75
TOP 10 开源的推荐系统
⚫ SVDFeature http://svdfeature.apexlab.org/wiki/Main_Page
⚫ libMF http://www.csie.ntu.edu.tw/~cjlin/libmf/
⚫ libFM http://www.libfm.org/
⚫ Lenskit http://lenskit.grouplens.org/
⚫ GraphLab GraphLab - Collaborative Filtering
⚫ Mahout http://mahout.apache.org/
⚫ Myrrix http://myrrix.com/
⚫ EasyRec http://easyrec.org/
⚫ Waffles http://waffles.sourceforge.net/
⚫ RapidMiner http://rapidminer.com/
76
工业界的推荐系统
视频类:
Netflix:很多方法的融合
Hulu:主要是item based CF
Youtube:开始是random walk,后来改为类似item based CF的方法
图书类:
Amazon:好多方法都用了,主要是 item based CF
资讯类:
google news:用了CF和bayesian的方法。
digg:算法是 热门度+topic driven user based CF,
音乐类:
last.fm:用的是CF。
yahoo music:参考Koren的论文。
pandora:音乐基因项目,主要依赖专家标注。
社交类
facebook:算法叫Edgerank。
twitter:主要场景是推荐其它用户,参考官方介绍。
77
Widely used data
⚫ Movie
MovieLens http://grouplens.org/datasets/movielens/
Netflix https://www.netflix.com/cn/
⚫ Book
Amazon books http://www.amazon.com/b/ref=usbk_surl_books/?node=283155
Book-Crossing http://grouplens.org/datasets/book-crossing/
⚫ Music
Last.fm http://www.last.fm/
⚫ Food
Dianping http://www.dianping.com/
⚫ Else…
Epinion http://www.datatang.com/data/11849
78
Index
➢ What is recommender system?
➢ Traditional Methods
➢ RS Systems
➢ Conclusion
79
Challenging Problems
⚫ Data sparsity:
➢ Netflix Dataset: nearly 48,000 users and 1,700 items, only 1% observations
⚫ Curse of dimensionality
➢ Users’ features can be represented as many ways
⚫ Cold start:
➢ Many new users sign in and many new items are added
⚫ Personalization:
➢ Different user has different taste
80
Thanks!
Q&A
81