Icitsi 2014 7048228
Abstract: Collaborative filtering is widely used in recommendation systems. It provides recommendations to a user based on similarity values between users. However, its central issues are the new-user problem and sparsity. This paper discusses how to use matrix factorization and nearest-neighbour methods in film recommendation systems. Both methods are used together in order to make more accurate recommendations. Based on the experimental results, the combination of matrix factorization and classical collaborative filtering (nearest-neighbour) could improve the prediction accuracy. It can be concluded that the combination of matrix factorization and nearest-neighbour produced a better prediction accuracy.

Keywords: Recommendation Systems; Matrix Factorization; Collaborative Filtering.

I. INTRODUCTION

Recommendation systems have been growing rapidly because of the huge amount of information available. We need personalized systems that match us, based on what we have read recently, our latest activities, and how relevant an item is to us. Recommendation systems have become more popular and are now used in various fields.

Collaborative filtering is the most widely used method in recommendation systems today. It gives recommendations to a user based on that user's similarity to others. Matrix factorization has also become well known since the Netflix Grand Prize [1].

Generally, recommendation systems can be divided into two classes: content-based filtering and collaborative filtering. Content-based filtering relies on the similarity of items to objects that the user liked in the past [2]. Collaborative filtering, meanwhile, evaluates items using information about the behaviour of the user or of other users [3]. Collaborative filtering methods are grouped into memory-based and model-based approaches.

Some of the central issues in collaborative filtering are sparsity, content analysis, overspecialization and the new-user issue (cold-start problem) [4]. Hybrid methods are used to address these problems. A hybrid is a combination of several methods. Such a combination was also key to the BellKor team winning the Netflix Prize in 2009 [5].

According to [6], matrix factorization can be used to increase the prediction accuracy on scalable datasets. In addition, matrix factorization can be used to reduce the dimension of the item space and retrieve latent relations between items of the dataset.

For the experiments, the MovieLens HetRec 2011 rating data are used in this paper. The rating data are used for computation in the recommendation and prediction algorithms. The combination of matrix factorization and the classic collaborative filtering (nearest-neighbour) technique is used to improve the prediction accuracy.

Section III discusses how the dataset is used. In Section IV, we describe the mathematical approach of our recommender. Prediction accuracy is evaluated in Section V. Finally, the conclusions are given in Section VI.

II. RELATED WORK

In this section we briefly review the main works in this context. The list of references is not exhaustive due to the page limit. Our main focus is collaborative filtering, which has been successfully applied to several real-world problems, such as Netflix's movie recommendation.

A. Content-based Filtering

The content-based method is one of the oldest and most popular methods in recommendation. Its principle is to recommend objects that are similar to other objects the user preferred in the past [2]. The similarity between objects is determined from the values of the objects' characteristics.

Fig. 1 shows an example of content-based filtering. Known relationships are marked by full arrows, calculated similarities between objects are marked with an arrow point, and predicted relationships are marked by dotted arrows.
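As a rough illustration of the content-based principle reviewed in Section II.A (similarity determined from the values of the objects' characteristics), the sketch below ranks films by cosine similarity of their feature vectors. The film names, the binary genre features and the choice of cosine similarity are assumptions made for this example only; the paper does not state which similarity measure or features are used.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Hypothetical binary genre features: [action, comedy, drama, sci-fi]
films = {
    "film_a": np.array([1, 0, 0, 1]),
    "film_b": np.array([1, 0, 0, 0]),
    "film_c": np.array([0, 1, 1, 0]),
}

def recommend_similar(liked, catalogue):
    """Rank every other film by its similarity to a film the user liked."""
    scores = {
        name: cosine_similarity(catalogue[liked], feats)
        for name, feats in catalogue.items()
        if name != liked
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

ranking = recommend_similar("film_a", films)
```

Here `film_b` shares a genre with `film_a` and is ranked first, while `film_c` shares none and receives a similarity of zero.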
III. METHODOLOGY

Several things need to be done before recommendations can be made. First, the dataset has to be extracted so that recommendations can be processed. This section explains how the data are used, the hybrid recommendation technique and the evaluation methods.

Fig. 2: Collaborative filtering [4]
between matrix factorization and the combination of matrix factorization and nearest-neighbour (NNH). Combining two or more techniques is called a hybrid [11]. The objective of the hybrid is to increase the recommendation accuracy and reduce the error. Fig. 2 shows the architecture of the hybrid.
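One possible reading of such a hybrid is sketched below: item representations are taken from a rank-k SVD, and a rating is then predicted with the similarity-weighted average over the user's most similar rated items, as in the neighbourhood rule given in Section IV. Several choices here are assumptions of this sketch, not details stated in the paper: missing ratings are zero-filled before the SVD, similarity is the cosine between latent item vectors, negative similarities are clipped to zero, and the toy matrix `R` is invented.

```python
import numpy as np

def item_factors(R, k):
    """Rank-k item representations from the SVD of a users x items rating matrix."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return (np.diag(s[:k]) @ Vt[:k, :]).T        # one k-dimensional row per item

def cosine_matrix(F):
    """Pairwise cosine similarity between the rows of F."""
    norms = np.linalg.norm(F, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                      # avoid division by zero
    G = F / norms
    return G @ G.T

def predict(R, S, u, i, k_neighbours):
    """Similarity-weighted average of user u's ratings on the items most similar to i."""
    rated = np.flatnonzero(R[u] > 0)             # 0 marks "no rating"
    rated = rated[rated != i]
    if rated.size == 0:
        return 0.0
    order = np.argsort(S[i, rated])[::-1][:k_neighbours]
    top = rated[order]
    weights = np.clip(S[i, top], 0.0, None)      # sketch choice: ignore negative similarity
    denom = weights.sum()
    return float(weights @ R[u, top] / denom) if denom > 0 else 0.0

# Invented toy data: users 0-1 prefer items 0-1, users 2-3 prefer items 2-3.
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

S = cosine_matrix(item_factors(R, k=2))
pred = predict(R, S, u=0, i=2, k_neighbours=2)
```

For user 0 and the unrated item 2, the most similar rated item is item 3, which the user rated low, so the prediction is pulled towards the low end of the scale.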
IV. RECOMMENDER SYSTEM

This section describes the proposed method step by step in more detail: the recommendation technique, data normalization, matrix factorization and its combination with classic collaborative filtering.

A. Normalization

The rating data from MovieLens range from 1 to 5 (0 if there is no rating). Normalization is used to remove bias. One common method of normalization rescales the values of each feature to the range from 0 to 1.

In order to increase the prediction accuracy, the “global effect” has to be removed. Generally, a combination of the overall, user and item average ratings ($\bar{r}$, $\bar{r}_u$ and $\bar{r}_i$) is subtracted from the original rating $r_{ui}$ to remove the popularity effect [13]:

$$r'_{ui} = r_{ui} - \alpha\,\bar{r} - \beta\,\bar{r}_u - \gamma\,\bar{r}_i$$

After the data are normalized, matrix factorization is performed to obtain predictions and recommendations.

B. Matrix Factorization

In many cases, matrix factorization is used to reduce the dimension of the item space and retrieve latent relations between items of the dataset [12], [14]. As demonstrated in the Netflix Grand Prize, matrix factorization models were superior to the classic nearest-neighbour technique in recommendation systems.

Recommendation systems work with data that are large and very sparse, which makes processing slow and predictions inaccurate. A special technique is therefore needed to reduce the sparsity of the data and speed up the computation; this can be done by matrix factorization [12]. Many methods can be used to factorize the matrix, such as Singular Value Decomposition (SVD), Non-negative Matrix Factorization (NMF), and others. The SVD approach is used in this paper. It factorizes the rating matrix into three low-dimensional matrices: the left-singular vectors (V), the singular values (S) and the right-singular vectors (W):

$$H = V \cdot S \cdot W^{T}$$

Several values of k were used in the experiments, and the results were examined for the subsequent experiments.

C. Predictions

The predicted values are compared with the original ratings in order to evaluate how accurately the algorithm works. To predict the rating $r_{ui}$, the SVD class reconstructs the original matrix

$$M' = U \Sigma V^{T}$$

and the rating prediction is the corresponding entry of $M'$.

The predicted ratings approximate the original rating values closely, and the reconstruction also delivers predictions for unknown or missing values. The neighbourhood algorithm uses the ratings of similar users (or items) to predict the values of the input matrix. The only difference from plain SVD is the way it computes the predictions. To compute a prediction, it uses this equation:

$$\mathrm{rating}(u, i) = \frac{\sum_{j \in S^{k}(i;u)} S_{ij}\, r_{uj}}{\sum_{j \in S^{k}(i;u)} S_{ij}}$$

where $S^{k}(i;u)$ is the set of the k items rated by u that are most similar to i, and $S_{ij}$ is the similarity between i and j.

D. Nearest Neighbour

The other method used in this paper is the nearest-neighbour approach. Nearest neighbour is a classical collaborative filtering technique that is still used in recommendation systems. This technique predicts the rating values, and its results are compared with those of matrix factorization. It works by finding similarities between films or between users. The experiments carried out in Section V.B show that the combination of matrix factorization and nearest neighbour can improve the prediction accuracy of the recommendation system.

V. EXPERIMENTS

In this section we carry out tests on the dataset mentioned above. The tests were performed several times with different values of k to determine the accuracy of the results of the methods described in Section IV.

A. MF Experiment

The most difficult part of obtaining better accuracy is determining the optimal value of k as a parameter of the computation. Therefore, multiple tests iterating over the value of k were performed.

As illustrated in Fig. 4, the prediction accuracy depends strongly on the value of k.

[Figure: MAE and RMSE plotted against K (5 to 50); vertical axis "Results", 0.5 to 0.85]
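The sweep over k in the MF experiment could be sketched roughly as follows: remove the global effects using the normalization of Section IV.A, reconstruct the residual matrix with a rank-k SVD, add the baseline back, and score held-out ratings with MAE and RMSE. Everything concrete here is an assumption of this sketch rather than the paper's setup: the ratings are random synthetic data, the train/test split is invented, unknown cells are zero-filled after normalization, and the weights `alpha`, `beta`, `gamma` are placeholder values because the paper does not state them.

```python
import numpy as np

def remove_global_effects(R, mask, alpha, beta, gamma):
    """Residuals r'_ui = r_ui - alpha*mean - beta*mean_u - gamma*mean_i, plus the baseline."""
    counts_u = np.maximum(mask.sum(axis=1, keepdims=True), 1)
    counts_i = np.maximum(mask.sum(axis=0, keepdims=True), 1)
    mean_all = R[mask].mean()
    mean_u = (R * mask).sum(axis=1, keepdims=True) / counts_u
    mean_i = (R * mask).sum(axis=0, keepdims=True) / counts_i
    baseline = alpha * mean_all + beta * mean_u + gamma * mean_i  # broadcasts to users x items
    residual = np.where(mask, R - baseline, 0.0)                  # unknown cells carry no signal
    return residual, baseline

def rank_k_reconstruction(X, k):
    """Best rank-k approximation of X from its truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def mae(pred, true):
    return float(np.mean(np.abs(pred - true)))

def rmse(pred, true):
    return float(np.sqrt(np.mean((pred - true) ** 2)))

# Synthetic stand-in for the rating data (values 1..5), with an invented split.
rng = np.random.default_rng(0)
R = rng.integers(1, 6, size=(30, 20)).astype(float)
known = rng.random(R.shape) < 0.6
test = known & (rng.random(R.shape) < 0.2)
train = known & ~test

results = {}
for k in (2, 5, 10):
    residual, baseline = remove_global_effects(R, train, alpha=0.5, beta=0.25, gamma=0.25)
    pred = rank_k_reconstruction(residual, k) + baseline
    results[k] = (mae(pred[test], R[test]), rmse(pred[test], R[test]))
```

Each entry of `results` maps a value of k to its (MAE, RMSE) pair on the held-out ratings. Because the data are random, the numbers carry no meaning beyond showing the mechanics of the sweep.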
From Fig. 6, it can be seen that matrix factorization yields good average prediction accuracy. Among the tested values of k, the best RMSE and MAE were 0.78 and 0.59, obtained with k = 20 and k = 35, respectively. Metric values closer to 0 are better.

B. MF + CF Experiment

After the results of the experiments with the matrix factorization method were obtained, the next experiments were performed using the combination of matrix factorization and collaborative filtering.

This requires the neighbourhood algorithm, which uses the similarity values of user ratings to predict the values of the input matrix.

The prediction accuracy obtained with MF + CF for different values of k is illustrated in Fig. 5.

[Fig. 5: plot of "Results" for MF + CF against k]

factorization method, the smallest RMSE value was 0.789649, at k = 35, while for the MF + CF method, the smallest RMSE value was 0.788649, at k = 30.

VI. CONCLUSIONS

The combination of MF and CF can indeed be used to make predictions in recommendation systems. However, it does not give a large improvement compared with MF alone: in the experiments we have done, the difference was only 0.001.

Of course, many other parameters can determine the accuracy of predictions, such as the number of datasets, content features, normalization, and others. We hope this paper can enrich the knowledge in the field of recommendation systems and can be used in the future for further experiments to improve the prediction accuracy.

ACKNOWLEDGMENT

For the completion of this experiment, thanks to Mr. Bens
[8] Greg Linden, Brent Smith, and Jeremy York, "Amazon.com Recommendation: Item-To-Item Collaborative Filtering," 2003.
[9] Paolo Cremonesi, Roberto Turrin, and Fabio Airoldi, "Hybrid algorithms for recommending new items," in Proceedings of the 2nd International Workshop on Information Heterogeneity and Fusion in Recommender Systems, New York, 2011, pp. 33-40.
[10] G Takács, I Pilászy, B Németh, and Domonkos Tikk, "Investigation of Various Matrix Factorization Methods for Large Recommender Systems," in IEEE International Conference on Data Mining Workshops, Pisa, 2008, pp. 553-562.
[11] Robin Burke, "Hybrid Web Recommender Systems," in Vol. 4321 of Lecture Notes in Computer Science. Berlin Heidelberg: Springer-Verlag, 2007, p. 377.
[12] Yehuda Koren, Robert Bell, and Chris Volinsky, "Matrix Factorization Techniques for Recommender Systems," in IEEE Computer Society, 2009, pp. 42-49.
[13] Upendra Shardanand and Pattie Maes, "Social information filtering: algorithms for automating “word of mouth”," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, New York, 1995, pp. 210-217.
[14] Xiaoyuan Su and Taghi M Khoshgoftaar, "A Survey of Collaborative Filtering Techniques," Journal Advances in Artificial Intelligence, pp. 1-19, 2009.