Testing recommender systems in R

Recommender systems are pervasive. You have encountered them while buying a book on Barnes & Noble, renting a movie on Netflix, listening to music on Pandora, or finding a bar to visit (Foursquare). Saar, writing for Revolution Analytics, has demonstrated how to get started with some of these techniques in R here.

We will build some using Michael Hahsler’s excellent package, recommenderlab. But before we build anything, we have to learn to recognize when a recommender is good. For that reason, let’s quickly go over some metrics:

– RMSE (Root Mean Squared Error): This measures how far the real ratings are from the ones we predicted. Mathematically, we can write it out as

RMSE = \sqrt{\frac{\sum_{(i,j) \in \kappa}(r_{(i,j)} - \hat{r}_{(i,j)})^2}{|\kappa|}}

where \kappa is the set of all user-item pairings (i, j) for which we have a predicted rating \hat r_{(i,j)}  and a known rating r_{(i,j)}  which was not used to learn the recommendation model.
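As a minimal base-R sketch of this formula (the actual and predicted vectors below are made-up placeholders standing in for the pairs in \kappa):

```r
# RMSE over the pairs in kappa, given as two aligned vectors
rmse <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

actual    <- c(4, 3, 5, 2)    # known held-out ratings r_(i,j)
predicted <- c(3.5, 3, 4, 3)  # the model's predictions r-hat_(i,j)
rmse(actual, predicted)       # 0.75
```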

Here at sane.a.lytics, I will talk about when an analysis makes sense and when it doesn’t. RMSE is a great metric if you are measuring how good your predicted ratings are. But if you want to know how many people clicked on your recommendation, I have a different metric for you.

– Precision/Recall/F-measure/AUC: Precision tells us how good the predictions are. In other words, of the items we recommended, how many were hits.

Recall tells us how many of the possible hits were accounted for, i.e. the coverage of the desirable outcome.

Precision and recall usually have an inverse relationship. This becomes an even bigger issue for rare phenomena like recommendations, where most users interact with only a tiny fraction of items. To tackle this problem, we will use the F-measure, which is nothing but the harmonic mean of precision and recall.
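These three quantities are easy to write down in base R. The counts below are toy numbers, purely for illustration:

```r
# Precision, recall and F-measure from top-N recommendation hits
precision <- function(hits, n_recommended) hits / n_recommended
recall    <- function(hits, n_relevant)    hits / n_relevant
f_measure <- function(p, r) 2 * p * r / (p + r)  # harmonic mean

p <- precision(hits = 3, n_recommended = 10)  # 0.3
r <- recall(hits = 3, n_relevant = 5)         # 0.6
f_measure(p, r)                               # 0.4
```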

Another popular measure is AUC (area under the ROC curve). It is roughly analogous, and we will use it below for our comparisons of recommendation effectiveness.

– ARHR (Average Reciprocal Hit Rank): Karypis likes this metric.

ARHR = \frac{1}{\#users} \sum_{i=1}^{\#hits} \frac{1}{p_i}

where p_i is the position of the i-th hit in the ranked recommendation list.
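A one-line base-R sketch of this: each hit contributes the reciprocal of its rank, misses contribute nothing, and we divide by the number of users (toy numbers, illustrative only):

```r
# ARHR: sum of 1/position over all hits, averaged over all users
arhr <- function(hit_positions, n_users) {
  sum(1 / hit_positions) / n_users
}

# 3 users got a hit, at ranks 1, 2 and 4, out of 10 users total
arhr(c(1, 2, 4), n_users = 10)  # (1 + 0.5 + 0.25) / 10 = 0.175
```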

OK, on to the fun stuff.

There are a few different ways to build a recommender system:

Collaborative Filtering: If my friend Jimmy tells me that he liked the movie “Drive”, I might like it too since we have similar tastes. However, if Paula tells me she liked “The Notebook”, I might avoid it. This is called UBCF (User-Based Collaborative Filtering). Another way to think about it is as soft clustering: we find users with similar tastes (a neighbourhood) and use their preferences to predict yours.

Another flavour of this is IBCF (Item-Based Collaborative Filtering). If I watched “Darjeeling Limited”, I might be inclined to watch “The Royal Tenenbaums” but not necessarily “Die Hard”. This is because the first two are more similar in the users who have watched/rated them. This is rather simple to compute, as all we need is the similarity between products to find out which ones are close.
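The item-similarity idea can be sketched in base R with cosine similarity on a toy users-by-movies watch matrix (the matrix and movie names are made up for illustration):

```r
# Toy users x movies matrix: 1 = watched, 0 = not watched
m <- matrix(c(1, 1, 0,
              1, 1, 0,
              0, 0, 1,
              1, 1, 1),
            nrow = 4, byrow = TRUE,
            dimnames = list(NULL, c("Darjeeling", "Tenenbaums", "DieHard")))

# Cosine similarity between two item columns
cosine <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

cosine(m[, "Darjeeling"], m[, "Tenenbaums"])  # 1: watched by the same users
cosine(m[, "Darjeeling"], m[, "DieHard"])     # much lower overlap
```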

Let’s compare both approaches on some real data (thanks, R):

# Load required libraries
library(recommenderlab) # package being evaluated
library(ggplot2) # for plots
# Load the data we are going to work with
data(MovieLense)
MovieLense
# 943 x 1664 rating matrix of class 'realRatingMatrix' with 99392 ratings.
# Visualizing a sample of this
image(sample(MovieLense, 500), main = "Raw ratings")

Distribution of ratings sample

# Visualizing ratings
qplot(getRatings(MovieLense), binwidth = 1,
main = "Histogram of ratings", xlab = "Rating")
summary(getRatings(MovieLense)) # Skewed towards higher ratings
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# 1.00 3.00 4.00 3.53 4.00 5.00

Histogram of ratings

# How about after normalization?
qplot(getRatings(normalize(MovieLense, method = "Z-score")),
main = "Histogram of normalized ratings", xlab = "Rating")
summary(getRatings(normalize(MovieLense, method = "Z-score"))) # seems better
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# -4.8520 -0.6466 0.1084 0.0000 0.7506 4.1280

Rating histogram, normalized
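Under the hood, Z-score normalization centers and scales each user’s row of observed ratings. A minimal base-R sketch of the idea, on a toy ratings vector where NA marks an unrated movie:

```r
# Per-user Z-score: subtract the user's mean rating and divide by their
# standard deviation, using only the ratings that user actually gave
z_norm <- function(user_ratings) {
  (user_ratings - mean(user_ratings, na.rm = TRUE)) /
    sd(user_ratings, na.rm = TRUE)
}

u <- c(5, 3, NA, 4)  # one user's ratings; NA = not rated
z_norm(u)            # 1 -1 NA 0
```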

# How many movies did people rate on average?
qplot(rowCounts(MovieLense), binwidth = 10,
main = "Movies rated per user",
xlab = "# of movies rated",
ylab = "# of users")
# Seems people get tired of rating movies at a logarithmic pace. But most rate some.

Movies rated

# What is the mean rating of each movie
qplot(colMeans(MovieLense), binwidth = .1,
main = "Mean rating of Movies",
xlab = "Rating",
ylab = "# of movies")
# The big spike on 1 suggests that this could also be interpreted as binary:
# in other words, some people don't want to see certain movies at all.
# Same on 5 and on 3.
# We will give it the binary treatment later

avg movie rating

recommenderRegistry$get_entries(dataType = "realRatingMatrix")
# We have a few options
# Let's check some algorithms against each other
scheme <- evaluationScheme(MovieLense, method = "split", train = .9,
k = 1, given = 10, goodRating = 4)
algorithms <- list(
"random items" = list(name="RANDOM", param=list(normalize = "Z-score")),
"popular items" = list(name="POPULAR", param=list(normalize = "Z-score")),
"user-based CF" = list(name="UBCF", param=list(normalize = "Z-score",
nn=50, minRating=3)),
"item-based CF" = list(name="IBCF", param=list(normalize = "Z-score"))
)
# Run algorithms, predicting the next n movies
results <- evaluate(scheme, algorithms, n=c(1, 3, 5, 10, 15, 20))
# Draw ROC curve
plot(results, annotate = 1:4, legend="topleft")
# See precision / recall
plot(results, "prec/rec", annotate=3)



It seems like UBCF did better than IBCF. Then why would you use IBCF? The answer lies in when and how you generate recommendations. UBCF saves the whole rating matrix and generates recommendations at predict time by finding the closest users. IBCF saves only the k closest items for each item and doesn’t have to store everything else. The similarities are pre-calculated, and predict simply reads off the closest items.
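To see that storage tradeoff concretely, here is a base-R sketch (toy similarity matrix, purely illustrative) of the IBCF-style pruning step that keeps just each item’s k most similar neighbours:

```r
set.seed(42)
# Toy item-item similarity matrix; diagonal (self-similarity) is ignored
sim <- matrix(runif(25), nrow = 5)
diag(sim) <- NA

# Keep only each item's k nearest neighbours; predict() can then
# read recommendations off this small pre-computed table
k <- 2
top_k <- apply(sim, 1, function(row) order(row, decreasing = TRUE)[seq_len(k)])
dim(top_k)  # 2 x 5: only k neighbour indices stored per item
```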

Predictably, RANDOM is the worst, but perhaps surprisingly, it’s hard to beat POPULAR. I guess we are not so different, you and I.

In the next post I will go over some other algorithms that are out there and how to use them in R. I would also recommend reading Michael’s documentation on recommenderlab for more details.

Also added this to r-bloggers. Please check it out for more R goodies.


9 thoughts on “Testing recommender systems in R”

  1. thai says:

    results <- evaluate(scheme, algorithms, n=c(1, 3, 5, 10, 15, 20))
    RANDOM run
    1 [0sec/1.64sec] POPULAR run
    1 [0.18sec/0.36sec] UBCF run
    1 [0.16sec/5.99sec] IBCF2 run
    1 Timing stopped at: 0 0 0
    Error in .local(data, …) :
    Recommender method IBCF2 not implemented for data type realRatingMatrix .

    Please help me fix this error. Thanks!

  2. Sorry about the delayed response. I have submitted a fix to the recommenderlab codebase and IBCF’s bug has been fixed. Please go ahead and use IBCF instead of IBCF2.

  3. Christos says:

    Hi and thanks for your code contribution. I am trying to get recommenderlabrats but I get an error:

    Downloading GitHub repo sanealytics/recommenderlabrats@master
    Installing recommenderlabrats
    '/Library/Frameworks/R.framework/Resources/bin/R' --no-site-file --no-environ --no-save --no-restore \
    '/private/var/folders/57/xggd_p_s1g70kf37yp67b6t80000gn/T/Rtmp8bTtut/devtools1058c7c56f105/sanealytics-recommenderlabrats-2d2b52c' \
    --library='/Library/Frameworks/R.framework/Versions/3.2/Resources/library' --install-tests

    * installing *source* package ‘recommenderlabrats’ …
    ** libs
    clang++ -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG -I/usr/local/include -I/usr/local/include/freetype2 -I/opt/X11/include -I”/Library/Frameworks/R.framework/Versions/3.2/Resources/library/Rcpp/include” -I”/Library/Frameworks/R.framework/Versions/3.2/Resources/library/RcppArmadillo/include” -fPIC -Wall -mtune=core2 -g -O2 -c RcppExports.cpp -o RcppExports.o
    clang++ -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG -I/usr/local/include -I/usr/local/include/freetype2 -I/opt/X11/include -I”/Library/Frameworks/R.framework/Versions/3.2/Resources/library/Rcpp/include” -I”/Library/Frameworks/R.framework/Versions/3.2/Resources/library/RcppArmadillo/include” -fPIC -Wall -mtune=core2 -g -O2 -c als.cpp -o als.o
    als.cpp:2:10: fatal error: ‘omp.h’ file not found
    1 error generated.
    make: *** [als.o] Error 1
    ERROR: compilation failed for package ‘recommenderlabrats’
    * removing ‘/Library/Frameworks/R.framework/Versions/3.2/Resources/library/recommenderlabrats’
    Error: Command failed (1)

    Do you have any ideas on that?



    1. Thank you for using the package.

      This error happened because I had a hard dependency on OpenMP BLAS. I just changed it so that you can install it on a single-core machine too.
      Sorry I couldn’t test it because my machine has OpenMP. Can you let me know if it works for you now?

  4. Filip says:

    hey, could you give me a hint what the parameter given exactly does. as I see, with user less item ratings than given the evaluation fails

    secondly… the UBCF and IBCF seems to be extremely slow

    my data set is like 50000 x 3000 big

  5. Subhasree Chatterje says:

    My Rstudio hangs whenever I try to run IBRS. Has anyone else faced similar problem? Is there a way to get rid of this?
