Blog posts on Data Science, Machine Learning, Data Mining, Artificial Intelligence, Spark Machine Learning

Sunday, May 25, 2014

Basic recommendation engine using R

In our day-to-day life, we come across many recommendation engines: Facebook's engine suggests friends and similar pages to like, and YouTube's engine suggests videos similar to our previous searches and preferences. In today's blog post I will explain how to build a basic recommender system.


Types of Collaborative Filtering:

  1. User based Collaborative Filtering
  2. Item based Collaborative Filtering
In this post I will explain User based Collaborative Filtering. This algorithm usually works by searching a large group of people and finding a smaller set with tastes similar to yours. It looks at other things they like and combines them to create a ranked list of suggestions.

Implementing User Based Collaborative Filtering:
This involves two steps:
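  1. Calculate a similarity score between users
  2. Recommend items to users based on the user similarity score
Consider a sample data set of movie critics and their movie ratings. The objective is to predict ratings for the movies a user has not rated, based on similar users. The code in this post reads the data from a data frame named critics, where columns 2 to 7 hold the six movie ratings and Chan is row 7.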

Step 1 - Calculate Similarity Score for Chan:

Computing a similarity score between people helps us identify similar users. We use a cosine based similarity function to calculate the similarity between users. Know more about cosine similarity here. R has a cosine function readily available (for example, in the lsa package):
user_sim = cosine(as.matrix(t(x)))
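To make the similarity step concrete, here is a minimal base R sketch of cosine similarity between users. The matrix toy_ratings and the helper cosine_sim are made up for illustration and are not the critics data from this post. Treating a missing rating as 0 is an assumption; other ways of handling missing values are possible.

```r
# Hypothetical toy ratings: rows are users, columns are movies (not the post's data)
toy_ratings <- matrix(
  c(5, 3, 4,
    4, 3, 5,
    1, 5, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("userA", "userB", "userC"), c("m1", "m2", "m3"))
)

# Hypothetical helper: pairwise cosine similarity between the rows of a matrix
cosine_sim <- function(m) {
  m[is.na(m)] <- 0             # assumption: a missing rating counts as 0
  dot <- m %*% t(m)            # dot products between every pair of users
  norms <- sqrt(diag(dot))     # Euclidean length of each user's rating vector
  dot / (norms %o% norms)      # cosine = dot product / (length_i * length_j)
}

toy_sim <- cosine_sim(toy_ratings)
print(round(toy_sim, 3))
```

In this toy data userA and userB rate the movies almost alike, so their similarity (0.98) is the highest off-diagonal value. Each user has similarity 1 with themselves.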

Step 2 - Recommending Movies for Chan:

To recommend movies for Chan using the above similarity matrix, we first need to fill in the NA values for the movies he has not rated. As a first step, we pick out the movies Chan has not rated and create a weighted matrix by multiplying the user similarity scores (user_sim[,7]) with the ratings given by the other users.
The next step is to sum each column of the weighted matrix, then divide by the sum of the similarities of the critics who reviewed that movie. The result is an estimate of how the user might rate each movie.
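As a small worked example of this weighted average, the sketch below predicts one rating from made-up numbers. The vectors toy_sim_to_target and toy_movie_ratings are hypothetical and are not taken from the critics data.

```r
# Hypothetical similarities of three critics to the target user (made-up numbers)
toy_sim_to_target <- c(critic1 = 0.9, critic2 = 0.5, critic3 = 0.2)

# Hypothetical ratings for one movie the target user has not rated;
# critic2 has not rated it either
toy_movie_ratings <- c(critic1 = 4, critic2 = NA, critic3 = 2)

# Weight each rating by the critic's similarity to the target user
weighted <- toy_sim_to_target * toy_movie_ratings

# Only critics who actually rated the movie contribute to the normaliser
rated <- !is.na(toy_movie_ratings)

# Predicted rating = sum of weighted ratings / sum of the matching similarities
predicted <- sum(weighted, na.rm = TRUE) / sum(toy_sim_to_target[rated])
print(predicted)
```

Here the weighted sum is 0.9 * 4 + 0.2 * 2 = 4.0 and the similarity sum is 0.9 + 0.2 = 1.1, so the prediction is about 3.64. The estimate sits closer to the rating of critic1, the more similar critic.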
The steps described above are implemented in the following R function:
rec_itm_for_user = function(userNo)
{
  # ratings of the selected user; columns 2:7 of critics hold the six movies
  rat_user = critics[userNo, 2:7]
  for (i in 1:ncol(rat_user)) {
    if (is.na(rat_user[1, i])) {
      # column-wise sum of the similarity-weighted ratings for this movie
      col_sum = sum(weight_mat[, i], na.rm = TRUE)
      # sum of the similarities of the critics who rated this movie;
      # weight_mat was built from user_sim[,7], so the same column is used here
      rated = which(!is.na(weight_mat[, i]))
      tot = sum(user_sim[rated, 7])
      # predicted rating = weighted sum / similarity sum
      rat_user[1, i] = col_sum / tot
    }
  }
  return(rat_user)
}
Calling the above function gives the below results:

rec_itm_for_user(7)
Titanic Batman Inception Superman.Returns spiderMan   Matrix
  2.811    4.5  2.355783                4         1 3.481427
The movies are recommended to Chan in this order: Matrix (3.48), Titanic (2.81), Inception (2.36).
The complete source code is available on GitHub.
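The ranking itself is a one-line sort. The sketch below copies the three predicted ratings from the output above into a named vector (the variable names are illustrative) and orders them from highest to lowest.

```r
# Predicted ratings for the movies Chan had not rated, copied from the output above
predicted <- c(Titanic = 2.811, Inception = 2.355783, Matrix = 3.481427)

# Sort from highest to lowest predicted rating to get the recommendation order
ranked <- sort(predicted, decreasing = TRUE)
print(round(ranked, 2))
```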
