A Data Science Central Community

*Originally posted here.*

In our day-to-day life we come across many recommendation engines: Facebook's engine for suggesting friends and similar pages to like, or YouTube's engine for suggesting videos similar to our previous searches and preferences. In today's blog post I will explain how to build a basic recommender system. Collaborative filtering, the approach used here, comes in two common variants:

- User-based collaborative filtering
- Item-based collaborative filtering

In this post I will explain user-based collaborative filtering. This algorithm works by searching a large group of people and finding a smaller set whose tastes are similar to yours. It then looks at the other things those people like and combines them to create a ranked list of suggestions.

This involves two steps:

- Calculating a similarity score between users
- Recommending items to a user based on those similarity scores

Consider a data sample of movie critics and their movie ratings. The objective is to recommend to a critic the movies they have not rated, based on the ratings of similar critics.

#### Step 1: Calculate the similarity scores for Chan

Computing a similarity score for each pair of users helps us identify users with similar tastes. We use a cosine-based similarity function to calculate the similarity between users. In R, a cosine function is readily available:

    library(lsa)  # assumption: cosine() here is the one provided by the lsa package
    user_sim = cosine(as.matrix(t(x)))
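To make the similarity step concrete, here is a minimal base-R sketch of cosine similarity between users. The `ratings` matrix, the user and movie names, and the helper `cosine_sim` are all made up for illustration and are not the article's data or function; the sketch also assumes there are no missing ratings.

```r
# Hypothetical toy ratings: rows are users, columns are movies (not the article's data)
ratings <- matrix(
  c(5, 3, 4,
    4, 2, 5,
    1, 5, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("userA", "userB", "userC"), c("m1", "m2", "m3"))
)

# Cosine similarity between every pair of rows:
# sim(a, b) = sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
cosine_sim <- function(m) {
  dot <- m %*% t(m)          # pairwise dot products between users
  norms <- sqrt(diag(dot))   # Euclidean norm of each user's rating vector
  dot / (norms %o% norms)    # normalise by the product of the two norms
}

toy_sim <- cosine_sim(ratings)
print(round(toy_sim, 3))
```

Each user has similarity 1 with themselves, and users whose ratings point in a similar direction (here userA and userB) score closer to 1 than users with opposite tastes.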

To recommend movies for Chan using the above similarity matrix, we first need to fill in the NA entries for the movies he has not rated. As a first step, separate out the movies Chan has not rated. Then create a weighted matrix (weight_mat in the code) by multiplying each user's similarity to Chan (user_sim[,7]) by the ratings that user gave.
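The weighting step can be sketched on made-up numbers as follows. The critics, movies, ratings and similarity values below are hypothetical and only illustrate the mechanics; they are not taken from the article's data set.

```r
# Hypothetical ratings by three other critics for two movies the target user has not rated
other_ratings <- matrix(
  c(3.0, 2.5,
    NA,  3.5,
    4.0, NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("critic1", "critic2", "critic3"), c("movieX", "movieY"))
)

# Hypothetical similarity of each critic to the target user
sim_to_target <- c(critic1 = 0.9, critic2 = 0.5, critic3 = 0.8)

# Multiplying a matrix by a vector of length nrow() scales each row by its similarity;
# missing ratings stay NA
toy_weight_mat <- other_ratings * sim_to_target
print(toy_weight_mat)
```

Ratings from critics who resemble the target user contribute more to the weighted matrix, while missing ratings remain NA so they can be skipped later.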

The next step is to sum each column of the weight matrix and then divide that sum by the sum of the similarities of the critics who reviewed that movie. The result is an estimate of the rating the user would give the movie.
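Written as a formula, the calculation is a similarity-weighted average. The notation is introduced here for illustration and does not appear in the original post: $u$ is the target user, $i$ a movie that $u$ has not rated, $N_i$ the set of critics who rated movie $i$, $s_{u,v}$ the similarity between users $u$ and $v$, and $r_{v,i}$ the rating critic $v$ gave movie $i$.

$$
\hat{r}_{u,i} = \frac{\sum_{v \in N_i} s_{u,v} \, r_{v,i}}{\sum_{v \in N_i} s_{u,v}}
$$

The numerator is the column sum of the weight matrix for movie $i$, and the denominator is the sum of similarities over only those critics who rated it.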

The procedure described above is implemented in the following R function:

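    rec_itm_for_user = function(userNo)
    {
      # Ratings of the target user (columns 2:7 of critics hold the movies)
      rat_user = critics[userNo, 2:7]
      col_sums = list()  # weighted rating sum for each unrated movie
      tot = list()       # similarity sum for each unrated movie
      x = 1
      z = 1
      for (i in 1:ncol(rat_user)) {
        if (is.na(rat_user[1, i])) {
          # Column-wise sum of the weighted ratings, ignoring missing ratings
          col_sums[x] = sum(weight_mat[, i], na.rm = TRUE)
          x = x + 1
          # Sum the similarities of only those critics who rated movie i
          temp = as.data.frame(weight_mat[, i])
          sum_temp = 0
          for (j in 1:nrow(temp)) {
            if (!is.na(temp[j, 1])) {
              # weight_mat must have been built from the same column of user_sim
              sum_temp = sum_temp + user_sim[j, userNo]
            }
          }
          tot[z] = sum_temp
          z = z + 1
        }
      }
      # Fill each unrated movie with its predicted rating
      z = 1
      for (i in 1:ncol(rat_user)) {
        if (is.na(rat_user[1, i])) {
          rat_user[1, i] = col_sums[[z]] / tot[[z]]
          z = z + 1
        }
      }
      return(rat_user)
    }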

Calling rec_itm_for_user for Chan (user 7) gives the following results:

    rec_itm_for_user(7)
    Titanic Batman Inception Superman.Returns spiderMan   Matrix
      2.811    4.5  2.355783                4         1 3.481427

The movies recommended for Chan, in order of predicted rating, are: Matrix (3.48), Titanic (2.81), Inception (2.36).
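For comparison, the predict-then-rank procedure can be sketched more compactly and end to end on made-up data. The helper `predict_unrated`, the matrices `toy_ratings` and `toy_user_sim`, and all values in them are hypothetical; this is one possible vectorised formulation, not the author's implementation.

```r
# Hypothetical helper (not from the original post): predict a user's missing ratings
# as the similarity-weighted average of the ratings given by other users
predict_unrated <- function(ratings, sim, user) {
  w <- sim[, user]                      # similarity of every user to the target
  preds <- ratings[user, ]              # start from the target's known ratings
  for (i in which(is.na(preds))) {      # loop over the movies the target has not rated
    rated_by <- !is.na(ratings[, i])    # users who did rate movie i
    preds[i] <- sum(w[rated_by] * ratings[rated_by, i]) / sum(w[rated_by])
  }
  preds
}

# Made-up ratings (rows: users, columns: movies) and made-up similarity scores
toy_ratings <- matrix(
  c(4, 3, 5,
    2, 4, NA,
    4, NA, NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("u1", "u2", "target"), c("m1", "m2", "m3"))
)
toy_user_sim <- matrix(
  c(1.0, 0.3, 0.8,
    0.3, 1.0, 0.4,
    0.8, 0.4, 1.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(rownames(toy_ratings), rownames(toy_ratings))
)

preds <- predict_unrated(toy_ratings, toy_user_sim, "target")

# Rank only the movies the target had not rated, best predicted rating first
unrated <- is.na(toy_ratings["target", ])
ranked <- sort(preds[unrated], decreasing = TRUE)
print(ranked)
```

Passing the similarity matrix and the user as arguments avoids relying on a globally defined weight matrix built for one particular user.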
