12 December 2022 / RECOMMENDER-SYSTEMS

Recommendation Systems Walkthrough - Popularity Recommendations

This post discusses a different approach to recommending movies: ranking them by popularity.

In this post, we will use the average rating stored in the movies database to build a general-purpose recommendation widget driven by each movie’s popularity.

This widget recommends the same movies to all users; it is not personalized. The underlying idea is that a highly popular movie is likely to be enjoyed by the average viewer.

The approach here is pretty simple: we sort the movies by their pre-calculated average rating, which is collected from different users.
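As a minimal sketch of that sorting step, the snippet below ranks a handful of movies by their stored average. The names and rates are copied from the table shown later in this post; the `top_rated` helper and the dictionary layout are assumptions made for illustration, not part of the actual application.

```python
# A few (name, rate) pairs taken from the movies table shown later in this post.
movies = [
    {"name": "Minions", "rate": 6.4},
    {"name": "Pulp Fiction", "rate": 8.3},
    {"name": "Gone Girl", "rate": 7.9},
    {"name": "The Circle", "rate": 5.4},
]


def top_rated(movies, limit=10):
    """Return up to `limit` movies, highest average rate first (hypothetical helper)."""
    return sorted(movies, key=lambda movie: movie["rate"], reverse=True)[:limit]


for movie in top_rated(movies, limit=3):
    print(movie["name"], movie["rate"])
```

In a real application the same ordering would more likely be done in the database with an ORDER BY clause; the Python version is only meant to make the idea concrete.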

For the sake of demonstration, IMDb has a list of top-rated movies, which are rated by different users. How does IMDb actually calculate these ratings?

The formula for calculating the top rated 250 titles gives a true Bayesian estimate:

weighted rating (WR) = (v ÷ (v+m)) × R + (m ÷ (v+m)) × C where:

R = average for the movie (mean) = (Rating)
v = number of votes for the movie = (votes)
m = minimum votes required to be listed in the Top 250 (currently 25000)
C = the mean vote across the whole report (currently 7.0)

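def compute_weighted_average_rate(r, v, m, c):
    """Return the Bayesian weighted rating (WR) for one movie."""
    # r: movie mean, v: movie votes, m: minimum votes, c: global mean
    wr = (v / (v + m)) * r + (m / (v + m)) * c
    return wr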
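To see how the formula behaves, the sketch below applies it to two movies using the m = 25000 and C = 7.0 values quoted above. The two movies, their averages, and their vote counts are invented for illustration; the function is repeated so the snippet runs on its own.

```python
def compute_weighted_average_rate(r, v, m, c):
    # Same formula as above: blend the movie's own mean r with the global mean c.
    return (v / (v + m)) * r + (m / (v + m)) * c


M = 25000  # minimum votes, value quoted above
C = 7.0    # mean vote across all movies, value quoted above

# Hypothetical movies: a very high average from few votes,
# and a slightly lower average from many votes.
few_votes = compute_weighted_average_rate(9.5, 100, M, C)
many_votes = compute_weighted_average_rate(8.5, 500000, M, C)

print(round(few_votes, 2))
print(round(many_votes, 2))
```

The movie with only 100 votes is pulled almost all the way back to the global mean, while the heavily voted movie keeps most of its own average. That is the point of the weighting: a handful of enthusiastic votes cannot push a title to the top of the list.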

With this formula, we can build a suggestion widget that recommends movies to all users, based either on IMDb popularity data or on our own users’ average ratings, recomputed from time to time.

Of course, this is not the best way to make recommendations on its own, but I think it can be used along with other recommender algorithms.

Assume for a moment that our movies database contains the following table, fetched with this SQL query:

SELECT name, rate, COALESCE(popularity, 0) AS popularity 
  FROM movies_app_movie 
 ORDER BY popularity DESC;

Name	Rate	Popularity
Minions	6.4	547.4882980000001
Wonder Woman	7.2	294.337037
Beauty and the Beast	6.8	287.253654
Baby Driver	7.2	228.032744
Big Hero 6	7.8	213.84990699999997
Deadpool	7.4	187.860492
Guardians of the Galaxy Vol. 2	7.6	185.33099199999998
Avatar	7.2	185.070892
John Wick	7.0	183.870374
Gone Girl	7.9	154.80100900000002
The Hunger Games: Mockingjay - Part 1	6.6	147.098006
War for the Planet of the Apes	6.7	146.161786
Captain America: Civil War	7.1	145.882135
Pulp Fiction	8.3	140.95023600000002
Pirates of the Caribbean: Dead Men Tell No Tales	6.6	133.82782
The Dark Knight	8.3	123.167259
Blade Runner	7.9	96.272374
The Avengers	7.4	89.887648
Captain Underpants: The First Epic Movie	6.5	88.561239
The Circle	5.4	88.439243

The table above lists the movies in descending order of popularity. Next, we will go through how such a value can be computed for every film in our database.

Computing the popularity property is a bit tricky, because it may depend on several pieces of information that are stored in separate tables and have to be JOINed later.

Let’s take a look at a separate table that holds the movie ratings. It needs some aggregation to calculate the rating count and the average rating of every movie.

  SELECT movie_id, 
         COUNT(movie_id), 
         AVG(rate) AS rate_avg 
    FROM movies_app_rating 
GROUP BY movie_id 
ORDER BY rate_avg DESC
   LIMIT 20;

Movie ID	Name	Count	Rate Avg
2284	Mr. Magorium’s Wonder Emporium	1	5
4459	Night Without Sleep	1	5
5473	De Dominee	1	5
2636	The Specialist	1	5
36931	On the Edge	1	5
64278	Interceptor Force 2	1	5
183	The Wizard	1	5
845	Strangers on a Train	1	5
26791	Brigham City	1	5
43267	29th Street	1	5
31413	Innocence	1	5
4201	The Fifth Musketeer	1	5
2984	A Countess from Hong Kong	1	5
4140	Blindsight	1	5
6107	Murder in Three Acts	1	5
1563	Sunless	1	5
65216	Bloody Cartoons	1	5
1933	The Others	1	5
8675	Orgazmo	1	5
2897	Around the World in Eighty Days	1	5
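Every row in the result above has a perfect average built on a single rating, which shows why sorting by the raw average alone is misleading. One way to address this is to apply the weighted rating from earlier to the aggregated rows. The sketch below does that; the first two rows are taken from the table, while the last two rows, the `MIN_VOTES` threshold, the `GLOBAL_MEAN` value, and the assumed 1 to 5 rating scale are all hypothetical and would need to be derived from the real data.

```python
def weighted_rating(avg, votes, min_votes, global_mean):
    """Bayesian estimate: shrink avg toward global_mean when votes are few."""
    total = votes + min_votes
    return (votes / total) * avg + (min_votes / total) * global_mean


# (movie_id, rating count, average rating). The first two rows come from the
# table above; the last two are hypothetical, added only for contrast.
rows = [
    (2284, 1, 5.0),
    (4459, 1, 5.0),
    (900001, 350, 4.4),  # hypothetical
    (900002, 40, 3.1),   # hypothetical
]

MIN_VOTES = 50     # assumed threshold, tune for your data
GLOBAL_MEAN = 3.5  # assumed mean rating on an assumed 1-5 scale

ranked = sorted(
    rows,
    key=lambda row: weighted_rating(row[2], row[1], MIN_VOTES, GLOBAL_MEAN),
    reverse=True,
)

for movie_id, count, avg in ranked:
    score = weighted_rating(avg, count, MIN_VOTES, GLOBAL_MEAN)
    print(movie_id, count, avg, round(score, 3))
```

With these assumed values, the movie with 350 ratings moves ahead of the single-vote movies, whose scores fall back close to the global mean.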

There are many ways to determine the popularity of a movie, and there is no standard way of computing such a score. For example, we can take the following factors into consideration:

Number of votes for the day.
Number of views for the day.
Number of users who marked it as a “favorite” for the day.
Number of users who added it to their “watchlist” for the day.
Number of comments.
Number of ratings (negative vs. positive).
Number of total votes.
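One simple way to combine factors like these is a weighted sum. The sketch below is only an illustration of that idea: the `DailyStats` fields, the `WEIGHTS` values, and the `popularity_score` function are all hypothetical, and the weights in particular are arbitrary numbers that would need tuning against real data.

```python
from dataclasses import dataclass


@dataclass
class DailyStats:
    """Per-movie counters matching the factors listed above (hypothetical schema)."""
    votes_today: int
    views_today: int
    favorites_today: int
    watchlist_today: int
    comments: int
    positive_rates: int
    negative_rates: int
    total_votes: int


# Hypothetical weights: arbitrary values chosen only to show the mechanics.
WEIGHTS = {
    "votes_today": 2.0,
    "views_today": 0.1,
    "favorites_today": 3.0,
    "watchlist_today": 1.5,
    "comments": 0.5,
    "positive_rates": 1.0,
    "negative_rates": -1.0,  # negative ratings lower the score
    "total_votes": 0.01,
}


def popularity_score(stats: DailyStats) -> float:
    """Weighted sum of all counters in `stats`."""
    return sum(weight * getattr(stats, name) for name, weight in WEIGHTS.items())


busy = DailyStats(votes_today=120, views_today=5000, favorites_today=40,
                  watchlist_today=60, comments=15, positive_rates=90,
                  negative_rates=30, total_votes=20000)
quiet = DailyStats(votes_today=2, views_today=80, favorites_today=0,
                   watchlist_today=1, comments=0, positive_rates=2,
                   negative_rates=0, total_votes=150)

print(popularity_score(busy))
print(popularity_score(quiet))
```

A score computed this way could be stored in the popularity column and recomputed on a schedule, so the widget only has to sort by it.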
