Movie Data Advent Calendar provides top film recommendations

Today it’s finally time to open the first door of the advent calendar and start planning activities for the approaching holidays! How about watching a few classic Christmas movies? Or, in honor of Independence Day, perhaps a high-quality Finnish film? When you need new ideas for what to watch, check out our Movie Data Advent Calendar, which provides top-rated film recommendations for the days off. And no peeking! The advent calendar doors can only be opened on the actual day.



See the advent calendar on Tableau Public

How the Movie Data Calendar was created

I confess. I like data and I like movies. And in general I am unhappy with the algorithms that recommend content to me. Before a movie night, it's my habit to meticulously study IMDb reviews. I want my movies to be worth my time, with a minimum general rating of 7/10. Occasionally, I do make exceptions and watch a movie with a lower rating if it falls within a specific genre or is directed by a certain filmmaker.

As IMDb is a proprietary platform, access to data and search capabilities is limited in the free version. The database is extensive, and it includes most of the films released in Finland as well, at least the most popular ones. However, it is not possible to search for the best films made in Finland, nor for the most popular Christmas movies. The generic movie genre categorization simply does not include these classifications. Also, not so surprisingly, movies from a small language area usually don't make it to the overall top lists. There are some other rating services as well, but the same limitations apply: the data is not commonly open for investigation. In practice, recommendations from movie critics and journalists have been the best way for me to discover good films beyond Hollywood blockbusters. Until now.

IMDb Data is Now Open

Luckily, Christmas came early this year for movie data enthusiasts. Tableau, a company specializing in data visualization and business intelligence, has released a comprehensive set of movie data in collaboration with IMDb. The data set includes over 500,000 movies and over 5 million records, covering films made in the years 1902-2022. The set is aggregated, meaning individual reviews are not included, but some associated information such as tagline, actors, directors and language is available. The data set, more information and plenty of cool movie dashboards built by the Tableau community are available here:

How do you find the best movies? Can you trust the IMDb ratings? The risk of data manipulation is always present in this kind of open voting service. IMDb deliberately controls the top lists it creates and states that only votes from “regular” IMDb voters are considered when creating, for example, the Top 250 from the full voting database, and it deliberately does not reveal how “regular” is defined. Some manipulation probably still gets past the preventive mechanisms, but in general I tend to agree with most ratings of the films I have watched. So let’s see what the IMDb data reveals.

Data Exploration

I used Tableau to search the data set for the top 25 films in two categories: “Christmas movies” and “movies considered domestic in Finland”. Data exploration is one of Tableau's particular strengths: the tool provides easy and intuitive means for data exploration and ad hoc analysis. I first searched for the best-rated Christmas films. As there is no existing classification, I built an initial Christmas movie category from films whose titles or plots contain words like Christmas, Santa and Elf. As I also wanted the result set to be family-friendly fiction, I excluded the documentary and horror genres. I set the minimum number of votes to 5000.
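For readers who prefer code to drag and drop, the same filter can be sketched in a few lines of Python. This is only an illustration of the logic, not what was built in Tableau: the field names (title, plot, genres, rating, votes) and the sample films are made up and do not come from the actual data set.

```python
import re

# Made-up sample records; the field names are assumptions,
# not the real column names of the Tableau/IMDb data set.
movies = [
    {"title": "Santa's Long Night", "plot": "A tired courier covers one last shift.",
     "genres": {"Comedy", "Family"}, "rating": 7.4, "votes": 12000},
    {"title": "Winter Road", "plot": "A family drives home for Christmas.",
     "genres": {"Drama"}, "rating": 8.1, "votes": 8000},
    {"title": "The Christmas Cellar", "plot": "Something waits downstairs.",
     "genres": {"Horror"}, "rating": 7.9, "votes": 20000},
    {"title": "Elf Workshop Diaries", "plot": "Behind the scenes of a toy factory.",
     "genres": {"Documentary"}, "rating": 8.5, "votes": 9000},
    {"title": "Tiny Christmas", "plot": "A very small celebration.",
     "genres": {"Family"}, "rating": 9.0, "votes": 300},
    {"title": "The Top Shelf", "plot": "A librarian finds himself lost.",
     "genres": {"Drama"}, "rating": 8.8, "votes": 50000},
]

# Whole-word match, so "shelf" or "himself" does not count as "elf".
KEYWORDS = re.compile(r"\b(christmas|santa|elf)\b", re.IGNORECASE)
EXCLUDED_GENRES = {"Documentary", "Horror"}
MIN_VOTES = 5000


def is_christmas_candidate(movie):
    """Keyword in title or plot, no excluded genre, enough votes."""
    text = f"{movie['title']} {movie['plot']}"
    return (
        KEYWORDS.search(text) is not None
        and not (movie["genres"] & EXCLUDED_GENRES)
        and movie["votes"] >= MIN_VOTES
    )


def top_christmas_movies(movies, limit=25):
    """Return the best-rated candidates, highest rating first."""
    candidates = [m for m in movies if is_christmas_candidate(m)]
    return sorted(candidates, key=lambda m: m["rating"], reverse=True)[:limit]


for movie in top_christmas_movies(movies):
    print(movie["title"], movie["rating"])
```

A keyword filter like this is only a starting point. It will still let through films that merely mention Christmas in passing, so the result needs a manual check.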

Who is able to watch 25 Christmas films in a row? Not me. To provide more watching options, I looked for top-rated films that are considered domestic in Finland. This task is a bit harder, as the data set only has a language classification and no country of origin. I know that some famous films of Finnish origin are made in Sami, Swedish or English, for example, so using Finnish as the only search criterion is not sufficient. As not all the details are available in the data set, I decided to rely on the director and the spoken language of the film. I looked for the people who have directed at least one film in Finnish, and then I started finding the right kind of movies, including those made in other languages.
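The director trick is a two-step lookup, and it can be sketched like this in Python. Again, this is only an illustration with placeholder film and director names and assumed field names, not the actual Tableau implementation.

```python
# Placeholder records; field names are assumptions, not the real columns.
films = [
    {"title": "Film A", "director": "Director One", "language": "Finnish"},
    {"title": "Film B", "director": "Director One", "language": "English"},
    {"title": "Film C", "director": "Director Two", "language": "English"},
    {"title": "Film D", "director": "Director Three", "language": "Swedish"},
    {"title": "Film E", "director": "Director Three", "language": "Finnish"},
]


def directors_with_language(films, language="Finnish"):
    """Step 1: everyone who has directed at least one film in the language."""
    return {f["director"] for f in films if f["language"] == language}


def films_by_directors(films, directors):
    """Step 2: all films by those directors, whatever the language."""
    return [f for f in films if f["director"] in directors]


finnish_directors = directors_with_language(films)
candidates = films_by_directors(films, finnish_directors)

for film in candidates:
    print(film["title"], film["language"])
```

The second step is what brings in the English and Swedish language films. It is also where wrongly tagged directors sneak in, so the candidate list is a starting point rather than a final answer.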

Basically all live data sets have errors and inaccuracies (this is my lesson learned in eight years of working with data), and this movie data set is no exception. The more the data set is explored, the more errors show up: for example, Filipino director Chito S. Roño does not have any films in Finnish, even though the data claims so. As my interest here is to find fiction, I excluded the faulty data and the documentary genre, and set the minimum number of votes to 500, as the number of votes is generally not that high for Finnish films. To get greater variety in the recommendations, only one movie per director is included. In practice, this rule prevents an Aki Kaurismäki film overload. Also, if there are multiple film adaptations of one story, only the most popular one is included.
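The clean-up rules can be sketched in Python as well. This is a rough illustration under a few assumptions: the field names and sample films are placeholders, the faulty directors are collected by hand in a set, and the "one film per director" rule is read as "keep that director's highest-rated eligible film".

```python
MIN_VOTES = 500
EXCLUDED_GENRES = {"Documentary"}
# Directors flagged by hand as wrongly tagged in the data (placeholder name).
FAULTY_DIRECTORS = {"Mislabelled Director"}

# Placeholder records; field names are assumptions, not the real columns.
films = [
    {"title": "Film A", "director": "Director One",
     "genres": {"Drama"}, "rating": 7.8, "votes": 4000},
    {"title": "Film B", "director": "Director One",
     "genres": {"Comedy"}, "rating": 7.5, "votes": 9000},
    {"title": "Film C", "director": "Director Two",
     "genres": {"Documentary"}, "rating": 8.2, "votes": 1500},
    {"title": "Film D", "director": "Director Two",
     "genres": {"Drama"}, "rating": 7.1, "votes": 600},
    {"title": "Film E", "director": "Director Three",
     "genres": {"Drama"}, "rating": 8.0, "votes": 200},
    {"title": "Film F", "director": "Mislabelled Director",
     "genres": {"Drama"}, "rating": 8.4, "votes": 3000},
]


def is_eligible(film):
    """Drop flagged directors, documentaries and films with too few votes."""
    return (
        film["director"] not in FAULTY_DIRECTORS
        and not (film["genres"] & EXCLUDED_GENRES)
        and film["votes"] >= MIN_VOTES
    )


def best_per_director(films):
    """Keep only the highest-rated eligible film for each director."""
    best = {}
    for film in filter(is_eligible, films):
        current = best.get(film["director"])
        if current is None or film["rating"] > current["rating"]:
            best[film["director"]] = film
    return sorted(best.values(), key=lambda f: f["rating"], reverse=True)


picks = best_per_director(films)
for film in picks:
    print(film["title"], film["director"], film["rating"])
```

The same grouping idea would work for multiple adaptations of one story, with the story as the grouping key and the vote count as the score, provided the data had a reliable story identifier.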

Tableau Tricks and Tips

The Movie Data Advent Calendar collects 25 top-rated Christmas movies and 25 top-rated films that can be considered to be of Finnish origin. The calendar is implemented using Tableau map layers and visibility controls, applying the techniques introduced by Tableau Visionary Annabelle Rincon, who has delighted the Tableau community with her advent calendars for many years in a row. For more implementation details, please check her blog:

How to do an Advent Christmas Calendar in Tableau… Again!

The data search methods used here are not perfect, and the data set, even though it is wide, has some limitations. And yes, a generative AI model would probably be handy here, as it could classify the movies in more dimensions, assuming enough background information were available.

When opening the Movie Data Advent Calendar doors, I hope you find something familiar and something new. The hand-picked top-rated films are in random order. Please note that due to the implementation method, the calendar doors open best on laptop computers. Happy movie watching, and a final warning: if your taste in movies diverges from the mainstream, these calendar picks may not be for you, as crowds of people have spoken. Happy wait for the holidays!