It is no exaggeration to say that, during their lifetime, most people on earth will taste the cold, sweet beverage called beer. Some may dislike it, some may love it a little too much, but at the very least everybody has something to say about it.
Indeed, this drink is not only universal but also widely diverse in taste, style, ingredients, and more: a real spectrum of variety across the world’s cultures. Of course, with today’s globalization, no style is confined to its region of origin anymore; each has had the chance to spread across borders and oceans. This lets people around the globe enjoy any kind they want.
This variety opened an endless debate: what is the best beer? The answer will always be impossible to find, as objectivity has no place here: a person’s opinion is heavily biased by their taste, country of birth, social condition, and so on. Trying to make your voice heard may make tones and emotions rise and lead to severe tensions, which is the opposite of the desired atmosphere when filling the glasses.
However, another question can be answered: what is the preferred beer of all? Moreover, can people of very different origins find common ground around a pint filled to the brim?
Goal
What if we rearranged the world according to beer taste? What would the resulting map look like? Which countries would be linked by their love of the same beers? Can natural enemies become friends over a nice cold beverage?
This project has two phases. First, we look for a good method to compute the favorite beer style per country. Next, we cluster the countries to see which regions emerge.
1. What is the favorite beer per country?
First of all, let’s talk about the data used for our analysis.

All of it comes from the American website RateBeer, which is, in its own words: “widely recognized as the most in-depth, accurate, and one of the most-visited source for beer information. RateBeer is a world site for craft beer enthusiasts and is dedicated to serving the entire craft beer community through beer education, promotion and outreach. RateBeer is a consumer-driven Web site and we strive to remain unbiased in our ratings and editorial content.”
On this website, users can rate beers on 4 aspects (appearance, aroma, palate and taste) and can add a text review to comment on their opinion. Each user can enter their country (we assume they entered their real one), which will be very useful here, and can write as many ratings as they like:
Cut the weeds
As we can see, a large number of users leave only one or very few ratings. As our purpose is to compute the users’ preferred style, those who leave almost no ratings are of very little help. So, much like the website does when computing the average score of a beer (RateBeer Quality assurance), we need to remove users below a certain threshold of ratings.
Moreover, the same reasoning applies to countries: if we do not have enough users for a certain country, does it make sense to compute its favorite style? Would it really be representative of its population?
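To make the two filters concrete, here is a minimal sketch in plain Python. It is an illustration only, not the code used for this analysis: the record layout `(user, country, style, score)`, the function name `filter_ratings`, the toy ratings and both threshold values are assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical thresholds, chosen only for this toy example.
MIN_RATINGS_PER_USER = 3
MIN_USERS_PER_COUNTRY = 2


def filter_ratings(ratings, min_ratings=MIN_RATINGS_PER_USER,
                   min_users=MIN_USERS_PER_COUNTRY):
    """Keep ratings from active users living in well-represented countries.

    `ratings` is a list of (user, country, style, score) tuples
    (an assumed layout, not the actual RateBeer schema).
    """
    # Step 1: drop users who wrote fewer than `min_ratings` ratings.
    ratings_per_user = Counter(user for user, _, _, _ in ratings)
    active = [r for r in ratings if ratings_per_user[r[0]] >= min_ratings]

    # Step 2: count the remaining distinct users in each country.
    users_per_country = defaultdict(set)
    for user, country, _, _ in active:
        users_per_country[country].add(user)

    # Step 3: drop countries with fewer than `min_users` active users.
    kept = {c for c, users in users_per_country.items()
            if len(users) >= min_users}
    return [r for r in active if r[1] in kept]


# Made-up ratings, for illustration only.
toy_ratings = [
    ("u1", "Belgium", "Tripel", 4.2),
    ("u1", "Belgium", "Saison", 3.9),
    ("u1", "Belgium", "Tripel", 4.5),
    ("u2", "Belgium", "Lambic", 4.0),
    ("u2", "Belgium", "Tripel", 4.1),
    ("u2", "Belgium", "Dubbel", 3.7),
    ("u3", "Belgium", "Pilsener", 2.5),  # single rating: user dropped
    ("u4", "Iceland", "Stout", 3.8),
    ("u4", "Iceland", "IPA", 3.6),
    ("u4", "Iceland", "Stout", 4.0),     # only active user: country dropped
]

filtered = filter_ratings(toy_ratings)
print(len(filtered), sorted({country for _, country, _, _ in filtered}))
```

The order matters in this sketch: users are filtered first, so a country is judged on its number of active users rather than on everyone who ever left a rating.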
Unfortunately, we cannot look only at the precision and correctness of our result: what is the point of having expert users and countries with thousands of reviewers if we end up with only 2 countries? We had to make a compromise between quantity and data quality: