
World Happiness Report 2015-2023

Purpose and Context

This project, part of my CareerFoundry Data Analytics Course, aimed to demonstrate practical skills in data sourcing, analysis, and visualization, focusing on the World Happiness Report. Aligned with the course's objective of independent dataset selection, the project utilized Python for exploratory visual analysis, hypothesis development, and advanced analytics. The culmination was a concise Tableau storyboard presentation, showcasing a deep understanding of the data and its implications.

Overview

The World Happiness Report, published by the Sustainable Development Solutions Network, ranks countries annually based on life evaluations from Gallup World Poll data. Life evaluations are measured on the Cantril ladder (0 to 10), and the report explains differences between countries using six variables: GDP per capita, social support, healthy life expectancy, freedom, generosity, and corruption. The report is a global reference that sparks international conversations on well-being in policy-making.

Chart 1

Objective

To carry out exploratory visual analysis in Python, identify meaningful relationships between variables for further exploration, formulate hypotheses, apply advanced analytical techniques, and communicate the results in a concise Tableau storyboard presentation.

Key Questions Addressed:

Tools

Python and libraries (Pandas, NumPy, Matplotlib, SciPy, Seaborn, Folium, Plotly), Tableau, Excel.

Data

Dataset - Kaggle. Primary source - World Happiness Report website.

Steps

1. Cleaning and Merging Data

I began with nine datasets from the World Happiness Report, one for each year from 2015 to 2023. I merged them into a single dataset, standardizing the formatting, addressing missing values, and removing duplicates so that the analysis would rest on accurate, reliable data.
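
A minimal sketch of this merging step, assuming each yearly file is loaded into a pandas DataFrame. The column names in COLUMN_MAP, the toy values, and the choice to drop rows with a missing score are illustrative assumptions, not the exact rules used in the project.

```python
import pandas as pd

# Hypothetical mapping from year-specific column names to one shared schema.
COLUMN_MAP = {
    "Country": "country",
    "Country name": "country",
    "Happiness Score": "happiness_score",
    "Ladder score": "happiness_score",
    "Economy (GDP per Capita)": "gdp_per_capita",
    "Logged GDP per capita": "gdp_per_capita",
}

# Toy stand-ins for two of the nine yearly files (values are made up).
raw_by_year = {
    2015: pd.DataFrame({
        "Country": ["A", "B", "B"],  # "B" appears twice (duplicate row)
        "Happiness Score": [5.0, 6.2, 6.2],
        "Economy (GDP per Capita)": [0.9, 1.2, 1.2],
    }),
    2016: pd.DataFrame({
        "Country name": ["A", "B"],
        "Ladder score": [5.1, None],  # missing score for "B"
        "Logged GDP per capita": [1.0, 1.3],
    }),
}


def merge_years(frames_by_year):
    """Rename columns to a shared schema, tag each row with its year, and stack."""
    cleaned = []
    for year, frame in frames_by_year.items():
        tidy = frame.rename(columns=COLUMN_MAP)
        tidy = tidy[["country", "happiness_score", "gdp_per_capita"]].copy()
        tidy["year"] = year
        cleaned.append(tidy)
    merged = pd.concat(cleaned, ignore_index=True)
    # One row per country and year.
    merged = merged.drop_duplicates(subset=["country", "year"])
    # Assumption: rows without a happiness score are dropped rather than imputed.
    merged = merged.dropna(subset=["happiness_score"])
    return merged.reset_index(drop=True)


merged = merge_years(raw_by_year)
print(merged)
```

Adding the year column before stacking keeps each observation traceable to its source file after the merge.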

2. Exploring Relationships

Exploratory data analysis (EDA) in Python included correlation heatmaps, scatter plots, and pair plots. The strongest correlation was between happiness score and GDP per capita, which led to the hypothesis that higher GDP per capita is associated with higher happiness scores. To test this hypothesis, I applied advanced analytical methods, including regression analysis and k-means clustering.
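
The correlation check can be sketched as follows. This is an illustration on made-up numbers, assuming a merged DataFrame with hypothetical column names; it is not the project's actual data or output.

```python
import pandas as pd

# Toy data with hypothetical column names (values are made up).
toy = pd.DataFrame({
    "happiness_score": [4.0, 5.0, 6.0, 7.0, 7.5],
    "gdp_per_capita": [0.5, 0.9, 1.1, 1.4, 1.5],
    "generosity": [0.30, 0.10, 0.25, 0.15, 0.20],
})

# Pairwise Pearson correlations between all numeric columns.
corr = toy.corr()

# Variable with the largest absolute correlation to the happiness score.
strongest = corr["happiness_score"].drop("happiness_score").abs().idxmax()

print(corr.round(2))
print("Strongest correlate of happiness_score:", strongest)
```

The same matrix can be passed to Seaborn's heatmap function to draw the correlation map.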

Exploring Relationships Image

3. Employing Advanced Analytical Methods

Regression Analysis

Regression analysis helps us understand how one or more factors are related to an outcome, allowing us to predict or explain changes in the outcome based on the values of the factors. The results showed that the model moderately explains the variability in happiness scores, yet nearly half of the variance remains unexplained by GDP per capita.
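
As a rough illustration of how "variance explained" is measured, the sketch below fits a one-variable least-squares line with NumPy and computes R squared. The numbers are made up, and the project may have used a different library for the fit.

```python
import numpy as np


def fit_simple_regression(x, y):
    """Fit y = slope * x + intercept by least squares and return R squared."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)
    predicted = slope * x + intercept
    ss_res = np.sum((y - predicted) ** 2)  # variation left unexplained
    ss_tot = np.sum((y - y.mean()) ** 2)   # total variation in y
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Toy values (made up): GDP per capita index and happiness score.
gdp = [0.5, 0.9, 1.1, 1.4, 1.5]
happiness = [4.0, 5.5, 5.0, 7.0, 6.5]

slope, intercept, r_squared = fit_simple_regression(gdp, happiness)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r_squared:.2f}")
print(f"share of variance left unexplained: {1 - r_squared:.2f}")
```

R squared is the share of variance in the outcome that the model accounts for, so 1 minus R squared is the share left to other factors.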

Exploring Relationships Image

Cluster Analysis

Cluster analysis groups data points into "clusters", which can then be compared to uncover new patterns. Clustering produced three distinct groups. As the chart below shows, countries in Cluster 0 have both the highest median happiness scores and the highest GDP per capita.
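
To show what k-means does under the hood, here is a small NumPy sketch of the standard assign-and-update loop on made-up points. It is not the code used in the project, and the deterministic seeding is a simplification; library implementations typically use smarter initialization and several restarts.

```python
import numpy as np


def kmeans(points, k, n_iter=100):
    """Minimal k-means: assign points to the nearest centroid, then recompute centroids."""
    points = np.asarray(points, dtype=float)
    # Simplified deterministic seeding: evenly spaced points after sorting by the first feature.
    order = np.argsort(points[:, 0])
    seed_idx = order[np.linspace(0, len(points) - 1, k).astype(int)]
    centroids = points[seed_idx]
    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (n_points, k).
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # New centroid is the mean of its members; an empty cluster keeps its old centroid.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


# Toy (gdp_per_capita, happiness_score) pairs; values are made up.
points = np.array([
    [0.30, 3.5], [0.40, 3.8], [0.35, 3.6],
    [1.00, 5.5], [1.10, 5.7], [0.95, 5.4],
    [1.60, 7.3], [1.70, 7.5], [1.65, 7.4],
])

labels, centroids = kmeans(points, k=3)

# Median happiness score per cluster, used to compare the groups.
medians = {j: float(np.median(points[labels == j, 1])) for j in range(3)}
best_cluster = max(medians, key=medians.get)
print(labels, medians, best_cluster)
```

Cluster numbers are arbitrary labels, so which cluster ends up as "Cluster 0" depends on the initialization rather than on the data.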

Exploring Relationships Image

4. Spatial Exploration

Using a shapefile for spatial representation, I mapped GDP per capita and happiness scores to examine their relationship geographically. The maps were consistent with the hypothesis: regions with higher GDP per capita tended to have higher happiness scores.

Exploring Relationships Image

More details about the project:

Challenges

Final Words

In the World Happiness Report project, my journey through data analytics was both enlightening and full of surprises.

What surprised me was the nuanced relationship between happiness and GDP per capita. Initially, I expected a linear correlation, assuming that higher GDP per capita would consistently translate to higher happiness scores. However, the data challenged this assumption, revealing that the happiest countries don't consistently possess the highest GDP per capita. This unexpected finding underscores the complexity of factors influencing happiness.

Another unexpected revelation was the versatility of Python, not only as a tool for visualization but also as a robust platform for advanced analytics, including methods like regression analysis and k-means clustering. The experience enriched my understanding of data analytics, teaching me to navigate challenges, validate hypotheses, and communicate findings effectively through Tableau.

Summary

Moving forward, a multivariate exploration using techniques such as multiple regression would show how combinations of the report's factors interact to influence happiness, extending the analysis beyond GDP per capita alone.

A further step would be to explore residual factors, with a focus on cultural, environmental, and political variables, to uncover additional influences on happiness beyond the primary variables considered in the World Happiness Report.