aboutsummaryrefslogtreecommitdiff
path: root/blog/2020-09-25-happiness-map.org
diff options
context:
space:
mode:
authorChristian Cleberg <hello@cleberg.net>2023-12-02 11:23:08 -0600
committerChristian Cleberg <hello@cleberg.net>2023-12-02 11:23:08 -0600
commitcaccd81c3eb7954662d20cab10cc3afeeabca615 (patch)
tree567ed10350c1ee319c178952ab6aa48265977e58 /blog/2020-09-25-happiness-map.org
downloadcleberg.net-caccd81c3eb7954662d20cab10cc3afeeabca615.tar.gz
cleberg.net-caccd81c3eb7954662d20cab10cc3afeeabca615.tar.bz2
cleberg.net-caccd81c3eb7954662d20cab10cc3afeeabca615.zip
initial commit
Diffstat (limited to 'blog/2020-09-25-happiness-map.org')
-rw-r--r--blog/2020-09-25-happiness-map.org217
1 files changed, 217 insertions, 0 deletions
diff --git a/blog/2020-09-25-happiness-map.org b/blog/2020-09-25-happiness-map.org
new file mode 100644
index 0000000..d511d9d
--- /dev/null
+++ b/blog/2020-09-25-happiness-map.org
@@ -0,0 +1,217 @@
+#+date: 2020-09-25
+#+title: Data Visualization: World Choropleth Map of Happiness
+
+* Background Information
+
+The dataset (obtained from [[https://www.kaggle.com/unsdsn/world-happiness][Kaggle]]) used in this article contains a list of
+countries around the world, their happiness rankings and scores, as well as
+other national scoring measures.
+
+Fields include:
+
+- Overall rank
+- Country or region
+- GDP per capita
+- Social support
+- Healthy life expectancy
+- Freedom to make life choices
+- Generosity
+- Perceptions of corruption
+
+There are 156 records. Since there are ~195 countries in the world, we can see
+that around 40 countries will be missing from this dataset.
+
+* Install Packages
+
+As always, run the =install= command for all packages needed to perform
+analysis.
+
+#+BEGIN_SRC python
+!pip install folium geopandas matplotlib numpy pandas
+#+END_SRC
+
+* Import the Data
+
+We only need a couple packages to create a choropleth map. We will use [[https://python-visualization.github.io/folium/][Folium]],
+which provides map visualizations in Python. We will also use geopandas and
+pandas to wrangle our data before we put it on a map.
+
+#+BEGIN_SRC python
+# Import the necessary Python packages
+import folium
+import geopandas as gpd
+import pandas as pd
+#+END_SRC
+
+To get anything to show up on a map, we need a file that will specify the
+boundaries of each country. Luckily, GeoJSON files exist (for free!) on the
+internet. To get the boundaries of every country in the world, we will use the
+GeoJSON link shown below.
+
+GeoPandas will take this data and load it into a dataframe so that we can easily
+match it to the data we're trying to analyze. Let's look at the GeoJSON
+dataframe:
+
+#+BEGIN_SRC python
+# Load the GeoJSON data with geopandas
+geo_data = gpd.read_file('https://raw.githubusercontent.com/datasets/geo-countries/master/data/countries.geojson')
+geo_data.head()
+#+END_SRC
+
+#+CAPTION: GeoJSON Dataframe
+[[https://img.0x4b1d.org/blog/20200925-world-choropleth-map/geojson_df.png]]
+
+Next, let's load the data from the Kaggle dataset. I've downloaded this file, so
+update the file path if you have it somewhere else. After loading, let's take a
+look at this dataframe:
+
+#+BEGIN_SRC python
+# Load the world happiness data with pandas
+happy_data = pd.read_csv(r'~/Downloads/world_happiness_data_2019.csv')
+happy_data.head()
+#+END_SRC
+
+#+CAPTION: Happiness Dataframe
+[[https://img.0x4b1d.org/blog/20200925-world-choropleth-map/happiness_df.png]]
+
+* Clean the Data
+
+Some countries need to be renamed, or they will be lost when you merge the
+happiness and GeoJSON dataframes. This is something I discovered when the map
+below showed empty countries. I searched both data frames for the missing
+countries to see the naming differences. Any countries that do not have records
+in the =happy_data= df will not show up on the map.
+
+#+BEGIN_SRC python
+# Rename some countries to match our GeoJSON data
+
+# Rename USA
+usa_index = happy_data.index[happy_data['Country or region'] == 'United States']
+happy_data.at[usa_index, 'Country or region'] = 'United States of America'
+
+# Rename Tanzania
+tanzania_index = happy_data.index[happy_data['Country or region'] == 'Tanzania']
+happy_data.at[tanzania_index, 'Country or region'] = 'United Republic of Tanzania'
+
+# Rename the Congo
+republic_congo_index = happy_data.index[happy_data['Country or region'] == 'Congo (Brazzaville)']
+happy_data.at[republic_congo_index, 'Country or region'] = 'Republic of Congo'
+
+# Rename the DRC
+democratic_congo_index = happy_data.index[happy_data['Country or region'] == 'Congo (Kinshasa)']
+happy_data.at[democratic_congo_index, 'Country or region'] = 'Democratic Republic of the Congo'
+#+END_SRC
+
+* Merge the Data
+
+Now that we have clean data, we need to merge the GeoJSON data with the
+happiness data. Since we've stored them both in dataframes, we just need to call
+the =.merge()= function.
+
+We will also rename a couple columns, just so that they're a little easier to
+use when we create the map.
+
+#+BEGIN_SRC python
+# Merge the two previous dataframes into a single geopandas dataframe
+merged_df = geo_data.merge(happy_data,left_on='ADMIN', right_on='Country or region')
+
+# Rename columns for ease of use
+merged_df = merged_df.rename(columns = {'ADMIN':'GeoJSON_Country'})
+merged_df = merged_df.rename(columns = {'Country or region':'Country'})
+#+END_SRC
+
+#+CAPTION: Merged Dataframe
+[[https://img.0x4b1d.org/blog/20200925-world-choropleth-map/merged_df.png]]
+
+* Create the Map
+
+The data is finally ready to be added to a map. The code below shows the
+simplest way to find the center of the map and create a Folium map object. The
+important part is to remember to reference the merged dataframe for our GeoJSON
+data and value data. The columns specify which geo data and value data to use.
+
+#+BEGIN_SRC python
+# Assign centroids to map
+x_map = merged_df.centroid.x.mean()
+y_map = merged_df.centroid.y.mean()
+print(x_map,y_map)
+
+# Creating a map object
+world_map = folium.Map(location=[y_map, x_map], zoom_start=2,tiles=None)
+folium.TileLayer('CartoDB positron',name='Dark Map',control=False).add_to(world_map)
+
+# Creating choropleth map
+folium.Choropleth(
+ geo_data=merged_df,
+ name='Choropleth',
+ data=merged_df,
+ columns=['Country','Overall rank'],
+ key_on='feature.properties.Country',
+ fill_color='YlOrRd',
+ fill_opacity=0.6,
+ line_opacity=0.8,
+ legend_name='Overall happiness rank',
+ smooth_factor=0,
+ highlight=True
+).add_to(world_map)
+#+END_SRC
+
+Let's look at the resulting map.
+
+#+CAPTION: Choropleth Map
+[[https://img.0x4b1d.org/blog/20200925-world-choropleth-map/map.png]]
+
+* Create a Tooltip on Hover
+
+Now that we have a map set up, we could stop. However, I want to add a tooltip
+so that I can see more information about each country. The =tooltip_data= code
+below will show a popup on hover with all the data fields shown.
+
+#+BEGIN_SRC python
+ # Adding labels to map
+ style_function = lambda x: {'fillColor': '#ffffff',
+ 'color':'#000000',
+ 'fillOpacity': 0.1,
+ 'weight': 0.1}
+
+tooltip_data = folium.features.GeoJson(
+ merged_df,
+ style_function=style_function,
+ control=False,
+ tooltip=folium.features.GeoJsonTooltip(
+ fields=['Country'
+ ,'Overall rank'
+ ,'Score'
+ ,'GDP per capita'
+ ,'Social support'
+ ,'Healthy life expectancy'
+ ,'Freedom to make life choices'
+ ,'Generosity'
+ ,'Perceptions of corruption'
+ ],
+ aliases=['Country: '
+ ,'Happiness rank: '
+ ,'Happiness score: '
+ ,'GDP per capita: '
+ ,'Social support: '
+ ,'Healthy life expectancy: '
+ ,'Freedom to make life choices: '
+ ,'Generosity: '
+ ,'Perceptions of corruption: '
+ ],
+ style=('background-color: white; color: #333333; font-family: arial; font-size: 12px; padding: 10px;')
+ )
+)
+world_map.add_child(tooltip_data)
+world_map.keep_in_front(tooltip_data)
+folium.LayerControl().add_to(world_map)
+
+# Display the map
+world_map
+#+END_SRC
+
+The final image below will show you what the tooltip looks like whenever you
+hover over a country.
+
+#+CAPTION: Choropleth Map Tooltip
+[[https://img.0x4b1d.org/blog/20200925-world-choropleth-map/tooltip_map.png]]