aboutsummaryrefslogtreecommitdiff
path: root/blog/happiness-map/index.org
diff options
context:
space:
mode:
Diffstat (limited to 'blog/happiness-map/index.org')
-rw-r--r--blog/happiness-map/index.org217
1 files changed, 217 insertions, 0 deletions
diff --git a/blog/happiness-map/index.org b/blog/happiness-map/index.org
new file mode 100644
index 0000000..1eab63e
--- /dev/null
+++ b/blog/happiness-map/index.org
@@ -0,0 +1,217 @@
+#+title: Data Visualization: World Choropleth Map of Happiness
+#+date: 2020-09-25
+#+description: Exploring and visualizing data with Python.
+#+filetags: :data:
+
+* Background Information
+The dataset (obtained from
+[[https://www.kaggle.com/unsdsn/world-happiness][Kaggle]]) used in this
+article contains a list of countries around the world, their happiness
+rankings and scores, as well as other national scoring measures.
+
+Fields include:
+
+- Overall rank
+- Country or region
+- GDP per capita
+- Social support
+- Healthy life expectancy
+- Freedom to make life choices
+- Generosity
+- Perceptions of corruption
+
+There are 156 records. Since there are ~195 countries in the world, we
+can see that around 40 countries will be missing from this dataset.
+
+* Install Packages
+As always, run the =install= command for all packages needed to perform
+analysis.
+
+#+begin_src python
+!pip install folium geopandas matplotlib numpy pandas
+#+end_src
+
+* Import the Data
+We only need a couple packages to create a choropleth map. We will use
+[[https://python-visualization.github.io/folium/][Folium]], which
+provides map visualizations in Python. We will also use geopandas and
+pandas to wrangle our data before we put it on a map.
+
+#+begin_src python
+# Import the necessary Python packages
+import folium
+import geopandas as gpd
+import pandas as pd
+#+end_src
+
+To get anything to show up on a map, we need a file that will specify
+the boundaries of each country. Luckily, GeoJSON files exist (for free!)
+on the internet. To get the boundaries of every country in the world, we
+will use the GeoJSON link shown below.
+
+GeoPandas will take this data and load it into a dataframe so that we
+can easily match it to the data we're trying to analyze. Let's look at
+the GeoJSON dataframe:
+
+#+begin_src python
+# Load the GeoJSON data with geopandas
+geo_data = gpd.read_file('https://raw.githubusercontent.com/datasets/geo-countries/master/data/countries.geojson')
+geo_data.head()
+#+end_src
+
+#+caption: GeoJSON Dataframe
+[[https://img.cleberg.net/blog/20200925-world-choropleth-map/geojson_df.png]]
+
+Next, let's load the data from the Kaggle dataset. I've downloaded this
+file, so update the file path if you have it somewhere else. After
+loading, let's take a look at this dataframe:
+
+#+begin_src python
+# Load the world happiness data with pandas
+happy_data = pd.read_csv(r'~/Downloads/world_happiness_data_2019.csv')
+happy_data.head()
+#+end_src
+
+#+caption: Happiness Dataframe
+[[https://img.cleberg.net/blog/20200925-world-choropleth-map/happiness_df.png]]
+
+* Clean the Data
+Some countries need to be renamed, or they will be lost when you merge
+the happiness and GeoJSON dataframes. This is something I discovered
+when the map below showed empty countries. I searched both data frames
+for the missing countries to see the naming differences. Any countries
+that do not have records in the =happy_data= df will not show up on the
+map.
+
+#+begin_src python
+# Rename some countries to match our GeoJSON data
+
+# Rename USA
+usa_index = happy_data.index[happy_data['Country or region'] == 'United States']
+happy_data.at[usa_index, 'Country or region'] = 'United States of America'
+
+# Rename Tanzania
+tanzania_index = happy_data.index[happy_data['Country or region'] == 'Tanzania']
+happy_data.at[tanzania_index, 'Country or region'] = 'United Republic of Tanzania'
+
+# Rename the Congo
+republic_congo_index = happy_data.index[happy_data['Country or region'] == 'Congo (Brazzaville)']
+happy_data.at[republic_congo_index, 'Country or region'] = 'Republic of Congo'
+
+# Rename the DRC
+democratic_congo_index = happy_data.index[happy_data['Country or region'] == 'Congo (Kinshasa)']
+happy_data.at[democratic_congo_index, 'Country or region'] = 'Democratic Republic of the Congo'
+#+end_src
+
+* Merge the Data
+Now that we have clean data, we need to merge the GeoJSON data with the
+happiness data. Since we've stored them both in dataframes, we just need
+to call the =.merge()= function.
+
+We will also rename a couple columns, just so that they're a little
+easier to use when we create the map.
+
+#+begin_src python
+# Merge the two previous dataframes into a single geopandas dataframe
+merged_df = geo_data.merge(happy_data,left_on='ADMIN', right_on='Country or region')
+
+# Rename columns for ease of use
+merged_df = merged_df.rename(columns = {'ADMIN':'GeoJSON_Country'})
+merged_df = merged_df.rename(columns = {'Country or region':'Country'})
+#+end_src
+
+#+caption: Merged Dataframe
+[[https://img.cleberg.net/blog/20200925-world-choropleth-map/merged_df.png]]
+
+* Create the Map
+The data is finally ready to be added to a map. The code below shows the
+simplest way to find the center of the map and create a Folium map
+object. The important part is to remember to reference the merged
+dataframe for our GeoJSON data and value data. The columns specify which
+geo data and value data to use.
+
+#+begin_src python
+# Assign centroids to map
+x_map = merged_df.centroid.x.mean()
+y_map = merged_df.centroid.y.mean()
+print(x_map,y_map)
+
+# Creating a map object
+world_map = folium.Map(location=[y_map, x_map], zoom_start=2,tiles=None)
+folium.TileLayer('CartoDB positron',name='Dark Map',control=False).add_to(world_map)
+
+# Creating choropleth map
+folium.Choropleth(
+ geo_data=merged_df,
+ name='Choropleth',
+ data=merged_df,
+ columns=['Country','Overall rank'],
+ key_on='feature.properties.Country',
+ fill_color='YlOrRd',
+ fill_opacity=0.6,
+ line_opacity=0.8,
+ legend_name='Overall happiness rank',
+ smooth_factor=0,
+ highlight=True
+).add_to(world_map)
+#+end_src
+
+Let's look at the resulting map.
+
+#+caption: Choropleth Map
+[[https://img.cleberg.net/blog/20200925-world-choropleth-map/map.png]]
+
+* Create a Tooltip on Hover
+Now that we have a map set up, we could stop. However, I want to add a
+tooltip so that I can see more information about each country. The
+=tooltip_data= code below will show a popup on hover with all the data
+fields shown.
+
+#+begin_src python
+ # Adding labels to map
+ style_function = lambda x: {'fillColor': '#ffffff',
+ 'color':'#000000',
+ 'fillOpacity': 0.1,
+ 'weight': 0.1}
+
+tooltip_data = folium.features.GeoJson(
+ merged_df,
+ style_function=style_function,
+ control=False,
+ tooltip=folium.features.GeoJsonTooltip(
+ fields=['Country'
+ ,'Overall rank'
+ ,'Score'
+ ,'GDP per capita'
+ ,'Social support'
+ ,'Healthy life expectancy'
+ ,'Freedom to make life choices'
+ ,'Generosity'
+ ,'Perceptions of corruption'
+ ],
+ aliases=['Country: '
+ ,'Happiness rank: '
+ ,'Happiness score: '
+ ,'GDP per capita: '
+ ,'Social support: '
+ ,'Healthy life expectancy: '
+ ,'Freedom to make life choices: '
+ ,'Generosity: '
+ ,'Perceptions of corruption: '
+ ],
+ style=('background-color: white; color: #333333; font-family: arial; font-size: 12px; padding: 10px;')
+ )
+)
+world_map.add_child(tooltip_data)
+world_map.keep_in_front(tooltip_data)
+folium.LayerControl().add_to(world_map)
+
+# Display the map
+world_map
+#+end_src
+
+The final image below will show you what the tooltip looks like whenever
+you hover over a country.
+
+#+caption: Choropleth Map Tooltip
+[[https://img.cleberg.net/blog/20200925-world-choropleth-map/tooltip_map.png]]