diff options
Diffstat (limited to 'content/blog/2020-09-25-happiness-map.org')
-rw-r--r-- | content/blog/2020-09-25-happiness-map.org | 212 |
1 files changed, 0 insertions, 212 deletions
diff --git a/content/blog/2020-09-25-happiness-map.org b/content/blog/2020-09-25-happiness-map.org deleted file mode 100644 index 1f2b56f..0000000 --- a/content/blog/2020-09-25-happiness-map.org +++ /dev/null @@ -1,212 +0,0 @@ -#+title: Data Visualization: World Choropleth Map of Happiness -#+date: 2020-09-25 -#+description: Exploring and visualizing data with Python. -#+filetags: :data: - -* Background Information -The dataset (obtained from [[https://www.kaggle.com/unsdsn/world-happiness][Kaggle]]) used in this article contains a list of -countries around the world, their happiness rankings and scores, as well as -other national scoring measures. - -Fields include: - -- Overall rank -- Country or region -- GDP per capita -- Social support -- Healthy life expectancy -- Freedom to make life choices -- Generosity -- Perceptions of corruption - -There are 156 records. Since there are ~195 countries in the world, we can see -that around 40 countries will be missing from this dataset. - -* Install Packages -As always, run the =install= command for all packages needed to perform -analysis. - -#+begin_src python -!pip install folium geopandas matplotlib numpy pandas -#+end_src - -* Import the Data -We only need a couple packages to create a choropleth map. We will use [[https://python-visualization.github.io/folium/][Folium]], -which provides map visualizations in Python. We will also use geopandas and -pandas to wrangle our data before we put it on a map. - -#+begin_src python -# Import the necessary Python packages -import folium -import geopandas as gpd -import pandas as pd -#+end_src - -To get anything to show up on a map, we need a file that will specify the -boundaries of each country. Luckily, GeoJSON files exist (for free!) on the -internet. To get the boundaries of every country in the world, we will use the -GeoJSON link shown below. - -GeoPandas will take this data and load it into a dataframe so that we can easily -match it to the data we're trying to analyze. Let's look at the GeoJSON -dataframe: - -#+begin_src python -# Load the GeoJSON data with geopandas -geo_data = gpd.read_file('https://raw.githubusercontent.com/datasets/geo-countries/master/data/countries.geojson') -geo_data.head() -#+end_src - -#+caption: GeoJSON Dataframe -[[https://img.cleberg.net/blog/20200925-world-choropleth-map/geojson_df.png]] - -Next, let's load the data from the Kaggle dataset. I've downloaded this file, so -update the file path if you have it somewhere else. After loading, let's take a -look at this dataframe: - -#+begin_src python -# Load the world happiness data with pandas -happy_data = pd.read_csv(r'~/Downloads/world_happiness_data_2019.csv') -happy_data.head() -#+end_src - -#+caption: Happiness Dataframe -[[https://img.cleberg.net/blog/20200925-world-choropleth-map/happiness_df.png]] - -* Clean the Data -Some countries need to be renamed, or they will be lost when you merge the -happiness and GeoJSON dataframes. This is something I discovered when the map -below showed empty countries. I searched both data frames for the missing -countries to see the naming differences. Any countries that do not have records -in the =happy_data= df will not show up on the map. - -#+begin_src python -# Rename some countries to match our GeoJSON data - -# Rename USA -usa_index = happy_data.index[happy_data['Country or region'] == 'United States'] -happy_data.at[usa_index, 'Country or region'] = 'United States of America' - -# Rename Tanzania -tanzania_index = happy_data.index[happy_data['Country or region'] == 'Tanzania'] -happy_data.at[tanzania_index, 'Country or region'] = 'United Republic of Tanzania' - -# Rename the Congo -republic_congo_index = happy_data.index[happy_data['Country or region'] == 'Congo (Brazzaville)'] -happy_data.at[republic_congo_index, 'Country or region'] = 'Republic of Congo' - -# Rename the DRC -democratic_congo_index = happy_data.index[happy_data['Country or region'] == 'Congo (Kinshasa)'] -happy_data.at[democratic_congo_index, 'Country or region'] = 'Democratic Republic of the Congo' -#+end_src - -* Merge the Data -Now that we have clean data, we need to merge the GeoJSON data with the -happiness data. Since we've stored them both in dataframes, we just need to call -the =.merge()= function. - -We will also rename a couple columns, just so that they're a little easier to -use when we create the map. - -#+begin_src python -# Merge the two previous dataframes into a single geopandas dataframe -merged_df = geo_data.merge(happy_data,left_on='ADMIN', right_on='Country or region') - -# Rename columns for ease of use -merged_df = merged_df.rename(columns = {'ADMIN':'GeoJSON_Country'}) -merged_df = merged_df.rename(columns = {'Country or region':'Country'}) -#+end_src - -#+caption: Merged Dataframe -[[https://img.cleberg.net/blog/20200925-world-choropleth-map/merged_df.png]] - -* Create the Map -The data is finally ready to be added to a map. The code below shows the -simplest way to find the center of the map and create a Folium map object. The -important part is to remember to reference the merged dataframe for our GeoJSON -data and value data. The columns specify which geo data and value data to use. - -#+begin_src python -# Assign centroids to map -x_map = merged_df.centroid.x.mean() -y_map = merged_df.centroid.y.mean() -print(x_map,y_map) - -# Creating a map object -world_map = folium.Map(location=[y_map, x_map], zoom_start=2,tiles=None) -folium.TileLayer('CartoDB positron',name='Dark Map',control=False).add_to(world_map) - -# Creating choropleth map -folium.Choropleth( - geo_data=merged_df, - name='Choropleth', - data=merged_df, - columns=['Country','Overall rank'], - key_on='feature.properties.Country', - fill_color='YlOrRd', - fill_opacity=0.6, - line_opacity=0.8, - legend_name='Overall happiness rank', - smooth_factor=0, - highlight=True -).add_to(world_map) -#+end_src - -Let's look at the resulting map. - -#+caption: Choropleth Map -[[https://img.cleberg.net/blog/20200925-world-choropleth-map/map.png]] - -* Create a Tooltip on Hover -Now that we have a map set up, we could stop. However, I want to add a tooltip -so that I can see more information about each country. The =tooltip_data= code -below will show a popup on hover with all the data fields shown. - -#+begin_src python - # Adding labels to map - style_function = lambda x: {'fillColor': '#ffffff', - 'color':'#000000', - 'fillOpacity': 0.1, - 'weight': 0.1} - -tooltip_data = folium.features.GeoJson( - merged_df, - style_function=style_function, - control=False, - tooltip=folium.features.GeoJsonTooltip( - fields=['Country' - ,'Overall rank' - ,'Score' - ,'GDP per capita' - ,'Social support' - ,'Healthy life expectancy' - ,'Freedom to make life choices' - ,'Generosity' - ,'Perceptions of corruption' - ], - aliases=['Country: ' - ,'Happiness rank: ' - ,'Happiness score: ' - ,'GDP per capita: ' - ,'Social support: ' - ,'Healthy life expectancy: ' - ,'Freedom to make life choices: ' - ,'Generosity: ' - ,'Perceptions of corruption: ' - ], - style=('background-color: white; color: #333333; font-family: arial; font-size: 12px; padding: 10px;') - ) -) -world_map.add_child(tooltip_data) -world_map.keep_in_front(tooltip_data) -folium.LayerControl().add_to(world_map) - -# Display the map -world_map -#+end_src - -The final image below will show you what the tooltip looks like whenever you -hover over a country. - -#+caption: Choropleth Map Tooltip -[[https://img.cleberg.net/blog/20200925-world-choropleth-map/tooltip_map.png]] |