Diffstat (limited to 'content/blog/2020-07-20-video-game-sales.org')
-rw-r--r-- content/blog/2020-07-20-video-game-sales.org | 42
1 files changed, 20 insertions, 22 deletions
diff --git a/content/blog/2020-07-20-video-game-sales.org b/content/blog/2020-07-20-video-game-sales.org
index 672558d..2967c17 100644
--- a/content/blog/2020-07-20-video-game-sales.org
+++ b/content/blog/2020-07-20-video-game-sales.org
@@ -4,10 +4,8 @@
#+filetags: :data:
* Background Information
-This dataset (obtained from
-[[https://www.kaggle.com/gregorut/videogamesales/data][Kaggle]])
-contains a list of video games with sales greater than 100,000 copies.
-It was generated by a scrape of vgchartz.com.
+This dataset (obtained from [[https://www.kaggle.com/gregorut/videogamesales/data][Kaggle]]) contains a list of video games with sales
+greater than 100,000 copies. It was generated by a scrape of vgchartz.com.
Fields include:
@@ -23,8 +21,7 @@ Fields include:
- Other_{Sales}: Sales in the rest of the world (in millions)
- Global_{Sales}: Total worldwide sales.
-There are 16,598 records. 2 records were dropped due to incomplete
-information.
+There are 16,598 records. 2 records were dropped due to incomplete information.
* Import the Data
#+begin_src python
@@ -45,7 +42,8 @@ df
* Explore the Data
#+begin_src python
-# With the description function, we can see the basic stats. For example, we can also see that the 'Year' column has some incomplete values.
+# With the description function, we can see the basic stats. For example, we can
+# also see that the 'Year' column has some incomplete values.
df.describe()
#+end_src
@@ -158,18 +156,18 @@ df3.sort_values(by=['Global_Sales'], ascending=False).head(5)
[[https://img.cleberg.net/blog/20200720-data-exploration-video-game-sales/09_outliers-min.png]]
* Discussion
-The purpose of exploring datasets is to ask questions, answer questions,
-and discover intelligence that can be used to inform decision-making.
-So, what have we found in this dataset?
-
-Today we simply explored a publicly-available dataset to see what kind
-of information it contained. During that exploration, we found that
-video game sales peaked in 2006. That peak was largely due to Nintendo,
-who sold the top 5 games in 2006 and has a number of games in the top-10
-list for the years 1980-2020. Additionally, the top four platforms by
-global sales (Wii, NES, GB, DS) are owned by Nintendo.
-
-We didn't explore everything this dataset has to offer, but we can tell
-from a brief analysis that Nintendo seems to rule sales in the video
-gaming world. Further analysis could provide insight into which genres,
-regions, publishers, or world events are correlated with sales.
+The purpose of exploring datasets is to ask questions, answer questions, and
+discover intelligence that can be used to inform decision-making. So, what have
+we found in this dataset?
+
+Today we simply explored a publicly-available dataset to see what kind of
+information it contained. During that exploration, we found that video game
+sales peaked in 2006. That peak was largely due to Nintendo, who sold the top 5
+games in 2006 and has a number of games in the top-10 list for the years
+1980-2020. Additionally, the top four platforms by global sales (Wii, NES, GB,
+DS) are owned by Nintendo.
+
+We didn't explore everything this dataset has to offer, but we can tell from a
+brief analysis that Nintendo seems to rule sales in the video gaming world.
+Further analysis could provide insight into which genres, regions, publishers,
+or world events are correlated with sales.