This project analyzes data on video game critic scores, user ratings, and sales for the top 400 video games released since 1977. The goal of this project is to explore whether games are improving over time or if the golden age of video games has already passed. By examining trends in review scores from both critics and users, as well as analyzing global sales data, this project aims to uncover the most celebrated eras in gaming history and provide insights into the business dynamics behind successful games.
The database contains four tables. Each table has been limited to 400 rows for this project, but we can find the complete dataset with over 13,000 games on Kaggle.
game_sales
tableColumn | Definition | Data Type |
---|---|---|
name |
Name of the video game | varchar |
platform |
Gaming platform | varchar |
publisher |
Game publisher | varchar |
developer |
Game developer | varchar |
games_sold |
Number of copies sold (millions) | float |
year |
Release year | int |
reviews
tableColumn | Definition | Data Type |
---|---|---|
name |
Name of the video game | varchar |
critic_score |
Critic score according to Metacritic | float |
user_score |
User score according to Metacritic | float |
users_avg_year_rating
tableColumn | Definition | Data Type |
---|---|---|
year |
Release year of the games reviewed | int |
num_games |
Number of games released that year | int |
avg_user_score |
Average score of all the games ratings for the year | float |
critics_avg_year_rating
tableColumn | Definition | Data Type |
---|---|---|
year |
Release year of the games reviewed | int |
num_games |
Number of games released that year | int |
avg_critic_score |
Average score of all the games ratings for the year | float |
EDA involved exploring the database to answer key questions, such as:
Including some interesting code/features worked with
-- best_selling_games
SELECT *
FROM game_sales
ORDER BY games_sold DESC
LIMIT 10;
-- critics_top_ten_years
SELECT gs.year AS year,
COUNT(gs.name) AS num_games,
ROUND(AVG(r.critic_score),2) AS avg_critic_score
FROM game_sales AS gs
INNER JOIN reviews AS r
ON gs.name = r.name
GROUP BY gs.year
HAVING COUNT(gs.name) > 4
ORDER BY avg_critic_score DESC
LIMIT 10;
-- golden_years
SELECT ur.year,
ur.num_games,
cr.avg_critic_score,
ur.avg_user_score,
ur.avg_user_score - cr.avg_critic_score AS diff
FROM users_avg_year_rating AS ur
INNER JOIN critics_avg_year_rating AS cr
ON ur.year = cr.year
WHERE avg_critic_score > 9 OR avg_user_score > 9
ORDER BY year ASC;
The analysis results are summarized as follows:
Based on the analysis, we recommend the following actions: