import pandas as pd
import plotly.express as px

# Build a lookup from country code to country name
medalists = pd.read_csv('archive/medallists.csv')
country_mapping = medalists[['country_code', 'country']].drop_duplicates().set_index('country_code')

# Keep the ten countries with the most medals overall
medals_total = pd.read_csv('archive/medals_total.csv')
top_10 = medals_total.sort_values('Total', ascending=False).head(10)
top_10['country'] = top_10['country_code'].map(country_mapping['country'])

# Stacked bar chart of bronze, silver, and gold medals per country
fig = px.bar(top_10, x='country',
             y=['Bronze Medal', 'Silver Medal', 'Gold Medal'],
             title='Top 10 countries by total medals',
             barmode='stack',
             color_discrete_map={
                 'Gold Medal': '#FFD700',
                 'Silver Medal': '#C0C0C0',
                 'Bronze Medal': '#CD7F32'
             })
fig.layout.xaxis.title.text = 'Country'
fig.layout.yaxis.title.text = 'Total medals'
fig.layout.legend.title.text = 'Medal type'
fig.show()
On July 26, 2024, the Paris 2024 Olympics commenced, sending lights of hope into the hearts of every athlete. As the world watched in anticipation, dreams were forged and destinies were shaped on the grandest stage. This notebook, titled “What Makes Olympic Winners?”, delves into the results of the 2024 Olympic Games to uncover the patterns and secrets behind the triumphs. Using data analysis, we aim to understand the factors contributing to Olympic success, offering insights that could inspire future champions. Join us as we explore the data and stories behind the medals, celebrating the spirit of excellence that defines the Olympics.
Who Won?
Let’s start the journey by exploring the countries that achieved the most medals in the games.
The chart reveals a dominant performance by the United States, securing the top spot with a total of 122 medals. China closely followed, finishing second overall. While both nations tied for the gold medal count, the United States significantly outpaced China in silver and bronze medals, contributing to their overall victory.
To visualize the global medal distribution, let’s turn our attention to an interactive map. This geographical perspective shows how widely Olympic medals were spread across the world.
medals_total = pd.read_csv('./archive/medals_total.csv')
medals_total.head()

# World map shaded by each country's total medal count
fig = px.choropleth(medals_total, locations='country_code', color='Total', hover_name='country_code',
                    projection='natural earth', color_continuous_scale=px.colors.sequential.Blues,
                    title='Total Medals by Country')
fig.show()
Athlete Delegation Size: Does it Matter?
The number of athletes a country sends to the Olympics is a strategic decision influenced by various factors, including a nation’s sporting culture, financial resources, and performance history. A larger delegation can theoretically increase the chances of winning medals, but does this correlation hold true in practice? Let’s explore the relationship between athlete delegation size and Olympic success.
import numpy as np
import plotly.graph_objs as go

# Count the athletes each country sent
athletes = pd.read_csv('./archive/athletes.csv')
grouped_athletes = athletes.groupby('country_code').size().reset_index(name='count')
grouped_athletes = grouped_athletes.sort_values('count', ascending=False)

def get_country_from_code(code):
    return athletes[athletes['country_code'] == code]['country_long'].values[0]

grouped_athletes['country'] = grouped_athletes['country_code'].apply(lambda x: get_country_from_code(x))
merged = pd.merge(grouped_athletes, medals_total, on='country_code')

# Slope and intercept of the least-squares line
m, b = np.polyfit(merged['count'], merged['Total'], 1)

fig = px.scatter(merged, x='count', y='Total', hover_name='country_long',
                 title='Number of Athletes vs Number of Medals')
fig.update_xaxes(title_text='Number of athletes')
fig.update_yaxes(title_text='Number of medals')
fig.add_traces(go.Scatter(x=merged['count'], y=m * merged['count'] + b, mode='lines',
                          name='Line of best fit'))
fig.show()
The scatter plot vividly illustrates a strong positive correlation between the number of athletes a country sends to the Olympics and its total medal count. This suggests that larger delegations tend to perform better in terms of overall medal count. However, correlation does not imply causation.
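The trend line in the scatter plot comes from `np.polyfit(x, y, 1)`, which performs an ordinary least-squares fit of a straight line. As a minimal sketch of what that fit computes, the closed-form slope and intercept can be written out in plain Python. The delegation sizes and medal totals below are made-up illustrative values, not figures from the dataset, and `fit_line` is a hypothetical helper name.

```python
# Closed-form ordinary least squares for y = m*x + b.
# Toy numbers for illustration only; not taken from the Olympic dataset.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = (sum of co-deviations) / (sum of squared x-deviations)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # The fitted line always passes through the point of means
    b = mean_y - m * mean_x
    return m, b

athletes_sent = [10, 50, 120, 300, 600]  # hypothetical delegation sizes
medals_won = [0, 3, 10, 35, 110]         # hypothetical medal totals

m, b = fit_line(athletes_sent, medals_won)
print(f"slope={m:.4f}, intercept={b:.4f}")
```

The slope can be read as the expected change in medals per additional athlete under this linear model; it describes the trend and says nothing about cause.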
To quantify the strength of this relationship more precisely, we will calculate the Pearson correlation coefficient, a statistical measure that assesses the linear association between two variables.
correlation = merged['count'].corr(merged['Total'])
correlation
np.float64(0.861893038412591)
The calculated Pearson correlation coefficient of 0.862 between athlete delegation size and total medal count confirms a strong positive relationship. This high value indicates that countries sending more athletes tend to win more medals in total. However, it’s crucial to remember that correlation does not establish causation. Other factors, such as the quality of training, level of competition, and specific sporting strengths of a nation, undoubtedly contribute to Olympic success.
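For reference, the Pearson coefficient is the covariance of the two variables divided by the product of their standard deviations. A minimal sketch of that definition in plain Python follows; the input values are invented for illustration, and `pearson_r` is a hypothetical helper that is not part of the notebook.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: co-deviation sum over the root of the product of squared-deviation sums."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # The 1/n (or 1/(n-1)) factors cancel, so raw sums are enough
    return cov / math.sqrt(var_x * var_y)

# Hypothetical delegation sizes and medal totals, for illustration only
sizes = [10, 50, 120, 300, 600]
totals = [0, 3, 10, 35, 110]

r = pearson_r(sizes, totals)
print(f"r = {r:.3f}")
```

In the notebook, `merged['count'].corr(merged['Total'])` computes the same quantity, since pandas uses the Pearson method by default.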
Do Physical Attributes Predict National Sporting Strengths?
The Olympics showcase the pinnacle of human athleticism, a stage where competitors push their physical capabilities beyond seemingly insurmountable limits. But what factors contribute to a nation’s dominance in specific events? In this section, we delve into the relationship between physical attributes and the countries that consistently top the podium. By analyzing the data, we aim to uncover potential patterns and identify the strengths of different nations in various physical domains.
To analyze the correlation between physical attributes and Olympic success, we’ve categorized sports based on the primary physical demands they require. Eight distinct physical characteristics were identified:
- Power
- Endurance
- Speed
- Skill
- Water-based Abilities
- Board-based Abilities
- Combination of Skills
- Team Dynamics
A sport-specific mapping was created to classify each Olympic sport into one or more of these categories. For instance, weightlifting is primarily a power sport, while cycling is an endurance sport. Some sports, like the modern pentathlon, encompass multiple physical characteristics and are thus categorized as combination sports. This classification system provides a framework for investigating the relationship between physical attributes and the countries that excel in different sporting domains.
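One way to picture this classification is as two lookups: a discipline is first mapped to a sport family, and the family is then looked up in every group that lists it, so a sport can land in more than one category. The sketch below uses a deliberately tiny subset of the mapping for illustration; the helper `groups_for` and the final consistency check are assumptions, not part of the notebook.

```python
# Tiny illustrative subset of the two-level mapping
discipline_to_family = {
    "Cycling Road": "Cycling",
    "Surfing": "Surfing",
    "Handball": "Handball",
    "Judo": "Judo",
}
groups = {
    "Power Sports": ["Judo"],
    "Endurance Sports": ["Cycling"],
    "Speed Sports": ["Handball"],
    "Water Sports": ["Surfing"],
    "Board Sports": ["Surfing"],
    "Team Sports": ["Handball"],
}

def groups_for(discipline):
    """Return every group whose family list contains the discipline's family."""
    family = discipline_to_family.get(discipline)
    return [group for group, families in groups.items() if family in families]

# Consistency check: every family named in a group should be a family
# that at least one discipline actually maps to
known_families = set(discipline_to_family.values())
unmapped = {f for families in groups.values() for f in families if f not in known_families}

print(groups_for("Surfing"))   # a sport can belong to two groups
print(unmapped)                # empty when the two mappings agree
```

Running the same consistency check against the full mapping is a cheap way to catch a group entry that no discipline maps to.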
Power Sports
We begin our exploration of physical attributes with power sports, a category demanding explosive strength and muscular power. This group encompasses disciplines such as weightlifting, boxing, judo, karate, taekwondo, and wrestling. By analyzing the medal distribution within these sports, we aim to identify countries that consistently excel in events requiring raw physical power.
To visualize the countries dominating power sports, let’s examine the top ten medal earners in this category.
medals = pd.read_csv('archive/medals.csv')

# Map each Olympic discipline to a broader sport family
discipline_to_sport_family = {
    "3x3 Basketball": "Basketball",
    "Archery": "Archery",
    "Artistic Gymnastics": "Gymnastics",
    "Artistic Swimming": "Aquatics",
    "Athletics": "Athletics",
    "Badminton": "Badminton",
    "Baseball/Softball": "Baseball/Softball",
    "Basketball": "Basketball",
    "Beach Volleyball": "Volleyball",
    "Boxing": "Boxing",
    "Breaking": "DanceSport",
    "Canoe Slalom": "Canoeing",
    "Canoe Sprint": "Canoeing",
    "Cycling BMX Freestyle": "Cycling",
    "Cycling BMX Racing": "Cycling",
    "Cycling Mountain Bike": "Cycling",
    "Cycling Road": "Cycling",
    "Cycling Track": "Cycling",
    "Diving": "Aquatics",
    "Equestrian": "Equestrian",
    "Fencing": "Fencing",
    "Football": "Football",
    "Golf": "Golf",
    "Handball": "Handball",
    "Hockey": "Hockey",
    "Judo": "Judo",
    "Karate": "Karate",
    "Marathon Swimming": "Aquatics",
    "Modern Pentathlon": "Modern Pentathlon",
    "Rhythmic Gymnastics": "Gymnastics",
    "Rowing": "Rowing",
    "Rugby Sevens": "Rugby",
    "Sailing": "Sailing",
    "Shooting": "Shooting",
    "Skateboarding": "Skateboarding",
    "Sport Climbing": "Climbing",
    "Surfing": "Surfing",
    "Swimming": "Aquatics",
    "Table Tennis": "Table Tennis",
    "Taekwondo": "Taekwondo",
    "Tennis": "Tennis",
    "Trampoline Gymnastics": "Gymnastics",
    "Triathlon": "Triathlon",
    "Volleyball": "Volleyball",
    "Water Polo": "Aquatics",
    "Weightlifting": "Weightlifting",
    "Wrestling": "Wrestling"
}
medals['sport_family'] = medals['discipline'].map(discipline_to_sport_family)
# Group sport families by their primary physical demand
sport_groups = {
    "Power Sports": ["Weightlifting", "Boxing", "Judo", "Karate", "Taekwondo", "Wrestling"],
    "Endurance Sports": ["Cycling", "Rowing", "Triathlon"],
    "Speed Sports": ["Athletics", "Swimming", "Basketball", "Handball", "Hockey", "Football", "Rugby"],
    "Skill Sports": ["Gymnastics", "Fencing", "Golf", "Shooting", "Archery", "Table Tennis", "Badminton", "Tennis", "Baseball/Softball"],
    "Water Sports": ["Aquatics", "Canoeing", "Sailing", "Surfing"],
    "Board Sports": ["Skateboarding", "Surfing"],
    "Combination Sports": ["Modern Pentathlon"],
    "Team Sports": ["Basketball", "Volleyball", "Handball", "Hockey", "Football", "Rugby", "Baseball/Softball"]
}
power_sports = medals[medals['sport_family'].isin(sport_groups['Power Sports'])]
endurance_sports = medals[medals['sport_family'].isin(sport_groups['Endurance Sports'])]
speed_sports = medals[medals['sport_family'].isin(sport_groups['Speed Sports'])]
skill_sports = medals[medals['sport_family'].isin(sport_groups['Skill Sports'])]
water_sports = medals[medals['sport_family'].isin(sport_groups['Water Sports'])]
board_sports = medals[medals['sport_family'].isin(sport_groups['Board Sports'])]
combination_sports = medals[medals['sport_family'].isin(sport_groups['Combination Sports'])]
team_sports = medals[medals['sport_family'].isin(sport_groups['Team Sports'])]
def plot_top_10_medals(df, title):
    top_10_countries = df['country_code'].value_counts().head(10)
    # Use different color for each country
    fig = go.Figure(data=[go.Bar(
        x=top_10_countries.index,
        y=top_10_countries,
        marker_color=px.colors.qualitative.Set3[:10]
    )])
    fig.update_layout(title=title, xaxis_title='Country', yaxis_title='Number of Medals',
                      xaxis=dict(tickmode='array',
                                 tickvals=top_10_countries.index,
                                 ticktext=country_mapping.loc[top_10_countries.index, 'country']))
    fig.show()

plot_top_10_medals(power_sports, 'Top 10 countries in Power Sports')
The bar chart illustrates the medal distribution among the top ten countries in power sports. Japan and China dominate the category, securing the highest medal counts. France and Uzbekistan follow closely behind, while the United States, Iran, Korea, Cuba, Brazil, and Georgia complete the top ten. It’s evident that Asian countries hold a strong presence in power sports, with Japan and China leading the pack.
def plot_medals_on_map(df, title):
    grouped = df.groupby('country_code').size().reset_index(name='count')
    grouped['country'] = grouped['country_code'].apply(lambda x: get_country_from_code(x))
    fig = px.choropleth(grouped, locations='country_code', color='count', hover_name='country',
                        color_continuous_scale=px.colors.sequential.Blues,
                        title=title)
    fig.show()

plot_medals_on_map(power_sports, 'Power Sports Medals by Country')
The map reinforces our earlier observation of Asia’s dominance in power sports. China and Japan, in particular, emerge as clear leaders, with a cluster of strong performers across the continent.
Endurance Sports
Shifting our focus from explosive power, we now turn our attention to endurance sports. These disciplines demand sustained physical effort over extended periods. Cycling, rowing, and triathlon epitomize the mental and physical fortitude required for success in this category. Let’s explore which countries have demonstrated exceptional endurance capabilities on the Olympic stage.
plot_top_10_medals(endurance_sports, 'Top 10 countries in Endurance Sports')
plot_medals_on_map(endurance_sports, 'Endurance Sports Medals by Country')
The bar chart reveals a strong European presence in endurance sports. Great Britain, the Netherlands, and France occupy the top three positions, demonstrating the continent’s dominance in these events. While other regions have shown commendable performances, European countries have excelled in endurance-based competitions in 2024.
Speed Sports
Next, we delve into the realm of speed sports, where athletes push the boundaries of human velocity. This category encompasses a wide range of disciplines, including athletics, swimming, basketball, handball, hockey, football, and rugby. Let’s examine which countries have excelled in harnessing raw speed and agility to achieve Olympic glory.
plot_top_10_medals(speed_sports, 'Top 10 countries in Speed Sports')
plot_medals_on_map(speed_sports, 'Speed Sports Medals by Country')
The bar chart unequivocally demonstrates the United States’ unparalleled dominance in speed sports. The US medal count in this category surpasses the combined total of Kenya, Great Britain, the Netherlands, and Australia, underscoring their exceptional athleticism and prowess in events demanding speed and agility. While these other nations have shown commendable performances, the US’s supremacy is undeniable.
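A claim of this kind, that the leader's tally exceeds the combined tally of the next four countries, is straightforward to check from per-medal rows. The following is a minimal sketch using `collections.Counter` on invented country codes; the counts are hypothetical and do not reproduce the dataset.

```python
from collections import Counter

# One entry per medal won; made-up country codes and counts
medal_rows = ["AAA"] * 12 + ["BBB"] * 4 + ["CCC"] * 3 + ["DDD"] * 2 + ["EEE"] * 2 + ["FFF"]

counts = Counter(medal_rows)
ranked = counts.most_common()          # sorted from most to fewest medals

leader, leader_total = ranked[0]
next_four_total = sum(total for _, total in ranked[1:5])
leader_exceeds = leader_total > next_four_total

print(f"{leader}: {leader_total} vs next four combined: {next_four_total}")
print("Leader exceeds next four combined:", leader_exceeds)
```

In the notebook, `speed_sports['country_code'].value_counts()` yields the same descending ranking, so the first five values can be compared in the same way.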
Skill Sports
We now turn our attention to skill sports, where precision, technique, and mental focus are paramount. This category encompasses a diverse range of disciplines, including gymnastics, fencing, golf, shooting, archery, table tennis, badminton, tennis, and baseball/softball. Let’s explore which countries have excelled in these skill-intensive events.
plot_top_10_medals(skill_sports, 'Top 10 countries in Skill Sports')
plot_medals_on_map(skill_sports, 'Skill Sports Medals by Country')
Skill sports showcase a different dynamic, with China emerging as the undisputed leader. The United States follows, demonstrating a strong presence in these disciplines. Korea also excelled in skill-based events. While Japan, Italy, France, and Great Britain secured respectable medal counts, the dominance of the top three nations is evident.
The strong showing of China, the United States, Korea, and Japan in skill sports is likely influenced by advancements in technology and data analytics. These nations have invested heavily in sports science and technology, utilizing data-driven approaches to optimize athlete performance. Advanced training methods, biomechanical analysis, and performance tracking have become integral components of their training regimens. Moreover, these countries have access to substantial computational resources, allowing for complex data modeling and analysis to identify strengths, weaknesses, and areas for improvement. This technological edge may have contributed to their success in skill-based sports, although the medal data analyzed here cannot confirm it.
Water Sports
Turning our attention to the aquatic arena, we explore the realm of water sports. These disciplines demand a unique combination of strength, endurance, and technical skill. Let’s analyze which countries have mastered the watery challenges and emerged as dominant forces in this category.
plot_top_10_medals(water_sports, 'Top 10 countries in Water Sports')
plot_medals_on_map(water_sports, 'Water Sports Medals by Country')
The United States reigns supreme in water sports, securing the top spot with a substantial medal count. Australia and China follow closely in second and third positions respectively, demonstrating exceptional aquatic prowess. Great Britain and France also excel in these disciplines, forming a competitive tier. While Hungary, Canada, Italy, Germany, and the Netherlands contribute to the overall competition, the dominance of the US, Australia, and China is evident.
Board Sports
Let’s shift our focus to the board sports. Skateboarding and surfing, with their dynamic and exhilarating nature, have captured the world’s attention. We’ll examine which countries have mastered these thrilling disciplines and claimed their place on the podium.
plot_top_10_medals(board_sports, 'Top countries in Board Sports')
plot_medals_on_map(board_sports, 'Board Sports Medals by Country')
The bar chart illustrates the medal distribution for the top five countries in board sports. Japan, Brazil, and the US each managed to secure 4 medals, more than any other country.
Combination Sports
The modern pentathlon represents a unique category in the Olympics, demanding a combination of diverse physical and mental abilities. Athletes in this sport must excel in swimming, fencing, show jumping, pistol shooting, and cross-country running. Let’s explore which countries have produced the most well-rounded athletes in this multifaceted competition.
plot_top_10_medals(combination_sports, 'Top countries in Combination Sports')
plot_medals_on_map(combination_sports, 'Combination Sports Medals by Country')
The men’s modern pentathlon saw a tight competition between three nations. Egypt emerged victorious with Ahmed Elgendy claiming the gold medal. Japan secured the silver, while Italy took home the bronze.
Team Dynamics
The final element we explore is the impact of team dynamics on Olympic success. Sports like basketball, volleyball, handball, hockey, football, rugby, and baseball/softball require intricate coordination, communication, and collective effort. Let’s analyze which countries have excelled in fostering cohesive and high-performing teams.
plot_top_10_medals(team_sports, 'Top countries in Team Sports')
plot_medals_on_map(team_sports, 'Team Sports Medals by Country')
The bar chart highlights the top ten countries in team sports, with France, the host nation, securing the highest medal count. The United States follows closely behind, demonstrating their consistent dominance in various sporting arenas. Germany and Brazil also achieved notable success.
Conclusion
This analysis has delved into the intricate factors contributing to Olympic triumph. By examining medal distributions across various sports and exploring the relationship between physical attributes and athlete delegation size, we have gained valuable insights into the elements that propel nations to the top of the podium.
Our findings reveal the complexity of Olympic success. While raw athleticism, as exemplified by dominance in power and speed sports, is undoubtedly crucial, the significance of endurance, skill, and team cohesion cannot be overstated. The United States demonstrated exceptional prowess across multiple disciplines, highlighting the importance of a well-rounded athletic program. Conversely, nations like China and Japan excelled in specific areas, emphasizing the potential benefits of specialization.
This analysis provides only an overview; further research is needed to delve deeper into specific sports, athlete demographics, and the long-term implications of training methodologies. By understanding the multifaceted nature of Olympic success, we can gain valuable knowledge that can be applied to enhance athletic development and performance on both national and individual levels.
Ultimately, through this exploration, we aspire to have captured the essence of the Olympic spirit itself - dedication, perseverance, teamwork, and the relentless pursuit of excellence. By analyzing the data, we hope to inspire future generations of athletes and contribute to the ongoing evolution of sports performance.
Data Source
This analysis utilizes a comprehensive dataset on the 2024 Olympic Games retrieved from Kaggle. The dataset encompasses detailed information on medal winners, including athlete names, nationalities, sporting events, and medal types.