What Statistics Can Tell Us About NBA Coaches

شارك رابطًا

2025-05-22 18:06:37 ·

Who gets hired as an NBA coach? How long does a typical coach last? And does their coaching background play any part in predicting success?

This analysis was inspired by several key theories. First, there has been a common criticism among casual NBA fans that teams overly prefer hiring candidates with previous NBA head coaches experience.

Consequently, this analysis aims to answer two related questions. First, is it true that NBA teams frequently re-hire candidates with previous head coaching experience? And second, is there any evidence that these candidates under-perform relative to other candidates?

The second theory is that internal candidatesare often more successful than external candidates. This theory was derived from a pair of anecdotes. Two of the most successful coaches in NBA history, Gregg Popovich of San Antonio and Erik Spoelstra of Miami, were both internal hires. However, rigorous quantitative evidence is needed to test if this relationship holds over a larger sample.

This analysis aims to explore these questions, and provide the code to reproduce the analysis in Python.

The Data

The codeand dataset for this project are available on Github here. The analysis was performed using Python in Google Colaboratory.

A prerequisite to this analysis was determining a way to measure coaching success quantitatively. I decided on a simple idea: the success of a coach would be best measured by the length of their tenure in that job. Tenure best represents the differing expectations that might be placed on a coach. A coach hired to a contending team would be expected to win games and generate deep playoff runs. A coach hired to a rebuilding team might be judged on the development of younger players and their ability to build a strong culture. If a coach meets expectations, the team will keep them around.

Since there was no existing dataset with all of the required data, I collected the data myself from Wikipedia. I recorded every off-season coaching change from 1990 through 2021. Since the primary outcome variable is tenure, in-season coaching changes were excluded since these coaches often carried an “interim” tag—meaning they were intended to be temporary until a permanent replacement could be found.

In addition, the following variables were collected:

VariableDefinitionTeamThe NBA team the coach was hired forYearThe year the coach was hiredCoachThe name of the coachInternal?An indicator if the coach was internal or not—meaning they worked for the organization in some capacity immediately prior to being hired as head coachTypeThe background of the coach. Categories are Previous HC, Previous AC, College, Player, Management, and Foreign.YearsThe number of years a coach was employed in the role. For coaches fired mid-season, the value was counted as 0.5.

First, the dataset is imported from its location in Google Drive. I also convert ‘Internal?’ into a dummy variable, replacing “Yes” with 1 and “No” with 0.

from google.colab import drive
drive.mountimport pandas as pd
pd.set_option#Bring in the dataset
coach = pd.read_csv.iloccoach= coach.map)
coach

This prints a preview of what the dataset looks like:

In total, the dataset contains 221 coaching hires over this time.

Descriptive Statistics

First, basic summary Statistics are calculated and visualized to determine the backgrounds of NBA head coaches.

#Create chart of coaching background
import matplotlib.pyplot as plt

#Count number of coaches per category
counts = coach.value_counts#Create chart
plt.barplt.titleplt.figtextplt.xticksplt.ylabelplt.gca.spines.set_visibleplt.gca.spines.set_visiblefor i, value in enumerate:
plt.text)*100,1)) + '%' + '+ ')', ha='center', fontsize=9)
plt.savefigprint.sum/len)*100,1)) + " percent of coaches are internal.")

Over half of coaching hires previously served as an NBA head coach, and nearly 90% had NBA coaching experience of some kind. This answers the first question posed—NBA teams show a strong preference for experienced head coaches. If you get hired once as an NBA coach, your odds of being hired again are much higher. Additionally, 13.6% of hires are internal, confirming that teams do not frequently hire from their own ranks.

Second, I will explore the typical tenure of an NBA head coach. This can be visualized using a histogram.

#Create histogram
plt.histplt.titleplt.figtextplt.annotate', xy=, xytext=,
arrowprops=dict, fontsize=9, color='black')
plt.gca.spines.set_visibleplt.gca.spines.set_visibleplt.savefigplt.showcoach.sort_values#Calculate some stats with the data
import numpy as np

print) + " years is the median coaching tenure length.")
print.sum/len)*100,1)) + " percent of coaches last five years or less.")
print.sum/len*100,1)) + " percent of coaches last a year or less.")

Using tenure as an indicator of success, the the data clearly shows that the large majority of coaches are unsuccessful. The median tenure is just 2.5 seasons. 18.1% of coaches last a single season or less, and barely 10% of coaches last more than 5 seasons.

This can also be viewed as a survival analysis plot to see the drop-off at various points in time:

#Survival analysis
import matplotlib.ticker as mtick

lst = np.arangesurv = pd.DataFramesurv= np.nan

for i in range):
surv.iloc=.sum/lenplt.stepplt.titleplt.xlabel')
plt.figtextplt.gca.yaxis.set_major_formatter)
plt.gca.spines.set_visibleplt.gca.spines.set_visibleplt.savefigplt.show

Lastly, a box plot can be generated to see if there are any obvious differences in tenure based on coaching type. Boxplots also display outliers for each group.

#Create a boxplot
import seaborn as sns

sns.boxplotplt.titleplt.gca.spines.set_visibleplt.gca.spines.set_visibleplt.xlabelplt.xticksplt.figtextplt.savefigplt.show

There are some differences between the groups. Aside from management hires, previous head coaches have the longest average tenure at 3.3 years. However, since many of the groups have small sample sizes, we need to use more advanced techniques to test if the differences are statistically significant.

Statistical Analysis

First, to test if either Type or Internal has a statistically significant difference among the group means, we can use ANOVA:

#ANOVA
import statsmodels.api as sm
from statsmodels.formula.api import ols

am = ols+ C', data=coach).fitanova_table = sm.stats.anova_lmprintThe results show high p-values and low F-stats—indicating no evidence of statistically significant difference in means. Thus, the initial conclusion is that there is no evidence NBA teams are under-valuing internal candidates or over-valuing previous head coaching experience as initially hypothesized.

However, there is a possible distortion when comparing group averages. NBA coaches are signed to contracts that typically run between three and five years. Teams typically have to pay out the remainder of the contract even if coaches are dismissed early for poor performance. A coach that lasts two years may be no worse than one that lasts three or four years—the difference could simply be attributable to the length and terms of the initial contract, which is in turn impacted by the desirability of the coach in the job market. Since coaches with prior experience are highly coveted, they may use that leverage to negotiate longer contracts and/or higher salaries, both of which could deter teams from terminating their employment too early.

To account for this possibility, the outcome can be treated as binary rather than continuous. If a coach lasted more than 5 seasons, it is highly likely they completed at least their initial contract term and the team chose to extend or re-sign them. These coaches will be treated as successes, with those having a tenure of five years or less categorized as unsuccessful. To run this analysis, all coaching hires from 2020 and 2021 must be excluded, since they have not yet been able to eclipse 5 seasons.

With a binary dependent variable, a logistic regression can be used to test if any of the variables predict coaching success. Internal and Type are both converted to dummy variables. Since previous head coaches represent the most common coaching hires, I set this as the “reference” category against which the others will be measured against. Additionally, the dataset contains just one foreign-hired coachso this observation is dropped from the analysis.

#Logistic regression
coach3 = coach<2020]

coach3.loc= np.wherecoach_type_dummies = pd.get_dummies.astypecoach_type_dummies.dropcoach3 = pd.concat#Drop foreign category / David Blatt since n = 1
coach3 = coach3.dropcoach3 = coach3.loc!= "David Blatt"]

print)

x = coach3]
x = sm.add_constanty = coach3logm = sm.Logitlogm.r = logm.fitprint)

#Convert coefficients to odds ratio
print) + "is the odds ratio for internal.") #Internal coefficient
print) #Management
print) #Player
print) #Previous AC
print) #College

Consistent with ANOVA results, none of the variables are statistically significant under any conventional threshold. However, closer examination of the coefficients tells an interesting story.

The beta coefficients represent the change in the log-odds of the outcome. Since this is unintuitive to interpret, the coefficients can be converted to an Odds Ratio as follows:

Internal has an odds ratio of 0.23—indicating that internal candidates are 77% less likely to be successful compared to external candidates. Management has an odds ratio of 2.725, indicating these candidates are 172.5% more likely to be successful. The odds ratios for players is effectively zero, 0.696 for previous assistant coaches, and 0.5 for college coaches. Since three out of four coaching type dummy variables have an odds ratio under one, this indicates that only management hires were more likely to be successful than previous head coaches.

From a practical standpoint, these are large effect sizes. So why are the variables statistically insignificant?

The cause is a limited sample size of successful coaches. Out of 202 coaches remaining in the sample, just 23were successful. Regardless of the coach’s background, odds are low they last more than a few seasons. If we look at the one category able to outperform previous head coachesspecifically:

# Filter to management

manage = coach3== 1]
print)
printThe filtered dataset contains just 6 hires—of which just oneis classified as a success. In other words, the entire effect was driven by a single successful observation. Thus, it would take a considerably larger sample size to be confident if differences exist.

With a p-value of 0.202, the Internal variable comes the closest to statistical significance. Notably, however, the direction of the effect is actually the opposite of what was hypothesized—internal hires are less likely to be successful than external hires. Out of 26 internal hires, just onemet the criteria for success.

Conclusion

In conclusion, this analysis was able to draw several key conclusions:

Regardless of background, being an NBA coach is typically a short-lived job. It’s rare for a coach to last more than a few seasons.

The common wisdom that NBA teams strongly prefer to hire previous head coaches holds true. More than half of hires already had NBA head coaching experience.

If teams don’t hire an experienced head coach, they’re likely to hire an NBA assistant coach. Hires outside of these two categories are especially uncommon.

Though they are frequently hired, there is no evidence to suggest NBA teams overly prioritize previous head coaches. To the contrary, previous head coaches stay in the job longer on average and are more likely to outlast their initial contract term—though neither of these differences are statistically significant.

Despite high-profile anecdotes, there is no evidence to suggest that internal hires are more successful than external hires either.

Note: All images were created by the author unless otherwise credited.
The post What Statistics Can Tell Us About NBA Coaches appeared first on Towards Data Science.
#what #statistics #can #tell #about

What Statistics Can Tell Us About NBA Coaches

Who gets hired as an NBA coach? How long does a typical coach last? And does their coaching background play any part in predicting success? This analysis was inspired by several key theories. First, there has been a common criticism among casual NBA fans that teams overly prefer hiring candidates with previous NBA head coaches experience. Consequently, this analysis aims to answer two related questions. First, is it true that NBA teams frequently re-hire candidates with previous head coaching experience? And second, is there any evidence that these candidates under-perform relative to other candidates? The second theory is that internal candidatesare often more successful than external candidates. This theory was derived from a pair of anecdotes. Two of the most successful coaches in NBA history, Gregg Popovich of San Antonio and Erik Spoelstra of Miami, were both internal hires. However, rigorous quantitative evidence is needed to test if this relationship holds over a larger sample. This analysis aims to explore these questions, and provide the code to reproduce the analysis in Python. The Data The codeand dataset for this project are available on Github here. The analysis was performed using Python in Google Colaboratory. A prerequisite to this analysis was determining a way to measure coaching success quantitatively. I decided on a simple idea: the success of a coach would be best measured by the length of their tenure in that job. Tenure best represents the differing expectations that might be placed on a coach. A coach hired to a contending team would be expected to win games and generate deep playoff runs. A coach hired to a rebuilding team might be judged on the development of younger players and their ability to build a strong culture. If a coach meets expectations, the team will keep them around. Since there was no existing dataset with all of the required data, I collected the data myself from Wikipedia. I recorded every off-season coaching change from 1990 through 2021. Since the primary outcome variable is tenure, in-season coaching changes were excluded since these coaches often carried an “interim” tag—meaning they were intended to be temporary until a permanent replacement could be found. In addition, the following variables were collected: VariableDefinitionTeamThe NBA team the coach was hired forYearThe year the coach was hiredCoachThe name of the coachInternal?An indicator if the coach was internal or not—meaning they worked for the organization in some capacity immediately prior to being hired as head coachTypeThe background of the coach. Categories are Previous HC, Previous AC, College, Player, Management, and Foreign.YearsThe number of years a coach was employed in the role. For coaches fired mid-season, the value was counted as 0.5. First, the dataset is imported from its location in Google Drive. I also convert ‘Internal?’ into a dummy variable, replacing “Yes” with 1 and “No” with 0. from google.colab import drive drive.mountimport pandas as pd pd.set_option#Bring in the dataset coach = pd.read_csv.iloccoach= coach.map) coach This prints a preview of what the dataset looks like: In total, the dataset contains 221 coaching hires over this time. Descriptive Statistics First, basic summary Statistics are calculated and visualized to determine the backgrounds of NBA head coaches. #Create chart of coaching background import matplotlib.pyplot as plt #Count number of coaches per category counts = coach.value_counts#Create chart plt.barplt.titleplt.figtextplt.xticksplt.ylabelplt.gca.spines.set_visibleplt.gca.spines.set_visiblefor i, value in enumerate: plt.text)*100,1)) + '%' + '+ ')', ha='center', fontsize=9) plt.savefigprint.sum/len)*100,1)) + " percent of coaches are internal.") Over half of coaching hires previously served as an NBA head coach, and nearly 90% had NBA coaching experience of some kind. This answers the first question posed—NBA teams show a strong preference for experienced head coaches. If you get hired once as an NBA coach, your odds of being hired again are much higher. Additionally, 13.6% of hires are internal, confirming that teams do not frequently hire from their own ranks. Second, I will explore the typical tenure of an NBA head coach. This can be visualized using a histogram. #Create histogram plt.histplt.titleplt.figtextplt.annotate', xy=, xytext=, arrowprops=dict, fontsize=9, color='black') plt.gca.spines.set_visibleplt.gca.spines.set_visibleplt.savefigplt.showcoach.sort_values#Calculate some stats with the data import numpy as np print) + " years is the median coaching tenure length.") print.sum/len)*100,1)) + " percent of coaches last five years or less.") print.sum/len*100,1)) + " percent of coaches last a year or less.") Using tenure as an indicator of success, the the data clearly shows that the large majority of coaches are unsuccessful. The median tenure is just 2.5 seasons. 18.1% of coaches last a single season or less, and barely 10% of coaches last more than 5 seasons. This can also be viewed as a survival analysis plot to see the drop-off at various points in time: #Survival analysis import matplotlib.ticker as mtick lst = np.arangesurv = pd.DataFramesurv= np.nan for i in range): surv.iloc=.sum/lenplt.stepplt.titleplt.xlabel') plt.figtextplt.gca.yaxis.set_major_formatter) plt.gca.spines.set_visibleplt.gca.spines.set_visibleplt.savefigplt.show Lastly, a box plot can be generated to see if there are any obvious differences in tenure based on coaching type. Boxplots also display outliers for each group. #Create a boxplot import seaborn as sns sns.boxplotplt.titleplt.gca.spines.set_visibleplt.gca.spines.set_visibleplt.xlabelplt.xticksplt.figtextplt.savefigplt.show There are some differences between the groups. Aside from management hires, previous head coaches have the longest average tenure at 3.3 years. However, since many of the groups have small sample sizes, we need to use more advanced techniques to test if the differences are statistically significant. Statistical Analysis First, to test if either Type or Internal has a statistically significant difference among the group means, we can use ANOVA: #ANOVA import statsmodels.api as sm from statsmodels.formula.api import ols am = ols+ C', data=coach).fitanova_table = sm.stats.anova_lmprintThe results show high p-values and low F-stats—indicating no evidence of statistically significant difference in means. Thus, the initial conclusion is that there is no evidence NBA teams are under-valuing internal candidates or over-valuing previous head coaching experience as initially hypothesized. However, there is a possible distortion when comparing group averages. NBA coaches are signed to contracts that typically run between three and five years. Teams typically have to pay out the remainder of the contract even if coaches are dismissed early for poor performance. A coach that lasts two years may be no worse than one that lasts three or four years—the difference could simply be attributable to the length and terms of the initial contract, which is in turn impacted by the desirability of the coach in the job market. Since coaches with prior experience are highly coveted, they may use that leverage to negotiate longer contracts and/or higher salaries, both of which could deter teams from terminating their employment too early. To account for this possibility, the outcome can be treated as binary rather than continuous. If a coach lasted more than 5 seasons, it is highly likely they completed at least their initial contract term and the team chose to extend or re-sign them. These coaches will be treated as successes, with those having a tenure of five years or less categorized as unsuccessful. To run this analysis, all coaching hires from 2020 and 2021 must be excluded, since they have not yet been able to eclipse 5 seasons. With a binary dependent variable, a logistic regression can be used to test if any of the variables predict coaching success. Internal and Type are both converted to dummy variables. Since previous head coaches represent the most common coaching hires, I set this as the “reference” category against which the others will be measured against. Additionally, the dataset contains just one foreign-hired coachso this observation is dropped from the analysis. #Logistic regression coach3 = coach<2020] coach3.loc= np.wherecoach_type_dummies = pd.get_dummies.astypecoach_type_dummies.dropcoach3 = pd.concat#Drop foreign category / David Blatt since n = 1 coach3 = coach3.dropcoach3 = coach3.loc!= "David Blatt"] print) x = coach3] x = sm.add_constanty = coach3logm = sm.Logitlogm.r = logm.fitprint) #Convert coefficients to odds ratio print) + "is the odds ratio for internal.") #Internal coefficient print) #Management print) #Player print) #Previous AC print) #College Consistent with ANOVA results, none of the variables are statistically significant under any conventional threshold. However, closer examination of the coefficients tells an interesting story. The beta coefficients represent the change in the log-odds of the outcome. Since this is unintuitive to interpret, the coefficients can be converted to an Odds Ratio as follows: Internal has an odds ratio of 0.23—indicating that internal candidates are 77% less likely to be successful compared to external candidates. Management has an odds ratio of 2.725, indicating these candidates are 172.5% more likely to be successful. The odds ratios for players is effectively zero, 0.696 for previous assistant coaches, and 0.5 for college coaches. Since three out of four coaching type dummy variables have an odds ratio under one, this indicates that only management hires were more likely to be successful than previous head coaches. From a practical standpoint, these are large effect sizes. So why are the variables statistically insignificant? The cause is a limited sample size of successful coaches. Out of 202 coaches remaining in the sample, just 23were successful. Regardless of the coach’s background, odds are low they last more than a few seasons. If we look at the one category able to outperform previous head coachesspecifically: # Filter to management manage = coach3== 1] print) printThe filtered dataset contains just 6 hires—of which just oneis classified as a success. In other words, the entire effect was driven by a single successful observation. Thus, it would take a considerably larger sample size to be confident if differences exist. With a p-value of 0.202, the Internal variable comes the closest to statistical significance. Notably, however, the direction of the effect is actually the opposite of what was hypothesized—internal hires are less likely to be successful than external hires. Out of 26 internal hires, just onemet the criteria for success. Conclusion In conclusion, this analysis was able to draw several key conclusions: Regardless of background, being an NBA coach is typically a short-lived job. It’s rare for a coach to last more than a few seasons. The common wisdom that NBA teams strongly prefer to hire previous head coaches holds true. More than half of hires already had NBA head coaching experience. If teams don’t hire an experienced head coach, they’re likely to hire an NBA assistant coach. Hires outside of these two categories are especially uncommon. Though they are frequently hired, there is no evidence to suggest NBA teams overly prioritize previous head coaches. To the contrary, previous head coaches stay in the job longer on average and are more likely to outlast their initial contract term—though neither of these differences are statistically significant. Despite high-profile anecdotes, there is no evidence to suggest that internal hires are more successful than external hires either. Note: All images were created by the author unless otherwise credited. The post What Statistics Can Tell Us About NBA Coaches appeared first on Towards Data Science. #what #statistics #can #tell #about

TOWARDSDATASCIENCE.COM

What Statistics Can Tell Us About NBA Coaches

Who gets hired as an NBA coach? How long does a typical coach last? And does their coaching background play any part in predicting success? This analysis was inspired by several key theories. First, there has been a common criticism among casual NBA fans that teams overly prefer hiring candidates with previous NBA head coaches experience. Consequently, this analysis aims to answer two related questions. First, is it true that NBA teams frequently re-hire candidates with previous head coaching experience? And second, is there any evidence that these candidates under-perform relative to other candidates? The second theory is that internal candidates (though infrequently hired) are often more successful than external candidates. This theory was derived from a pair of anecdotes. Two of the most successful coaches in NBA history, Gregg Popovich of San Antonio and Erik Spoelstra of Miami, were both internal hires. However, rigorous quantitative evidence is needed to test if this relationship holds over a larger sample. This analysis aims to explore these questions, and provide the code to reproduce the analysis in Python. The Data The code (contained in a Jupyter notebook) and dataset for this project are available on Github here. The analysis was performed using Python in Google Colaboratory. A prerequisite to this analysis was determining a way to measure coaching success quantitatively. I decided on a simple idea: the success of a coach would be best measured by the length of their tenure in that job. Tenure best represents the differing expectations that might be placed on a coach. A coach hired to a contending team would be expected to win games and generate deep playoff runs. A coach hired to a rebuilding team might be judged on the development of younger players and their ability to build a strong culture. If a coach meets expectations (whatever those may be), the team will keep them around. Since there was no existing dataset with all of the required data, I collected the data myself from Wikipedia. I recorded every off-season coaching change from 1990 through 2021. Since the primary outcome variable is tenure, in-season coaching changes were excluded since these coaches often carried an “interim” tag—meaning they were intended to be temporary until a permanent replacement could be found. In addition, the following variables were collected: VariableDefinitionTeamThe NBA team the coach was hired forYearThe year the coach was hiredCoachThe name of the coachInternal?An indicator if the coach was internal or not—meaning they worked for the organization in some capacity immediately prior to being hired as head coachTypeThe background of the coach. Categories are Previous HC (prior NBA head coaching experience), Previous AC (prior NBA assistant coaching experience, but no head coaching experience), College (head coach of a college team), Player (a former NBA player with no coaching experience), Management (someone with front office experience but no coaching experience), and Foreign (someone coaching outside of North America with no NBA coaching experience).YearsThe number of years a coach was employed in the role. For coaches fired mid-season, the value was counted as 0.5. First, the dataset is imported from its location in Google Drive. I also convert ‘Internal?’ into a dummy variable, replacing “Yes” with 1 and “No” with 0. from google.colab import drive drive.mount('/content/drive') import pandas as pd pd.set_option('display.max_columns', None) #Bring in the dataset coach = pd.read_csv('/content/drive/MyDrive/Python_Files/Coaches.csv', on_bad_lines = 'skip').iloc[:,0:6] coach['Internal'] = coach['Internal?'].map(dict(Yes=1, No=0)) coach This prints a preview of what the dataset looks like: In total, the dataset contains 221 coaching hires over this time. Descriptive Statistics First, basic summary Statistics are calculated and visualized to determine the backgrounds of NBA head coaches. #Create chart of coaching background import matplotlib.pyplot as plt #Count number of coaches per category counts = coach['Type'].value_counts() #Create chart plt.bar(counts.index, counts.values, color = 'blue', edgecolor = 'black') plt.title('Where Do NBA Coaches Come From?') plt.figtext(0.76, -0.1, "Made by Brayden Gerrard", ha="center") plt.xticks(rotation = 45) plt.ylabel('Number of Coaches') plt.gca().spines['top'].set_visible(False) plt.gca().spines['right'].set_visible(False) for i, value in enumerate(counts.values): plt.text(i, value + 1, str(round((value/sum(counts.values))*100,1)) + '%' + ' (' + str(value) + ')', ha='center', fontsize=9) plt.savefig('coachtype.png', bbox_inches = 'tight') print(str(round(((coach['Internal'] == 1).sum()/len(coach))*100,1)) + " percent of coaches are internal.") Over half of coaching hires previously served as an NBA head coach, and nearly 90% had NBA coaching experience of some kind. This answers the first question posed—NBA teams show a strong preference for experienced head coaches. If you get hired once as an NBA coach, your odds of being hired again are much higher. Additionally, 13.6% of hires are internal, confirming that teams do not frequently hire from their own ranks. Second, I will explore the typical tenure of an NBA head coach. This can be visualized using a histogram. #Create histogram plt.hist(coach['Years'], bins =12, edgecolor = 'black', color = 'blue') plt.title('Distribution of Coaching Tenure') plt.figtext(0.76, 0, "Made by Brayden Gerrard", ha="center") plt.annotate('Erik Spoelstra (MIA)', xy=(16.4, 2), xytext=(14 + 1, 15), arrowprops=dict(facecolor='black', shrink=0.1), fontsize=9, color='black') plt.gca().spines['top'].set_visible(False) plt.gca().spines['right'].set_visible(False) plt.savefig('tenurehist.png', bbox_inches = 'tight') plt.show() coach.sort_values('Years', ascending = False) #Calculate some stats with the data import numpy as np print(str(np.median(coach['Years'])) + " years is the median coaching tenure length.") print(str(round(((coach['Years'] <= 5).sum()/len(coach))*100,1)) + " percent of coaches last five years or less.") print(str(round((coach['Years'] <= 1).sum()/len(coach)*100,1)) + " percent of coaches last a year or less.") Using tenure as an indicator of success, the the data clearly shows that the large majority of coaches are unsuccessful. The median tenure is just 2.5 seasons. 18.1% of coaches last a single season or less, and barely 10% of coaches last more than 5 seasons. This can also be viewed as a survival analysis plot to see the drop-off at various points in time: #Survival analysis import matplotlib.ticker as mtick lst = np.arange(0,18,0.5) surv = pd.DataFrame(lst, columns = ['Period']) surv['Number'] = np.nan for i in range(0,len(surv)): surv.iloc[i,1] = (coach['Years'] >= surv.iloc[i,0]).sum()/len(coach) plt.step(surv['Period'],surv['Number']) plt.title('NBA Coach Survival Rate') plt.xlabel('Coaching Tenure (Years)') plt.figtext(0.76, -0.05, "Made by Brayden Gerrard", ha="center") plt.gca().yaxis.set_major_formatter(mtick.PercentFormatter(1)) plt.gca().spines['top'].set_visible(False) plt.gca().spines['right'].set_visible(False) plt.savefig('coachsurvival.png', bbox_inches = 'tight') plt.show Lastly, a box plot can be generated to see if there are any obvious differences in tenure based on coaching type. Boxplots also display outliers for each group. #Create a boxplot import seaborn as sns sns.boxplot(data=coach, x='Type', y='Years') plt.title('Coaching Tenure by Coach Type') plt.gca().spines['top'].set_visible(False) plt.gca().spines['right'].set_visible(False) plt.xlabel('') plt.xticks(rotation = 30, ha = 'right') plt.figtext(0.76, -0.1, "Made by Brayden Gerrard", ha="center") plt.savefig('coachtypeboxplot.png', bbox_inches = 'tight') plt.show There are some differences between the groups. Aside from management hires (which have a sample of just six), previous head coaches have the longest average tenure at 3.3 years. However, since many of the groups have small sample sizes, we need to use more advanced techniques to test if the differences are statistically significant. Statistical Analysis First, to test if either Type or Internal has a statistically significant difference among the group means, we can use ANOVA: #ANOVA import statsmodels.api as sm from statsmodels.formula.api import ols am = ols('Years ~ C(Type) + C(Internal)', data=coach).fit() anova_table = sm.stats.anova_lm(am, typ=2) print(anova_table) The results show high p-values and low F-stats—indicating no evidence of statistically significant difference in means. Thus, the initial conclusion is that there is no evidence NBA teams are under-valuing internal candidates or over-valuing previous head coaching experience as initially hypothesized. However, there is a possible distortion when comparing group averages. NBA coaches are signed to contracts that typically run between three and five years. Teams typically have to pay out the remainder of the contract even if coaches are dismissed early for poor performance. A coach that lasts two years may be no worse than one that lasts three or four years—the difference could simply be attributable to the length and terms of the initial contract, which is in turn impacted by the desirability of the coach in the job market. Since coaches with prior experience are highly coveted, they may use that leverage to negotiate longer contracts and/or higher salaries, both of which could deter teams from terminating their employment too early. To account for this possibility, the outcome can be treated as binary rather than continuous. If a coach lasted more than 5 seasons, it is highly likely they completed at least their initial contract term and the team chose to extend or re-sign them. These coaches will be treated as successes, with those having a tenure of five years or less categorized as unsuccessful. To run this analysis, all coaching hires from 2020 and 2021 must be excluded, since they have not yet been able to eclipse 5 seasons. With a binary dependent variable, a logistic regression can be used to test if any of the variables predict coaching success. Internal and Type are both converted to dummy variables. Since previous head coaches represent the most common coaching hires, I set this as the “reference” category against which the others will be measured against. Additionally, the dataset contains just one foreign-hired coach (David Blatt) so this observation is dropped from the analysis. #Logistic regression coach3 = coach[coach['Year']<2020] coach3.loc[:, 'Success'] = np.where(coach3['Years'] > 5, 1, 0) coach_type_dummies = pd.get_dummies(coach3['Type'], prefix = 'Type').astype(int) coach_type_dummies.drop(columns=['Type_Previous HC'], inplace=True) coach3 = pd.concat([coach3, coach_type_dummies], axis = 1) #Drop foreign category / David Blatt since n = 1 coach3 = coach3.drop(columns=['Type_Foreign']) coach3 = coach3.loc[coach3['Coach'] != "David Blatt"] print(coach3['Success'].value_counts()) x = coach3[['Internal','Type_Management','Type_Player','Type_Previous AC', 'Type_College']] x = sm.add_constant(x) y = coach3['Success'] logm = sm.Logit(y,x) logm.r = logm.fit(maxiter=1000) print(logm.r.summary()) #Convert coefficients to odds ratio print(str(np.exp(-1.4715)) + "is the odds ratio for internal.") #Internal coefficient print(np.exp(1.0025)) #Management print(np.exp(-39.6956)) #Player print(np.exp(-0.3626)) #Previous AC print(np.exp(-0.6901)) #College Consistent with ANOVA results, none of the variables are statistically significant under any conventional threshold. However, closer examination of the coefficients tells an interesting story. The beta coefficients represent the change in the log-odds of the outcome. Since this is unintuitive to interpret, the coefficients can be converted to an Odds Ratio as follows: Internal has an odds ratio of 0.23—indicating that internal candidates are 77% less likely to be successful compared to external candidates. Management has an odds ratio of 2.725, indicating these candidates are 172.5% more likely to be successful. The odds ratios for players is effectively zero, 0.696 for previous assistant coaches, and 0.5 for college coaches. Since three out of four coaching type dummy variables have an odds ratio under one, this indicates that only management hires were more likely to be successful than previous head coaches. From a practical standpoint, these are large effect sizes. So why are the variables statistically insignificant? The cause is a limited sample size of successful coaches. Out of 202 coaches remaining in the sample, just 23 (11.4%) were successful. Regardless of the coach’s background, odds are low they last more than a few seasons. If we look at the one category able to outperform previous head coaches (management hires) specifically: # Filter to management manage = coach3[coach3['Type_Management'] == 1] print(manage['Success'].value_counts()) print(manage) The filtered dataset contains just 6 hires—of which just one (Steve Kerr with Golden State) is classified as a success. In other words, the entire effect was driven by a single successful observation. Thus, it would take a considerably larger sample size to be confident if differences exist. With a p-value of 0.202, the Internal variable comes the closest to statistical significance (though it still falls well short of a typical alpha of 0.05). Notably, however, the direction of the effect is actually the opposite of what was hypothesized—internal hires are less likely to be successful than external hires. Out of 26 internal hires, just one (Erik Spoelstra of Miami) met the criteria for success. Conclusion In conclusion, this analysis was able to draw several key conclusions: Regardless of background, being an NBA coach is typically a short-lived job. It’s rare for a coach to last more than a few seasons. The common wisdom that NBA teams strongly prefer to hire previous head coaches holds true. More than half of hires already had NBA head coaching experience. If teams don’t hire an experienced head coach, they’re likely to hire an NBA assistant coach. Hires outside of these two categories are especially uncommon. Though they are frequently hired, there is no evidence to suggest NBA teams overly prioritize previous head coaches. To the contrary, previous head coaches stay in the job longer on average and are more likely to outlast their initial contract term—though neither of these differences are statistically significant. Despite high-profile anecdotes, there is no evidence to suggest that internal hires are more successful than external hires either. Note: All images were created by the author unless otherwise credited. The post What Statistics Can Tell Us About NBA Coaches appeared first on Towards Data Science.

·117 مشاهدة

انضم إلينا

اللغات

What Statistics Can Tell Us About NBA Coaches