Lessons in Decision Making from the Monty Hall Problem
The Monty Hall Problem is a well-known brain teaser from which we can learn important lessons in Decision Making that are useful in general and in particular for data scientists.
If you are not familiar with this problem, prepare to be perplexed. If you are, I hope to shine a light on aspects that you might not have considered.
I introduce the problem and solve it with three types of intuition:
Common — The heart of this post focuses on applying our common sense to solve this problem. We’ll explore why it fails us and what we can do to intuitively overcome this to make the solution crystal clear. We’ll do this by using visuals, qualitative arguments and some basic probabilities.
Bayesian — We will briefly discuss the importance of belief propagation.
Causal — We will use a Graph Model to visualise the conditions required to use the Monty Hall problem in real-world settings. Spoiler alert: I haven’t been convinced that there are any, but the thought process is very useful.
I summarise by discussing lessons learnt for better data decision making.
In regards to the Bayesian and Causal intuitions, these will be presented in a gentle form. For the mathematically inclined I also provide supplementary sections with short deep dives into each approach after the summary. By examining different aspects of this puzzle in probability you will hopefully be able to improve your data decision making.
Credit: Wikipedia
First, some history. Let’s Make a Deal is a US television game show that originated in 1963. As its premise, audience participants were considered traders making deals with the host, Monty Hall.
At the heart of the matter is an apparently simple scenario:
A trader is posed with the question of choosing one of three doors for the opportunity to win a luxurious prize, e.g., a car. Behind the other two are goats.
The trader is shown three closed doors.
The trader chooses one of the doors. Let’s call this door A and mark it.
Keeping the chosen door closed, the host reveals one of the remaining doors showing a goat.
In our example the trader has chosen door A and the host reveals door C, showing a goat.
The host then asks the trader if they would like to stick with their first choice or switch to the other remaining one.
If the trader guesses correctly they win the prize. If not, they’ll be shown another goat.
What is the probability of being Zonked? Credit: Wikipedia
Should the trader stick with their original choice of door A or switch to B?
Before reading further, give it a go. What would you do?
Most people are likely to have a gut intuition that “it doesn’t matter”, arguing that in the first instance each door had a ⅓ chance of hiding the prize, and that after the host intervention, when only two doors remain closed, the chance of winning the prize is 50:50.
There are various ways of explaining why the coin-toss intuition is incorrect. Most of these involve maths equations or simulations. While we will address these later, here we’ll attempt to solve the problem by applying Occam’s razor:
A principle that states that simpler explanations are preferable to more complex ones — William of Ockham
To do this it is instructive to slightly redefine the problem to a large number of doors N instead of the original three.
The Large N-Door Problem
Similar to before: you have to choose one of many doors. For illustration let’s say N=100. Behind one of the doors there is the prize and behind the other 99 are goats.
The 100 Door Monty Hall problem before the host intervention.
You choose one door and the host reveals 98 of the other doors that have goats, leaving yours and one more closed.
The 100 Door Monty Hall Problem after the host intervention. Should you stick with your door or make the switch?
Should you stick with your original choice or make the switch?
I think you’ll agree with me that the remaining door, not chosen by you, is much more likely to conceal the prize … so you should definitely make the switch!
It’s illustrative to compare both scenarios discussed so far. In the next figure we compare the post-host-intervention setting for the N=3 setup and that of N=100:
Post-intervention settings for the N=3 setup and N=100.
In both cases we see two shut doors, one of which we’ve chosen. The main difference between these scenarios is that in the first we see one goat and in the second there are more than the eye would care to see.
Why do most people consider the first case a “50:50” toss-up, while in the second it’s obvious to make the switch?
We’ll soon address this question of why. First let’s put probabilities of success behind the different scenarios.
What’s The Frequency, Kenneth?
So far we have learnt from the N=100 scenario that switching doors is obviously beneficial. Inferring this for N=3 may be a leap of faith for most. Using some basic probability arguments, here we’ll quantify why it is favourable to make the switch for any N-door scenario.
We start with the standard Monty Hall problem. When it starts, the probability of the prize being behind each of the doors A, B and C is ⅓. To be explicit, let’s define the parameter Y to be the door with the prize, i.e., p(Y=A) = p(Y=B) = p(Y=C) = ⅓.
The trick to solving this problem is that once the trader’s door A has been chosen, we should pay close attention to the set of the other doors {B,C}, which has the probability of p(Y=B or Y=C) = p(Y=B) + p(Y=C) = ⅔. This visual may help make sense of this:
By being attentive to the set {B,C} the rest should follow. When the goat is revealed
it is apparent that the probabilities post intervention change. Note that for ease of reading I’ll drop the Y notation, so p(Y=A) will read p(A) and p(Y=B) will read p(B). Also, for completeness, the full terms after the intervention should be even longer due to them being conditional, e.g., p(A|Z=C), p(B|Z=C), where Z is a parameter representing the choice of the host. After the reveal:
p(A) remains ⅓
p({B,C}) = p(B) + p(C) remains ⅔,
p(C) = 0; we just learnt that the goat is behind door C, not the prize.
p(B) = p({B,C}) - p(C) = ⅔
For anyone with the information provided by the host, this means that it isn’t a toss of a fair coin! The fact that p(C) became zero does not “raise all other boats” (i.e., lift p(A) and p(B) to ½ each), but rather p(A) remains the same and p(B) gets doubled.
The bottom line is that the trader should consider p(A) = ⅓ and p(B) = ⅔, hence by switching they are doubling the odds of winning!
Let’s generalise to N.
When we start, all doors have odds of hiding the prize p(Dᵢ) = 1/N. After the trader chooses one door, which we’ll call D₁, meaning p(D₁) = 1/N, we should now pay attention to the remaining set of doors {D₂, …, Dₙ}, which has a chance of p({D₂, …, Dₙ}) = (N-1)/N.
When the host reveals the N-2 doors {D₃, …, Dₙ} with goats:
p(D₁) remains 1/N
p({D₂, …, Dₙ}) = p(D₂) + p(D₃) + … + p(Dₙ) remains (N-1)/N
p(D₃) = p(D₄) = … = p(Dₙ) = 0; we just learnt that they have goats, not the prize.
p(D₂) = p({D₂, …, Dₙ}) - p(D₃) - … - p(Dₙ) = (N-1)/N
The trader should now consider two door values: p(D₁) = 1/N and p(D₂) = (N-1)/N.
Hence the odds of winning improve by a factor of N-1! In the case of N=100, this means a factor of 99.
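If you’d rather see this confirmed numerically, here is a minimal Monte Carlo sketch (my own addition, not part of the original argument) that plays the game many times for an arbitrary N. Under the standard rules the host opens every other goat door, so switching wins exactly when the initial pick was wrong:

```python
import random

def win_rate(n_doors: int, switch: bool, trials: int = 100_000) -> float:
    """Estimate the probability of winning an N-door Monty Hall game."""
    wins = 0
    for _ in range(trials):
        prize = random.randrange(n_doors)   # door hiding the prize
        choice = random.randrange(n_doors)  # trader's initial pick
        if switch:
            # Switching wins exactly when the initial choice was wrong,
            # because the host has removed every other goat door.
            wins += prize != choice
        else:
            wins += prize == choice
    return wins / trials

for n in (3, 100):
    print(n, "stay:", win_rate(n, switch=False), "switch:", win_rate(n, switch=True))
# Expect stay ≈ 1/N and switch ≈ (N-1)/N, e.g. ≈ 0.33/0.67 for N=3 and ≈ 0.01/0.99 for N=100.
```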
The improvement in the odds for all scenarios from N=3 to N=100 may be seen in the following graph. The thin line is the probability of winning by choosing any door prior to the intervention, p(Dᵢ) = 1/N. Note that it also represents the chance of winning after the intervention if the trader decides to stick to their guns and not switch, p(D₁). The thick line is the probability of winning the prize after the intervention if the door is switched, p(D₂) = (N-1)/N:
Probability of winning as a function of N. p(D₁) = 1/N is the thin line; p(D₂) = (N-1)/N is the thick one. Perhaps the most interesting aspect of this graph is that the N=3 case has the highest probability before the host intervention, but the lowest probability after, and vice versa for N=100.
Another interesting feature is the quick climb in the probability of winning for the switchers:
N=3: p(D₂)=67%
N=4: p(D₂)=75%
N=5: p(D₂)=80%
The switchers’ curve gradually approaches an asymptote of 100%: at N=99 it is 98.99% and at N=100 it equals 99%.
This starts to address an interesting question:
Why Is Switching Obvious For Large N But Not N=3?
The answer lies in the fact that this puzzle is slightly ambiguous. Only the highly attentive realise that by revealing the goat the host is actually conveying a lot of information that should be incorporated into one’s calculation. Later we discuss the difference between doing this calculation in one’s mind based on intuition and slowing down by putting pen to paper or coding up the problem.
How much information is conveyed by the host by intervening?
A hand-wavy explanation is that this information may be visualised as the gap between the lines in the graph above. For N=3 we saw that the odds of winning doubled, but that doesn’t register as strongly with our common-sense intuition as the factor of 99 in the N=100 case.
I have also considered describing stronger arguments from Information Theory that provide useful vocabulary to express communication of information. However, I feel that this fascinating field deserves a post of its own, which I’ve published.
The main takeaway for the Monty Hall problem is that I have calculated the information gain to be a logarithmic function of the number of doors c using this formula:
Information gain due to the intervention of the host for a setup with c doors. Full details in my article on entropy.
For the c=3 door case, e.g., the information gain is ⅔ bits. Full details are in this article on entropy.
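The formula itself is in the figure above and the linked article; as a hedged sketch of where the ⅔ bits come from, one can measure the drop in the Shannon entropy of the prize location caused by the host’s reveal, assuming a uniform prior over c doors and a host who always opens c-2 goat doors:

```python
import numpy as np

def information_gain(c: int) -> float:
    """Entropy (bits) of the prize location before the host's reveal
    minus the entropy afterwards."""
    h_before = np.log2(c)                      # uniform prior over c doors
    p_stay, p_switch = 1 / c, (c - 1) / c      # posterior over the two remaining doors
    h_after = -(p_stay * np.log2(p_stay) + p_switch * np.log2(p_switch))
    return h_before - h_after

print(information_gain(3))    # ≈ 0.667 bits, the ⅔ bits quoted above
print(information_gain(100))  # ≈ 6.56 bits
```

Under these assumptions the expression simplifies to ((c-1)/c)·log₂(c-1), which is indeed logarithmic in the number of doors.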
To summarise this section, we used basic probability arguments to quantify the probabilities of winning the prize, showing the benefit of switching for all N-door scenarios. For those interested in more formal solutions using Bayesian statistics and Causality, I provide supplementary sections at the bottom.
In the next three final sections we’ll discuss how this problem was received by the general public back in the 1990s, discuss lessons learnt and then summarise how we can apply them in real-world settings.
Being Confused Is OK
“No, that is impossible, it should make no difference.” — Paul Erdős
If you still don’t feel comfortable with the solution of the N=3 Monty Hall problem, don’t worry, you are in good company! According to Vazsonyi¹, even Paul Erdős, who is considered one of the greatest experts in probability theory, was confounded until computer simulations were demonstrated to him.
When the original solution by Steve Selvin² was popularised by Marilyn vos Savant in her column “Ask Marilyn” in Parade magazine in 1990, many readers wrote that Selvin and Savant were wrong³. According to Tierney’s 1991 article in the New York Times, this included about 10,000 readers, including nearly 1,000 with Ph.D. degrees⁴.
On a personal note, over a decade ago I was exposed to the standard N=3 problem and since then managed to forget the solution numerous times. When I learnt about the large N approach I was quite excited about how intuitive it was. I then failed to explain it to my technical manager over lunch, so this is an attempt to compensate. I still have the same day job .
While researching this piece I realised that there is a lot to learn in terms of decision making in general and in particular useful for data science.
Lessons Learnt From the Monty Hall Problem
In his book Thinking, Fast and Slow⁵, the late Daniel Kahneman, a co-creator of Behavioural Economics, suggested that we have two types of thought processes:
System 1 — fast thinking: based on intuition. This helps us react fast and with confidence to familiar situations.
System 2 — slow thinking: based on deep thought. This helps figure out new complex situations that life throws at us.
Assuming this premise, you might have noticed that in the above you were applying both.
By examining the visual of N=100 doors your System 1 kicked in and you immediately knew the answer. I’m guessing that in the N=3 case you were straddling Systems 1 and 2. Considering that you had to stop and think a bit when going through the probabilities exercise, that was definitely System 2.
The decision maker’s struggle between System 1 and System 2 . Generated using Gemini Imagen 3
Beyond the fast and slow thinking I feel that there are a lot of data decision making lessons that may be learnt.
Assessing probabilities can be counter-intuitive …
or
Be comfortable with shifting to deep thought
We’ve clearly shown that in the N=3 case. As previously mentioned it confounded many people including prominent statisticians.
Another classic example is The Birthday Paradox , which shows how we underestimate the likelihood of coincidences. In this problem most people would think that one needs a large group of people until they find a pair sharing the same birthday. It turns out that all you need is 23 to have a 50% chance. And 70 for a 99.9% chance.
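These numbers are easy to verify; here is a minimal sketch (my own addition) using the standard assumption of 365 equally likely birthdays:

```python
import math

def p_shared_birthday(n: int) -> float:
    """Probability that at least two of n people share a birthday."""
    p_all_distinct = math.prod((365 - k) / 365 for k in range(n))
    return 1 - p_all_distinct

print(round(p_shared_birthday(23), 3))  # ≈ 0.507
print(round(p_shared_birthday(70), 4))  # ≈ 0.9992
```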
One of the most confusing paradoxes in the realm of data analysis is Simpson’s, which I detailed in a previous article. This is a situation where trends of a population may be reversed in its subpopulations.
What all these paradoxes have in common is that they require us to get comfortable shifting gears from System 1 fast thinking to System 2 slow thinking. This is also the common theme for the lessons outlined below.
A few more classical examples are the Gambler’s Fallacy, the Base Rate Fallacy and the Linda Problem. These are beyond the scope of this article, but I highly recommend looking them up to further sharpen ways of thinking about data.
… especially when dealing with ambiguity
or
Search for clarity in ambiguity
Let’s reread the problem, this time as stated in “Ask Marilyn”:
Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say №1, and the host, who knows what’s behind the doors, opens another door, say №3, which has a goat. He then says to you, “Do you want to pick door №2?” Is it to your advantage to switch your choice?
We discussed that the most important piece of information is not made explicit. It says that the host “knows what’s behind the doors”, but it does not spell out how they use that knowledge; it is only implicitly understood that the host will never open the door with the car.
Many real-life problems in data science involve dealing with ambiguous demands as well as ambiguity in the data provided by stakeholders.
It is crucial for the researcher to track down any relevant piece of information that is likely to have an impact and update it into the solution. Statisticians refer to this as “belief updating”.
With new information we should update our beliefs
This is the main aspect separating the Bayesian stream of thought from the Frequentist. The Frequentist approach takes data at face value. The Bayesian approach incorporates prior beliefs and updates them when new findings are introduced. This is especially useful when dealing with ambiguous situations.
To drive this point home, let’s re-examine this figure comparing the post-intervention N=3 setup and the N=100 one.
Copied from above. Post-intervention settings for the N=3 setup and N=100.
In both cases we had a prior belief that all doors had an equal chance of hiding the prize, p(Dᵢ) = 1/N.
Once the host opened doors, a lot of valuable information was revealed, and in the case of N=100 this was much more apparent than for N=3.
In the Frequentist approach, however, most of this information would be ignored, as it focuses only on the two closed doors. The Frequentist conclusion, hence, is a 50% chance of winning the prize regardless of what else is known about the situation. The Frequentist thus takes Paul Erdős’ “no difference” point of view, which we now know to be incorrect.
This would be reasonable if all that was presented were the two doors, and not the intervention and the goats. However, if that information is presented, one should shift gears into System 2 thinking and update their beliefs about the system. This is what we have done by focusing not only on the shut door, but rather considering what was learnt about the system at large.
For the brave-hearted, in a supplementary section below called The Bayesian Point of View, I solve the Monty Hall problem using the Bayesian formalism.
Be one with subjectivity
The Frequentists’ main reservation about “going Bayes” is that “statistics should be objective”.
The Bayesian response is that Frequentists also apply a prior without realising it — a flat one.
Regardless of the Bayesian/Frequentist debate, as researchers we try our best to be as objective as possible in every step of the analysis.
That said, it is inevitable that subjective decisions are made throughout.
E.g., in a skewed distribution, should one quote the mean or the median? It highly depends on the context, and hence a subjective decision needs to be made.
The responsibility of the analyst is to provide justification for their choices first to convince themselves and then their stakeholders.
When confused — look for a useful analogy
… but tread with caution
We saw that by going from the N=3 setup to the N=100 the solution was apparent. This is a trick scientists frequently use — if the problem appears at first a bit too confusing/overwhelming, break it down and try to find a useful analogy.
It is probably not a perfect comparison, but going from the N=3 setup to N=100 is like examining a picture from up close and zooming out to see the big picture. Think of having only a puzzle piece and then glancing at the jigsaw photo on the box.
Monty Hall in 1976. Credit: Wikipedia and using Visual Paradigm Online for the puzzle effect
Note: whereas analogies may be powerful, one should use them with caution and not oversimplify. Physicists refer to this situation as the spherical cow, where models may oversimplify complex phenomena.
I admit that even with years of experience in applied statistics, at times I still get confused about which method to apply. A large part of my thought process is identifying analogies to known solved problems. Sometimes after making progress in a direction I will realise that my assumptions were wrong and seek a new direction. I used to quip with colleagues that they shouldn’t trust me before my third attempt …
Simulations are powerful but not always necessary
It’s interesting to learn that Paul Erdős and other mathematicians were convinced only after seeing simulations of the problem.
I am two-minded about usage of simulations when it comes to problem solving.
On the one hand, simulations are powerful tools to analyse complex and intractable problems, especially with real-life data, where one wants a grasp not only of the underlying formulation but also of its stochasticity.
And here is the big BUT — if a problem can be analytically solved, like the Monty Hall one, simulations, as fun as they may be, may not be necessary.
According to Occam’s razor, all that is required is a brief intuition to explain the phenomenon. This is what I attempted to do here by applying common sense and some basic probability reasoning. For those who enjoy deep dives, I provide below supplementary sections with two methods for analytical solutions — one using Bayesian statistics and another using Causality.
After publishing the first version of this article there was a comment that Savant’s solution³ may be simpler than those presented here. I revisited her communications and agreed that it should be added. In the process I realised three more lessons may be learnt.
A well designed visual goes a long way
Continuing the principle of Occam’s razor, Savant explained³ quite convincingly in my opinion:
You should switch. The first door has a 1/3 chance of winning, but the second door has a 2/3 chance. Here’s a good way to visualize what happened. Suppose there are a million doors, and you pick door #1. Then the host, who knows what’s behind the doors and will always avoid the one with the prize, opens them all except door #777,777. You’d switch to that door pretty fast, wouldn’t you?
Hence she provided an abstract visual for the readers. I attempted to do the same with the 100-door figures.
Marilyn vos Savant who popularised the Monty Hall Problem. Credit: Ben David on Flickr under license
As mentioned, many readers, especially those with backgrounds in maths and statistics, still weren’t convinced.
She revised³ with another mental image:
The benefits of switching are readily proven by playing through the six games that exhaust all the possibilities. For the first three games, you choose #1 and “switch” each time, for the second three games, you choose #1 and “stay” each time, and the host always opens a loser. Here are the results.
She added a table with all the scenarios. I took some artistic liberty and created the following figure. As indicated, the top batch shows the scenarios in which the trader switches and the bottom those in which they stay. Lines in green are games which the trader wins, and in red those where they get zonked. The marked door is the one chosen by the trader, and Monty Hall then opens a different door that has a goat behind it.
Adaptation of Savant’s table³ of six scenarios that shows the solution to the Monty Hall Problem
We clearly see from this diagram that the switcher has a ⅔ chance of winning and those that stay only ⅓.
This is yet another elegant visualisation that clearly explains the non-intuitive.
It strengthens the claim that there is no real need for simulations in this case because all they would be doing is rerunning these six scenarios.
One more popular solution is decision-tree illustrations. You can find these on the Wikipedia page, but I find them a bit redundant next to Savant’s table.
The fact that we can solve this problem in so many ways yields another lesson:
There are many ways to skin a … problem
One of the many lessons that I have learnt from the writings of the late Richard Feynman, one of the best communicators of physics and ideas, is that a problem can be solved in many ways. Mathematicians and physicists do this all the time.
A relevant quote that paraphrases Occam’s razor:
If you can’t explain it simply, you don’t understand it well enough — attributed to Albert Einstein
And finally
Embrace ignorance and be humble
“You are utterly incorrect … How many irate mathematicians are needed to get you to change your mind?” — Ph.D from Georgetown University
“May I suggest that you obtain and refer to a standard textbook on probability before you try to answer a question of this type again?” — Ph.D from University of Florida
“You’re in error, but Albert Einstein earned a dearer place in the hearts of people after he admitted his errors.” — Ph.D. from University of Michigan
Ouch!
These are some of the responses from mathematicians to the Parade article.
Such unnecessary viciousness.
You can check the reference³ to see the writers’ names and others like it. To whet your appetite: “You blew it, and you blew it big!”, “You made a mistake, but look at the positive side. If all those Ph.D.’s were wrong, the country would be in some very serious trouble.”, “I am in shock that after being corrected by at least three mathematicians, you still do not see your mistake.”.
And as expected from the 1990s perhaps the most embarrassing one was from a resident of Oregon:
“Maybe women look at math problems differently than men.”
These make me cringe and feel embarrassed to be associated by gender and Ph.D. title with these graduates and professors.
Hopefully in the 2020s most people are more humble about their ignorance. Yuval Noah Harari discusses the fact that the Scientific Revolution of Galileo Galilei et al. was not due to knowledge but rather admittance of ignorance.
“The great discovery that launched the Scientific Revolution was the discovery that humans do not know the answers to their most important questions” — Yuval Noah Harari
Fortunately for mathematicians’ image, there were also quite a lot of more enlightened comments. I like this one from Seth Kalson, Ph.D., of MIT:
You are indeed correct. My colleagues at work had a ball with this problem, and I dare say that most of them, including me at first, thought you were wrong!
We’ll summarise by examining how, and if, the Monty Hall problem may be applied in real-world settings, so you can try to relate to projects that you are working on.
Application in Real World Settings
While researching this article I found that, beyond artificial setups for entertainment⁶ ⁷, there aren’t practical settings in which to use this problem as an analogy. Of course, I may be wrong⁸ and would be glad to hear if you know of one.
One way of assessing the viability of an analogy is using arguments from causality which provides vocabulary that cannot be expressed with standard statistics.
In a previous post I discussed the fact that the story behind the data is as important as the data itself. In particular Causal Graph Models visualise the story behind the data, which we will use as a framework for a reasonable analogy.
For the Monty Hall problem we can build a Causal Graph Model like this:
Reading:
The door chosen by the trader (X) is independent of the door with the prize (Y) and vice versa. As important, there is no common cause between them that might generate a spurious correlation.
The host’s choice (Z) depends on both X and Y.
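In plain notation (my own sketch, using the X, Y, Z variable names defined in the supplements below), the graph is a simple collider:
X (trader’s choice) → Z (host’s choice) ← Y (prize door)
It is only once we condition on the host’s choice Z that the trader’s choice X and the prize door Y become informative about each other, which is exactly what the switcher exploits.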
By comparing causal graphs of two systems one can get a sense for how analogous both are. A perfect analogy would require more details, but this is beyond the scope of this article. Briefly, one would want to ensure similar functions between the parameters.
Those interested in learning further details about using Causal Graphs Models to assess causality in real world problems may be interested in this article.
Anecdotally, it is also worth mentioning that on Let’s Make a Deal, Monty himself admitted years later to playing mind games with the contestants and that he did not always follow the rules, e.g., not always doing the intervention, as “it all depends on his mood”⁴.
In our setup we assumed perfect conditions, i.e., a host that does not deviate from the script and/or play on the trader’s emotions. Taking this into consideration would require updating the Graphical Model above, which is beyond the scope of this article.
Some might be disheartened to realise at this stage of the post that there might not be real-world applications for this problem.
I argue that there definitely are real-world applications for the lessons learnt from the Monty Hall problem.
Just to summarise them again:
Assessing probabilities can be counter-intuitive … especially when dealing with ambiguity
With new information we should update our beliefs
Be one with subjectivity
When confused — look for a useful analogy … but tread with caution
Simulations are powerful but not always necessary
A well designed visual goes a long way
There are many ways to skin a … problem
Embrace ignorance and be humble
While the Monty Hall Problem might seem like a simple puzzle, it offers valuable insights into decision-making, particularly for data scientists. The problem highlights the importance of going beyond intuition and embracing a more analytical, data-driven approach. By understanding the principles of Bayesian thinking and updating our beliefs based on new information, we can make more informed decisions in many aspects of our lives, including data science. The Monty Hall Problem serves as a reminder that even seemingly straightforward scenarios can contain hidden complexities and that by carefully examining available information, we can uncover hidden truths and make better decisions.
At the bottom of the article I provide a list of resources that I found useful to learn about this topic.
Credit: Wikipedia
Loved this post? Join me on LinkedIn or Buy me a coffee!
Credits
Unless otherwise noted, all images were created by the author.
Many thanks to Jim Parr, Will Reynolds, and Betty Kazin for their useful comments.
In the following supplementary sections I derive solutions to the Monty Hall problem from two perspectives:
Bayesian
Causal
Both are motivated by questions in the textbook Causal Inference in Statistics: A Primer by Judea Pearl, Madelyn Glymour, and Nicholas P. Jewell.
Supplement 1: The Bayesian Point of View
This section assumes a basic understanding of Bayes’ Theorem, in particular being comfortable with conditional probabilities. In other words, if this makes sense: P(A|B) = P(B|A) P(A) / P(B).
We set out to use Bayes’ theorem to prove that switching doors improves chances in the N=3 Monty Hall Problem.
We define
X — the chosen door
Y — the door with the prize
Z — the door opened by the host
Labelling the doors as A, B and C, without loss of generality, we need to solve for:
P(Y=B | X=A, Z=C) vs. P(Y=A | X=A, Z=C)
Using Bayes’ theorem we equate the left side as
P(Y=B | X=A, Z=C) = P(Z=C | X=A, Y=B) P(Y=B | X=A) / P(Z=C | X=A)
and the right one as:
P(Y=A | X=A, Z=C) = P(Z=C | X=A, Y=A) P(Y=A | X=A) / P(Z=C | X=A)
Most components are equal: the prize placement is independent of the trader’s choice, so P(Y=A | X=A) = P(Y=B | X=A) = ⅓, and the denominator P(Z=C | X=A) is shared. So we are left to prove that:
P(Z=C | X=A, Y=B) > P(Z=C | X=A, Y=A)
In the case where Y=B, the host has only one choice (door C), making P(Z=C | X=A, Y=B) = 1.
In the case where Y=A, the host has two choices (doors B or C), making P(Z=C | X=A, Y=A) = 1/2.
From here:
P(Y=B | X=A, Z=C) = (1 × ⅓) / (½) = ⅔ and P(Y=A | X=A, Z=C) = (½ × ⅓) / (½) = ⅓, where P(Z=C | X=A) = ½.
Quod erat demonstrandum.
Note: if the “host choices” argument didn’t make sense, look at the table below showing this explicitly. You will want to compare the entries {X=A, Y=B, Z=C} and {X=A, Y=A, Z=C}.
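If you prefer to see the numbers, here is a minimal enumeration sketch (my own addition) that conditions on X=A and Z=C and recovers the ⅔ vs. ⅓ split:

```python
doors = ["A", "B", "C"]

joint = {}  # (Y, Z) -> probability, everything conditioned on X = "A"
for y in doors:
    legal_hosts = [d for d in doors if d not in {"A", y}]  # doors the host may open
    for z in legal_hosts:
        joint[(y, z)] = (1 / 3) * (1 / len(legal_hosts))   # P(Y=y) * P(Z=z | X=A, Y=y)

p_z_c = sum(p for (y, z), p in joint.items() if z == "C")  # P(Z=C | X=A) = 1/2
print(joint[("B", "C")] / p_z_c)  # P(Y=B | X=A, Z=C) = 2/3
print(joint[("A", "C")] / p_z_c)  # P(Y=A | X=A, Z=C) = 1/3
```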
Supplement 2: The Causal Point of View
For this section a basic understanding of Directed Acyclic Graphs (DAGs) and Structural Causal Models (SCMs) is useful, but not required. In brief:
DAGs qualitatively visualise the causal relationships between the parameter nodes.
SCMs quantitatively express the formula relationships between the parameters.
Given the DAG
we are going to define the SCM that corresponds to the classic N=3 Monty Hall problem and use it to describe the joint distribution of all variables. We will later generically expand to N.
We define
X — the chosen door
Y — the door with the prize
Z — the door opened by the host
According to the DAG, the chain rule gives:
P(X, Y, Z) = P(X) P(Y) P(Z | X, Y)
The SCM is defined by exogenous variables U, endogenous variables V, and the functions between them F:
U = {X, Y}, V = {Z}, F = {f_Z}
where X, Y and Z have door values:
D = {A, B, C}
The host’s choice f_Z is defined as a uniform pick among the doors that are neither the trader’s door nor the prize door, i.e., P(Z=z | X=x, Y=y) = 1/|D∖{x, y}| if z ∉ {x, y}, and 0 otherwise.
In order to generalise to N doors, the DAG remains the same, but the SCM requires updating D to be a set of N doors: {D₁, D₂, …, Dₙ}.
Exploring Example Scenarios
To gain an intuition for this SCM, let’s examine 6 of the 27 possible scenarios, taking X=A.
When X=Y (the trader picked the prize door, Y=A):
P(Z=A | X=A, Y=A) = 0; the host cannot choose the participant’s door
P(Z=B | X=A, Y=A) = 1/2; the prize is behind A → the host chooses B at 50%
P(Z=C | X=A, Y=A) = 1/2; the prize is behind A → the host chooses C at 50%
When X≠Y (say Y=B):
P(Z=A | X=A, Y=B) = 0; the host cannot choose the participant’s door
P(Z=B | X=A, Y=B) = 0; the host cannot choose the prize door
P(Z=C | X=A, Y=B) = 1; the host has no choice in the matter
Calculating Joint Probabilities
Using this logic, let’s code up all 27 possibilities in Python:
import pandas as pd

# All 27 combinations of (X, Y, Z)
df = pd.DataFrame({
    "X": ["A"] * 9 + ["B"] * 9 + ["C"] * 9,        # trader's choice
    "Y": (["A"] * 3 + ["B"] * 3 + ["C"] * 3) * 3,  # prize door
    "Z": ["A", "B", "C"] * 9,                      # host's choice
})
df["p_z_given_xy"] = None

p_x = 1. / 3  # uniform choice by the trader
p_y = 1. / 3  # uniform placement of the prize

df.loc[df["Z"] == df["X"], "p_z_given_xy"] = 0                                              # host never opens the trader's door
df.loc[(df["X"] == df["Y"]) & (df["Z"] != df["X"]), "p_z_given_xy"] = 0.5                   # prize behind the trader's door: two goat doors, 50% each
df.loc[(df["X"] != df["Y"]) & (df["Z"] == df["Y"]), "p_z_given_xy"] = 0                     # host never opens the prize door
df.loc[(df["X"] != df["Y"]) & (df["Z"] != df["X"]) & (df["Z"] != df["Y"]), "p_z_given_xy"] = 1  # only one goat door left: forced choice

df["p"] = df["p_z_given_xy"].astype(float) * p_x * p_y
print(f"Sanity check, total probability: {df['p'].sum()}")  # should be 1.0
df
yields
Resources
This Quora discussion by Joshua Engel helped me shape a few aspects of this article.
Causal Inference in Statistics: A Primer / Pearl, Glymour & Jewell — an excellent short textbook.
I also very much enjoy Tim Harford’s podcast Cautionary Tales. He wrote about this topic on November 3rd 2017 for the Financial Times: Monty Hall and the game show stick-or-switch conundrum
Footnotes
¹ Vazsonyi, Andrew. “Which Door Has the Cadillac?”. Decision Line: 17–19. Archived from the original on 13 April 2014. Retrieved 16 October 2012.
² Steve Selvin to the American Statistician in 1975.
³ Game Show Problem by Marilyn vos Savant’s “Ask Marilyn” on marilynvossavant.com: “This material in this article was originally published in PARADE magazine in 1990 and 1991”
⁴Tierney, John. “Behind Monty Hall’s Doors: Puzzle, Debate and Answer?”. The New York Times. Retrieved 18 January 2008.
⁵ Kahneman, D.. Thinking, fast and slow. Farrar, Straus and Giroux.
⁶ MythBusters Episode 177, “Pick a Door” — watch MythBusters’ approach.
⁷ Monty Hall Problem on Survivor Season 41 — watch Survivor’s take on the problem.
⁸ Jingyi Jessica Li: How the Monty Hall problem is similar to the false discovery rate in high-throughput data analysis. Whereas the author points out “similarities” between hypothesis testing and the Monty Hall problem, I think that this is a bit misleading. The author is correct that both problems change depending on the order in which processes are done, but that is part of Bayesian statistics in general, not limited to the Monty Hall problem.
The post
Lessons in Decision Making from the Monty Hall Problem appeared first on Towards Data Science.
#lessons #decision #making #monty #hall
🚪🚪🐐 Lessons in Decision Making from the Monty Hall Problem
The Monty Hall Problem is a well-known brain teaser from which we can learn important lessons in Decision Making that are useful in general and in particular for data scientists.
If you are not familiar with this problem, prepare to be perplexed . If you are, I hope to shine light on aspects that you might not have considered .
I introduce the problem and solve with three types of intuitions:
Common — The heart of this post focuses on applying our common sense to solve this problem. We’ll explore why it fails us and what we can do to intuitively overcome this to make the solution crystal clear . We’ll do this by using visuals , qualitative arguments and some basic probabilities.
Bayesian — We will briefly discuss the importance of belief propagation.
Causal — We will use a Graph Model to visualise conditions required to use the Monty Hall problem in real world settings.Spoiler alert I haven’t been convinced that there are any, but the thought process is very useful.
I summarise by discussing lessons learnt for better data decision making.
In regards to the Bayesian and Causal intuitions, these will be presented in a gentle form. For the mathematically inclined I also provide supplementary sections with short Deep Dives into each approach after the summary.By examining different aspects of this puzzle in probability you will hopefully be able to improve your data decision making .
Credit: Wikipedia
First, some history. Let’s Make a Deal is a USA television game show that originated in 1963. As its premise, audience participants were considered traders making deals with the host, Monty Hall .
At the heart of the matter is an apparently simple scenario:
A trader is posed with the question of choosing one of three doors for the opportunity to win a luxurious prize, e.g, a car . Behind the other two were goats .
The trader is shown three closed doors.
The trader chooses one of the doors. Let’s call thisdoor A and mark it with a .
Keeping the chosen door closed, the host reveals one of the remaining doors showing a goat.
The trader chooses door and the the host reveals door C showing a goat.
The host then asks the trader if they would like to stick with their first choice or switch to the other remaining one.
If the trader guesses correct they win the prize . If not they’ll be shown another goat.
What is the probability of being Zonked? Credit: Wikipedia
Should the trader stick with their original choice of door A or switch to B?
Before reading further, give it a go. What would you do?
Most people are likely to have a gut intuition that “it doesn’t matter” arguing that in the first instance each door had a ⅓ chance of hiding the prize, and that after the host intervention , when only two doors remain closed, the winning of the prize is 50:50.
There are various ways of explaining why the coin toss intuition is incorrect. Most of these involve maths equations, or simulations. Whereas we will address these later, we’ll attempt to solve by applying Occam’s razor:
A principle that states that simpler explanations are preferable to more complex ones — William of OckhamTo do this it is instructive to slightly redefine the problem to a large N doors instead of the original three.
The Large N-Door Problem
Similar to before: you have to choose one of many doors. For illustration let’s say N=100. Behind one of the doors there is the prize and behind 99of the rest are goats .
The 100 Door Monty Hall problem before the host intervention.
You choose one door and the host reveals 98of the other doors that have goats leaving yours and one more closed .
The 100 Door Monty Hall Problem after the host intervention. Should you stick with your door or make the switch?
Should you stick with your original choice or make the switch?
I think you’ll agree with me that the remaining door, not chosen by you, is much more likely to conceal the prize … so you should definitely make the switch!
It’s illustrative to compare both scenarios discussed so far. In the next figure we compare the post host intervention for the N=3 setupand that of N=100:
Post intervention settings for the N=3 setupand N=100.
In both cases we see two shut doors, one of which we’ve chosen. The main difference between these scenarios is that in the first we see one goat and in the second there are more than the eye would care to see.
Why do most people consider the first case as a “50:50” toss up and in the second it’s obvious to make the switch?
We’ll soon address this question of why. First let’s put probabilities of success behind the different scenarios.
What’s The Frequency, Kenneth?
So far we learnt from the N=100 scenario that switching doors is obviously beneficial. Inferring for the N=3 may be a leap of faith for most. Using some basic probability arguments here we’ll quantify why it is favourable to make the switch for any number door scenario N.
We start with the standard Monty Hall problem. When it starts the probability of the prize being behind each of the doors A, B and C is p=⅓. To be explicit let’s define the Y parameter to be the door with the prize , i.e, p= p=p=⅓.
The trick to solving this problem is that once the trader’s door A has been chosen , we should pay close attention to the set of the other doors {B,C}, which has the probability of p=p+p=⅔. This visual may help make sense of this:
By being attentive to the {B,C} the rest should follow. When the goat is revealed
it is apparent that the probabilities post intervention change. Note that for ease of reading I’ll drop the Y notation, where pwill read pand pwill read p. Also for completeness the full terms after the intervention should be even longer due to it being conditional, e.g, p, p, where Z is a parameter representing the choice of the host .premains ⅓
p=p+premains ⅔,
p=0; we just learnt that the goat is behind door C, not the prize.
p= p-p= ⅔
For anyone with the information provided by the hostthis means that it isn’t a toss of a fair coin! For them the fact that pbecame zero does not “raise all other boats”, but rather premains the same and pgets doubled.
The bottom line is that the trader should consider p= ⅓ and p=⅔, hence by switching they are doubling the odds at winning!
Let’s generalise to N.
When we start all doors have odds of winning the prize p=1/N. After the trader chooses one door which we’ll call D₁, meaning p=1/N, we should now pay attention to the remaining set of doors {D₂, …, Dₙ} will have a chance of p=/N.
When the host revealsdoors {D₃, …, Dₙ} with goats:
premains 1/N
p=p+p+… + premains/N
p=p= …=p=p= 0; we just learnt that they have goats, not the prize.
p=p— p— … — p=/N
The trader should now consider two door values p=1/N and p=/N.
Hence the odds of winning improved by a factor of N-1! In the case of N=100, this means by an odds ratio of 99!.
The improvement of odds ratios in all scenarios between N=3 to 100 may be seen in the following graph. The thin line is the probability of winning by choosing any door prior to the intervention p=1/N. Note that it also represents the chance of winning after the intervention, if they decide to stick to their guns and not switch p.The thick line is the probability of winning the prize after the intervention if the door is switched p=/N:
Probability of winning as a function of N. p=p=1/N is the thin line; p=N/is the thick one.Perhaps the most interesting aspect of this graphis that the N=3 case has the highest probability before the host intervention , but the lowest probability after and vice versa for N=100.
Another interesting feature is the quick climb in the probability of winning for the switchers:
N=3: p=67%
N=4: p=75%
N=5=80%
The switchers curve gradually reaches an asymptote approaching at 100% whereas at N=99 it is 98.99% and at N=100 is equal to 99%.
This starts to address an interesting question:
Why Is Switching Obvious For Large N But Not N=3?
The answer is the fact that this puzzle is slightly ambiguous. Only the highly attentive realise that by revealing the goatthe host is actually conveying a lot of information that should be incorporated into one’s calculation. Later we discuss the difference of doing this calculation in one’s mind based on intuition and slowing down by putting pen to paper or coding up the problem.
How much information is conveyed by the host by intervening?
A hand wavy explanation is that this information may be visualised as the gap between the lines in the graph above. For N=3 we saw that the odds of winning doubled, but that doesn’t register as strongly to our common sense intuition as the 99 factor as in the N=100.
I have also considered describing stronger arguments from Information Theory that provide useful vocabulary to express communication of information. However, I feel that this fascinating field deserves a post of its own, which I’ve published.
The main takeaway for the Monty Hall problem is that I have calculated the information gain to be a logarithmic function of the number of doors c using this formula:
Information Gain due to the intervention of the host for a setup with c doors. Full details in my upcoming article.
For c=3 door case, e.g, the information gain is ⅔ bits. Full details are in this article on entropy.
To summarise this section, we use basic probability arguments to quantify the probabilities of winning the prize showing the benefit of switching for all N door scenarios. For those interested in more formal solutions using Bayesian and Causality on the bottom I provide supplement sections.
In the next three final sections we’ll discuss how this problem was accepted in the general public back in the 1990s, discuss lessons learnt and then summarise how we can apply them in real-world settings.
Being Confused Is OK
“No, that is impossible, it should make no difference.” — Paul Erdős
If you still don’t feel comfortable with the solution of the N=3 Monty Hall problem, don’t worry you are in good company! According to Vazsonyi¹ even Paul Erdős who is considered “of the greatest experts in probability theory” was confounded until computer simulations were demonstrated to him.
When the original solution by Steve Selvin² was popularised by Marilyn vos Savant in her column “Ask Marilyn” in Parade magazine in 1990 many readers wrote that Selvin and Savant were wrong³. According to Tierney’s 1991 article in the New York Times, this included about 10,000 readers, including nearly 1,000 with Ph.D degrees⁴.
On a personal note, over a decade ago I was exposed to the standard N=3 problem and since then managed to forget the solution numerous times. When I learnt about the large N approach I was quite excited about how intuitive it was. I then failed to explain it to my technical manager over lunch, so this is an attempt to compensate. I still have the same day job .
While researching this piece I realised that there is a lot to learn in terms of decision making in general and in particular useful for data science.
Lessons Learnt From Monty Hall Problem
In his book Thinking Fast and Slow, the late Daniel Kahneman, the co-creator of Behaviour Economics, suggested that we have two types of thought processes:
System 1 — fast thinking : based on intuition. This helps us react fast with confidence to familiar situations.
System 2 – slow thinking : based on deep thought. This helps figure out new complex situations that life throws at us.
Assuming this premise, you might have noticed that in the above you were applying both.
By examining the visual of N=100 doors your System 1 kicked in and you immediately knew the answer. I’m guessing that in the N=3 you were straddling between System 1 and 2. Considering that you had to stop and think a bit when going throughout the probabilities exercise it was definitely System 2 .
The decision maker’s struggle between System 1 and System 2 . Generated using Gemini Imagen 3
Beyond the fast and slow thinking I feel that there are a lot of data decision making lessons that may be learnt.Assessing probabilities can be counter-intuitive …
or
Be comfortable with shifting to deep thought
We’ve clearly shown that in the N=3 case. As previously mentioned it confounded many people including prominent statisticians.
Another classic example is The Birthday Paradox , which shows how we underestimate the likelihood of coincidences. In this problem most people would think that one needs a large group of people until they find a pair sharing the same birthday. It turns out that all you need is 23 to have a 50% chance. And 70 for a 99.9% chance.
One of the most confusing paradoxes in the realm of data analysis is Simpson’s, which I detailed in a previous article. This is a situation where trends of a population may be reversed in its subpopulations.
The common with all these paradoxes is them requiring us to get comfortable to shifting gears from System 1 fast thinking to System 2 slow . This is also the common theme for the lessons outlined below.
A few more classical examples are: The Gambler’s Fallacy , Base Rate Fallacy and the The LindaProblem . These are beyond the scope of this article, but I highly recommend looking them up to further sharpen ways of thinking about data.… especially when dealing with ambiguity
or
Search for clarity in ambiguity
Let’s reread the problem, this time as stated in “Ask Marilyn”
Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say №1, and the host, who knows what’s behind the doors, opens another door, say №3, which has a goat. He then says to you, “Do you want to pick door №2?” Is it to your advantage to switch your choice?
We discussed that the most important piece of information is not made explicit. It says that the host “knows what’s behind the doors”, but not that they open a door at random, although it’s implicitly understood that the host will never open the door with the car.
Many real life problems in data science involve dealing with ambiguous demands as well as in data provided by stakeholders.
It is crucial for the researcher to track down any relevant piece of information that is likely to have an impact and update that into the solution. Statisticians refer to this as “belief update”.With new information we should update our beliefs
This is the main aspect separating the Bayesian stream of thought to the Frequentist. The Frequentist approach takes data at face value. The Bayesian approach incorporates prior beliefs and updates it when new findings are introduced. This is especially useful when dealing with ambiguous situations.
To drive this point home, let’s re-examine this figure comparing between the post intervention N=3 setupsand the N=100 one.
Copied from above. Post intervention settings for the N=3 setupand N=100.
In both cases we had a prior belief that all doors had an equal chance of winning the prize p=1/N.
Once the host opened one doora lot of valuable information was revealed whereas in the case of N=100 it was much more apparent than N=3.
In the Frequentist approach, however, most of this information would be ignored, as it only focuses on the two closed doors. The Frequentist conclusion, hence is a 50% chance to win the prize regardless of what else is known about the situation. Hence the Frequentist takes Paul Erdős’ “no difference” point of view, which we now know to be incorrect.
This would be reasonable if all that was presented were the two doors and not the intervention and the goats. However, if that information is presented, one should shift gears into System 2 thinking and update their beliefs in the system. This is what we have done by focusing not only on the shut door, but rather consider what was learnt about the system at large.
For the brave hearted , in a supplementary section below called The Bayesian Point of View I solve for the Monty Hall problem using the Bayesian formalism.Be one with subjectivity
The Frequentist main reservation about “going Bayes” is that — “Statistics should be objective”.
The Bayesian response is — the Frequentist’s also apply a prior without realising it — a flat one.
Regardless of the Bayesian/Frequentist debate, as researchers we try our best to be as objective as possible in every step of the analysis.
That said, it is inevitable that subjective decisions are made throughout.
E.g, in a skewed distribution should one quote the mean or median? It highly depends on the context and hence a subjective decision needs to be made.
The responsibility of the analyst is to provide justification for their choices first to convince themselves and then their stakeholders.When confused — look for a useful analogy
… but tread with caution
We saw that by going from the N=3 setup to the N=100 the solution was apparent. This is a trick scientists frequently use — if the problem appears at first a bit too confusing/overwhelming, break it down and try to find a useful analogy.
It is probably not a perfect comparison, but going from the N=3 setup to N=100 is like examining a picture from up close and zooming out to see the big picture. Think of having only a puzzle piece and then glancing at the jigsaw photo on the box.
Monty Hall in 1976. Credit: Wikipedia and using Visual Paradigm Online for the puzzle effect
Note: whereas analogies may be powerful, one should do so with caution, not to oversimplify. Physicists refer to this situation as the spherical cow method, where models may oversimplify complex phenomena.
I admit that even with years of experience in applied statistics at times I still get confused at which method to apply. A large part of my thought process is identifying analogies to known solved problems. Sometimes after making progress in a direction I will realise that my assumptions were wrong and seek a new direction. I used to quip with colleagues that they shouldn’t trust me before my third attempt …Simulations are powerful but not always necessary
It’s interesting to learn that Paul Erdős and other mathematicians were convinced only after seeing simulations of the problem.
I am two-minded about usage of simulations when it comes to problem solving.
On the one hand simulations are powerful tools to analyse complex and intractable problems. Especially in real life data in which one wants a grasp not only of the underlying formulation, but also stochasticity.
And here is the big BUT — if a problem can be analytically solved like the Monty Hall one, simulations as fun as they may be, may not be necessary.
According to Occam’s razor, all that is required is a brief intuition to explain the phenomena. This is what I attempted to do here by applying common sense and some basic probability reasoning. For those who enjoy deep dives I provide below supplementary sections with two methods for analytical solutions — one using Bayesian statistics and another using Causality.After publishing the first version of this article there was a comment that Savant’s solution³ may be simpler than those presented here. I revisited her communications and agreed that it should be added. In the process I realised three more lessons may be learnt.A well designed visual goes a long way
Continuing the principle of Occam’s razor, Savant explained³ quite convincingly in my opinion:
You should switch. The first door has a 1/3 chance of winning, but the second door has a 2/3 chance. Here’s a good way to visualize what happened. Suppose there are a million doors, and you pick door #1. Then the host, who knows what’s behind the doors and will always avoid the one with the prize, opens them all except door #777,777. You’d switch to that door pretty fast, wouldn’t you?
Hence she provided an abstract visual for the readers. I attempted to do the same with the 100 doors figures.
Marilyn vos Savant who popularised the Monty Hall Problem. Credit: Ben David on Flickr under license
As mentioned many readers, and especially with backgrounds in maths and statistics, still weren’t convinced.
She revised³ with another mental image:
The benefits of switching are readily proven by playing through the six games that exhaust all the possibilities. For the first three games, you choose #1 and “switch” each time, for the second three games, you choose #1 and “stay” each time, and the host always opens a loser. Here are the results.
She added a table with all the scenarios. I took some artistic liberty and created the following figure. As indicated, the top batch are the scenarios in which the trader switches and the bottom when they switch. Lines in green are games which the trader wins, and in red when they get zonked. The symbolised the door chosen by the trader and Monte Hall then chooses a different door that has a goat behind it.
Adaptation of Savant’s table³ of six scenarios that shows the solution to the Monty Hall Problem
We clearly see from this diagram that the switcher has a ⅔ chance of winning and those that stay only ⅓.
This is yet another elegant visualisation that clearly explains the non intuitive.
It strengthens the claim that there is no real need for simulations in this case because all they would be doing is rerunning these six scenarios.
One more popular solution is decision tree illustrations. You can find these in the Wikipedia page, but I find it’s a bit redundant to Savant’s table.
The fact that we can solve this problem in so many ways yields another lesson:There are many ways to skin a … problem
Of the many lessons that I have learnt from the writings of late Richard Feynman, one of the best physics and ideas communicators, is that a problem can be solved many ways. Mathematicians and Physicists do this all the time.
A relevant quote that paraphrases Occam’s razor:
If you can’t explain it simply, you don’t understand it well enough — attributed to Albert Einstein
And finallyEmbrace ignorance and be humble
“You are utterly incorrect … How many irate mathematicians are needed to get you to change your mind?” — Ph.D from Georgetown University
“May I suggest that you obtain and refer to a standard textbook on probability before you try to answer a question of this type again?” — Ph.D from University of Florida
“You’re in error, but Albert Einstein earned a dearer place in the hearts of people after he admitted his errors.” — Ph.D. from University of Michigan
Ouch!
These are some of the said responses from mathematicians to the Parade article.
Such unnecessary viciousness.
You can check the reference³ to see the writer’s names and other like it. To whet your appetite: “You blew it, and you blew it big!”, , “You made a mistake, but look at the positive side. If all those Ph.D.’s were wrong, the country would be in some very serious trouble.”, “I am in shock that after being corrected by at least three mathematicians, you still do not see your mistake.”.
And as expected from the 1990s perhaps the most embarrassing one was from a resident of Oregon:
“Maybe women look at math problems differently than men.”
These make me cringe and be embarrassed to be associated by gender and Ph.D. title with these graduates and professors.
Hopefully in the 2020s most people are more humble about their ignorance. Yuval Noah Harari discusses the fact that the Scientific Revolution of Galileo Galilei et al. was not due to knowledge but rather admittance of ignorance.
“The great discovery that launched the Scientific Revolution was the discovery that humans do not know the answers to their most important questions” — Yuval Noah Harari
Fortunately for mathematicians’ image, there were also quiet a lot of more enlightened comments. I like this one from one Seth Kalson, Ph.D. of MIT:
You are indeed correct. My colleagues at work had a ball with this problem, and I dare say that most of them, including me at first, thought you were wrong!
We’ll summarise by examining how, and if, the Monty Hall problem may be applied in real-world settings, so you can try to relate to projects that you are working on.
Application in Real World Settings
Researching for this article I found that beyond artificial setups for entertainment⁶ ⁷ there aren’t practical settings for this problem to use as an analogy. Of course, I may be wrong⁸ and would be glad to hear if you know of one.
One way of assessing the viability of an analogy is using arguments from causality which provides vocabulary that cannot be expressed with standard statistics.
In a previous post I discussed the fact that the story behind the data is as important as the data itself. In particular Causal Graph Models visualise the story behind the data, which we will use as a framework for a reasonable analogy.
For the Monty Hall problem we can build a Causal Graph Model like this:
Reading:
The door chosen by the trader is independent from that with the prize and vice versa. As important, there is no common cause between them that might generate a spurious correlation.
The host’s choice depends on both the trader’s choice and the prize door (a sketch of this graph in code follows below).
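To make the structure concrete, here is a minimal sketch of this graph in code. This is an illustration only, not part of the original analysis, and it assumes the networkx library is available:

import networkx as nx

# The Monty Hall causal DAG: the host's choice is caused by both
# the trader's choice and the prize door. The two causes share no edge
# and no common parent, so they are independent in this model.
dag = nx.DiGraph()
dag.add_edges_from([
    ("trader's choice", "host's choice"),
    ("prize door", "host's choice"),
])
print(sorted(dag.edges))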
By comparing the causal graphs of two systems one can get a sense of how analogous they are. A perfect analogy would require more details, but this is beyond the scope of this article. Briefly, one would want to ensure similar functional relationships between the parameters.
Those interested in learning further details about using Causal Graphs Models to assess causality in real world problems may be interested in this article.
Anecdotally, it is also worth mentioning that on Let’s Make a Deal, Monty himself admitted years later to playing mind games with the contestants and did not always follow the rules, e.g., not always performing the intervention, as “it all depends on his mood”⁴.
In our setup we assumed perfect conditions, i.e., a host who does not deviate from the script and/or play on the trader’s emotions. Taking this into consideration would require updating the graphical model above, which is beyond the scope of this article.
Some might be disheartened to realise at this stage of the post that there might not be real world applications for this problem.
I argue that lessons learnt from the Monty Hall problem definitely are.
Just to summarise them again:
Assessing probabilities can be counter intuitive …
… especially when dealing with ambiguity
With new information we should update our beliefs
Be one with subjectivity
When confused — look for a useful analogy … but tread with caution
Simulations are powerful but not always necessary
A well designed visual goes a long way
There are many ways to skin a … problem
Embrace ignorance and be humble
While the Monty Hall Problem might seem like a simple puzzle, it offers valuable insights into decision-making, particularly for data scientists. The problem highlights the importance of going beyond intuition and embracing a more analytical, data-driven approach. By understanding the principles of Bayesian thinking and updating our beliefs based on new information, we can make more informed decisions in many aspects of our lives, including data science. The Monty Hall Problem serves as a reminder that even seemingly straightforward scenarios can contain hidden complexities and that by carefully examining available information, we can uncover hidden truths and make better decisions.
At the bottom of the article I provide a list of resources that I found useful to learn about this topic.
Credit: Wikipedia
Loved this post? Join me on LinkedIn or Buy me a coffee!
Credits
Unless otherwise noted, all images were created by the author.
Many thanks to Jim Parr, Will Reynolds, and Betty Kazin for their useful comments.
In the following supplementary sections I derive solutions to the Monty Hall’s problem from two perspectives:
Bayesian
Causal
Both are motivated by questions in the textbook Causal Inference in Statistics: A Primer by Judea Pearl, Madelyn Glymour, and Nicholas P. Jewell.
Supplement 1: The Bayesian Point of View
This section assumes a basic understanding of Bayes’ Theorem, and in particular being comfortable with conditional probabilities. In other words, it helps if this makes sense:
P(A|B) = P(B|A) · P(A) / P(B)
We set out to use Bayes’ theorem to prove that switching doors improves the chances of winning in the N=3 Monty Hall Problem. We define:
X — the chosen door
Y — the door with the prize
Z — the door opened by the host
Labelling the doors A, B and C and, without loss of generality, taking the case where the trader chose door A and the host opened door C, we need to compare:
P(Y=B | X=A, Z=C) vs. P(Y=A | X=A, Z=C)
Using Bayes’ theorem we equate the left side as
P(Y=B | X=A, Z=C) = P(Z=C | X=A, Y=B) · P(Y=B | X=A) / P(Z=C | X=A)
and the right one as:
P(Y=A | X=A, Z=C) = P(Z=C | X=A, Y=A) · P(Y=A | X=A) / P(Z=C | X=A)
Most components are equal (P(Y=A | X=A) = P(Y=B | X=A) = ⅓, and the denominators are identical), so we are left to compare:
P(Z=C | X=A, Y=B) vs. P(Z=C | X=A, Y=A)
In the case where Y=B, the host has only one valid door to open, making P(Z=C | X=A, Y=B) = 1.
In the case where Y=A, the host has two valid doors to choose from, making P(Z=C | X=A, Y=A) = 1/2.
From here:
P(Y=B | X=A, Z=C) = 2 · P(Y=A | X=A, Z=C), i.e., the probability of winning is ⅔ when switching and ⅓ when staying.
Quod erat demonstrandum.
Note: if the “host’s choices” argument didn’t make sense, look at the table below showing this explicitly. You will want to compare the entries {X=A, Y=B, Z=C} and {X=A, Y=A, Z=C}.
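For readers who like to verify with code, here is a minimal numeric check of the derivation above. It is an illustrative sketch, not the article’s original code:

# Trader picks door A (X=A) and the host opens door C (Z=C).
p_y = {"A": 1/3, "B": 1/3, "C": 1/3}        # prior over the prize door Y
p_zc_given_y = {"A": 1/2,                   # prize behind A: host opens B or C at random
                "B": 1.0,                   # prize behind B: host is forced to open C
                "C": 0.0}                   # host never opens the prize door

p_zc = sum(p_zc_given_y[y] * p_y[y] for y in "ABC")              # P(Z=C | X=A) = 1/2
posterior = {y: p_zc_given_y[y] * p_y[y] / p_zc for y in "ABC"}  # Bayes' theorem
print(posterior)  # {'A': 0.333..., 'B': 0.666..., 'C': 0.0} -> switching wins 2/3 of the time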
Supplement 2: The Causal Point of View
For this section a basic understanding of Directed Acyclic Graphs (DAGs) and Structural Causal Models (SCMs) is useful, but not required. In brief:
DAGs qualitatively visualise the causal relationships between the parameter nodes.
SCMs quantitatively express the formula relationships between the parameters.
Given the DAG, in which the host’s choice Z is caused by both the trader’s choice X and the prize door Y, we are going to define the SCM that corresponds to the classic N=3 Monty Hall problem and use it to describe the joint distribution of all variables. We will later generalise to N doors. We define:
X — the chosen door
Y — the door with the prize
Z — the door opened by the host
According to the DAG and the chain rule, the joint distribution factorises as:
P(X, Y, Z) = P(X) · P(Y) · P(Z | X, Y)
The SCM is defined by exogenous variables U, endogenous variables V, and the functions between them F:
U = {X, Y}, V = {Z}, F = {f_Z}
where X, Y and Z have door values:
D = {A, B, C}
The host’s choice is defined by the function f_Z: the host opens a door drawn uniformly at random from the doors that are neither the trader’s choice X nor the prize door Y. Formally:
P(Z=z | X=x, Y=y) = 1 / |D \ {x, y}| if z ∉ {x, y}, and 0 otherwise.
In order to generalise to N doors, the DAG remains the same, but the SCM requires updating D to be a set of N doors: D = {D₁, D₂, … Dₙ}.
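To make the host function concrete, here is a minimal sketch of f_Z for an arbitrary set of doors. It is an illustration under the assumption of a uniformly random host, not the author’s original code:

import random

def f_z(x, y, doors):
    """Host's choice: open a door chosen uniformly at random among the
    doors that are neither the trader's choice (x) nor the prize door (y)."""
    allowed = [d for d in doors if d not in (x, y)]
    return random.choice(allowed)

# N=3: the trader picks "A" and the prize is behind "B" -> the host must open "C"
print(f_z("A", "B", ["A", "B", "C"]))
# N=100: the host opens one of the 98 remaining goat doors
print(f_z(1, 42, list(range(1, 101))))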
Exploring Example Scenarios
To gain an intuition for this SCM, let’s examine 6 of the 27 possible scenarios:
When X=Y (e.g., X=A, Y=A):
P(Z=A | X=A, Y=A) = 0; the host cannot choose the participant’s door
P(Z=B | X=A, Y=A) = 1/2; the prize is behind the chosen door → the host chooses B at 50%
P(Z=C | X=A, Y=A) = 1/2; the prize is behind the chosen door → the host chooses C at 50%
When X≠Y (e.g., X=A, Y=B):
P(Z=A | X=A, Y=B) = 0; the host cannot choose the participant’s door
P(Z=B | X=A, Y=B) = 0; the host cannot choose the prize door
P(Z=C | X=A, Y=B) = 1; the host has no choice in the matter
Calculating Joint Probabilities
Using this logic, let’s code up all 27 possibilities in Python:
import pandas as pd

# Enumerate all 27 combinations of X (trader's door), Y (prize door), Z (host's door).
# Column names are reconstructed and descriptive; the logic follows the scenarios above.
df = pd.DataFrame({"X": ["A"] * 9 + ["B"] * 9 + ["C"] * 9,
                   "Y": (["A"] * 3 + ["B"] * 3 + ["C"] * 3) * 3,
                   "Z": ["A", "B", "C"] * 9})

df["p_z_given_xy"] = None   # P(Z | X, Y), filled in below
p_x = 1./3                  # P(X): the trader chooses uniformly at random
p_y = 1./3                  # P(Y): the prize is placed uniformly at random

df.loc[df["Z"] == df["X"], "p_z_given_xy"] = 0                                        # host never opens the trader's door
df.loc[(df["X"] == df["Y"]) & (df["Z"] != df["X"]), "p_z_given_xy"] = 0.5             # trader picked the prize door: two goat doors to choose from
df.loc[(df["X"] != df["Y"]) & (df["Z"] == df["X"]), "p_z_given_xy"] = 0               # host never opens the trader's door
df.loc[(df["X"] != df["Y"]) & (df["Z"] == df["Y"]), "p_z_given_xy"] = 0               # host never opens the prize door
df.loc[(df["X"] != df["Y"]) & (df["Z"] != df["X"]) & (df["Z"] != df["Y"]), "p_z_given_xy"] = 1  # only one goat door remains

df["p"] = df["p_z_given_xy"] * p_x * p_y   # joint probability P(X, Y, Z)
print(f"Total probability: {df['p'].sum()}")
df
yields the full table of 27 joint probabilities, which sum to 1.
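As an optional follow-up, reusing the joint-probability column "p" defined in the sketch above, one can condition the joint table on the classic scenario X=A, Z=C and recover the ⅓ vs ⅔ posterior from the Bayesian supplement:

# Condition on the trader choosing A and the host opening C,
# then normalise to obtain P(Y | X="A", Z="C").
cond = df[(df["X"] == "A") & (df["Z"] == "C")]
print(cond.set_index("Y")["p"] / cond["p"].sum())   # A: 1/3, B: 2/3, C: 0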
Resources
This Quora discussion by Joshua Engel helped me shape a few aspects of this article.
Causal Inference in Statistics: A Primer / Pearl, Glymour & Jewell — an excellent short textbook
I also very much enjoy Tim Harford’s podcast Cautionary Tales. He wrote about this topic on November 3rd 2017 for the Financial Times: Monty Hall and the game show stick-or-switch conundrum
Footnotes
¹ Vazsonyi, Andrew. “Which Door Has the Cadillac?”. Decision Line: 17–19. Archived from the original on 13 April 2014. Retrieved 16 October 2012.
² Steve Selvin to the American Statistician in 1975.
³ Game Show Problem by Marilyn vos Savant’s “Ask Marilyn” in marilynvossavant.com: “This material in this article was originally published in PARADE magazine in 1990 and 1991”
⁴ Tierney, John. “Behind Monty Hall’s Doors: Puzzle, Debate and Answer?”. The New York Times. Retrieved 18 January 2008.
⁵ Kahneman, D. Thinking, Fast and Slow. Farrar, Straus and Giroux.
⁶ MythBusters Episode 177 “Pick a Door”: watch the MythBusters’ approach
⁷ Monty Hall Problem on Survivor Season 41: watch Survivor’s take on the problem
⁸ Jingyi Jessica Li: How the Monty Hall problem is similar to the false discovery rate in high-throughput data analysis. While the author points out “similarities” between hypothesis testing and the Monty Hall problem, I think that this is a bit misleading. The author is correct that both problems change with the order in which processes are done, but that is part of Bayesian statistics in general, not limited to the Monty Hall problem.