• Liquid Glass’ aesthetic entropy: your interface, your problem

    Apple’s Opinionated Design principles are a thing of the past. Enter the world of reckless customization. Continue reading on UX Collective.
    UXDESIGN.CC
  • Qwen Researchers Propose QwenLong-L1: A Reinforcement Learning Framework for Long-Context Reasoning in Large Language Models

    While large reasoning models (LRMs) have shown impressive capabilities in short-context reasoning through reinforcement learning (RL), these gains do not generalize well to long-context scenarios. Applications such as multi-document QA, research synthesis, and legal or financial analysis require models to process and reason over sequences exceeding 100K tokens. However, RL optimization in such regimes is plagued by slower reward convergence, unstable policy updates due to KL divergence fluctuations, and reduced exploration resulting from entropy collapse. These bottlenecks reveal a fundamental gap in transitioning LRMs from short-context proficiency to long-context generalization.
    QwenLong-L1: A Structured RL Framework for Long-Context Adaptation
    To address these limitations, the Qwen Research team introduces QwenLong-L1, a novel RL framework designed to adapt LRMs to long-context reasoning tasks. The framework is structured into three key stages:

    Warm-up Supervised Fine-Tuning (SFT): Provides a stable initialization for the policy model by training on curated question-context-answer triplets, ensuring basic competence in contextual comprehension and answer extraction.
    Curriculum-Guided Phased Reinforcement Learning: Introduces a staged training process with gradually increasing context lengths. This progression enables the model to incrementally acquire long-context reasoning behaviors without destabilizing policy updates.
    Difficulty-Aware Retrospective Sampling: Enhances exploration by maintaining and reusing hard examples from previous phases, weighted by their difficulty, to encourage deeper reasoning and robustness across diverse inputs.

    These stages are complemented by hybrid reward mechanisms—combining rule-based exact match verification with semantic evaluation by a lightweight LLM—ensuring both precision and recall during policy training.
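    The third stage, difficulty-aware retrospective sampling, can be pictured with a small sketch. It assumes that difficulty is scored as one minus an example's average reward from earlier phases; that scoring rule, the function names, and the example ids are illustrative assumptions, not the framework's actual implementation.

```python
import random


def difficulty(avg_reward):
    """Assumed scoring rule: lower average reward means a harder example."""
    return 1.0 - avg_reward


def retrospective_sample(history, k, rng):
    """Draw k example ids (with replacement) from earlier phases,
    with probability proportional to each example's difficulty score.

    history: list of (example_id, avg_reward) pairs, avg_reward in [0, 1].
    """
    ids = [ex_id for ex_id, _ in history]
    weights = [difficulty(avg) for _, avg in history]
    if not ids or sum(weights) <= 0.0:
        return []  # nothing hard enough to revisit
    return rng.choices(ids, weights=weights, k=k)


# Hypothetical history from an earlier, shorter-context phase.
history = [("doc-qa-1", 1.0), ("doc-qa-2", 0.25), ("doc-qa-3", 0.0)]
picked = retrospective_sample(history, k=5, rng=random.Random(0))
print(picked)
```

    In this toy setup an example that was always answered correctly gets weight zero and is never revisited, while the unsolved one is drawn most often.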

    Technical Design and Methodological Advantages
    QwenLong-L1 integrates recent advances in group-relative RL optimization, specifically GRPO (Group Relative Policy Optimization) and DAPO (Decoupled Clip and Dynamic Sampling Policy Optimization), to mitigate the computational overhead associated with long-context value estimation:

    GRPO estimates advantage by normalizing rewards within sampled groups, eliminating the need for a separate value network and encouraging diverse generation patterns.
    DAPO incorporates mechanisms such as dynamic sampling, overlength penalty shaping, and asymmetric clipping thresholds to prevent entropy collapse and mitigate length biases during training.
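    The two ideas above can be illustrated in a few lines. This is a minimal sketch, not the QwenLong-L1 training code: it assumes rewards are normalized with the population standard deviation of the group, and the clipping thresholds (0.2 below, 0.28 above) are illustrative values chosen for the example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style advantage: normalize each reward against the other
    samples drawn for the same prompt, so no value network is needed."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def clipped_term(ratio, advantage, eps_low=0.2, eps_high=0.28):
    """PPO-style clipped surrogate with asymmetric bounds: the upper bound
    is looser than the lower one, leaving more room to raise the
    probability of low-probability tokens."""
    clipped = min(max(ratio, 1.0 - eps_low), 1.0 + eps_high)
    return min(ratio * advantage, clipped * advantage)


# Four sampled answers to one prompt, two of them rewarded.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
print(clipped_term(1.5, 1.0))   # ratio above the upper bound is clipped
print(clipped_term(0.5, -1.0))  # ratio below the lower bound is clipped
```

    When every sample in a group receives the same reward, all advantages are zero and the group contributes no gradient, which is the situation dynamic sampling is meant to avoid.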

    The reward function is defined as the maximum of two signals: a deterministic rule-based match and a semantic judgment from a compact evaluator model (e.g., Qwen2.5-1.5B). This hybrid approach avoids overfitting to rigid formats while maintaining answer correctness across varied notations and phrasings.
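    A minimal sketch of such a max-of-two-signals reward follows. The normalization rule and the `toy_judge` function are assumptions made for illustration; in the real framework the second signal comes from an LLM evaluator, which is replaced here by a simple substring check so the example runs on its own.

```python
import re


def normalize(text):
    """Lowercase, collapse whitespace, and drop trailing punctuation."""
    text = re.sub(r"\s+", " ", text.strip().lower())
    return text.rstrip(".!?")


def rule_based_reward(prediction, gold):
    """Deterministic signal: 1.0 only on an exact match after normalization."""
    return 1.0 if normalize(prediction) == normalize(gold) else 0.0


def toy_judge(prediction, gold):
    """Hypothetical stand-in for the LLM evaluator: accepts any prediction
    that contains the gold answer."""
    return 1.0 if normalize(gold) in normalize(prediction) else 0.0


def hybrid_reward(prediction, gold, judge):
    """Final reward is the maximum of the rule-based and judged signals."""
    return max(rule_based_reward(prediction, gold), judge(prediction, gold))


print(hybrid_reward(" Paris ", "paris", toy_judge))
print(hybrid_reward("The answer is 42.", "42", toy_judge))
print(hybrid_reward("41", "42", toy_judge))
```

    Taking the maximum means a correct answer in an unexpected format is still rewarded, while an answer that fails both checks gets nothing.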
    Moreover, the framework is optimized via progressive context scaling, where the RL process transitions from 20K-token to 60K-token input lengths in controlled phases, stabilizing training dynamics and facilitating policy generalization.
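    The phased schedule can be sketched as a filter over training examples by input length. The 20K and 60K limits come from the description above; the example records, the `n_tokens` field, and the choice to keep shorter examples in later phases are assumptions for illustration.

```python
def curriculum_phases(examples, phase_limits):
    """Return one (limit, subset) pair per phase, where each subset holds
    the examples whose input length fits within that phase's limit."""
    phases = []
    for limit in sorted(phase_limits):
        subset = [ex for ex in examples if ex["n_tokens"] <= limit]
        phases.append((limit, subset))
    return phases


# Hypothetical training pool with pre-computed input lengths.
pool = [
    {"id": "a", "n_tokens": 8_000},
    {"id": "b", "n_tokens": 35_000},
    {"id": "c", "n_tokens": 58_000},
    {"id": "d", "n_tokens": 90_000},
]

for limit, subset in curriculum_phases(pool, [20_000, 60_000]):
    print(limit, [ex["id"] for ex in subset])
```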
    Experimental Results and Benchmark Performance
    QwenLong-L1 was evaluated on seven long-context document QA benchmarks, including DocMath, Frames, 2WikiMultihopQA, HotpotQA, Musique, NarrativeQA, and Qasper. The 32B variant, QwenLong-L1-32B, demonstrated strong empirical performance:

    It outperformed baseline models such as R1-Distill-Qwen-32B by 5.1 points and exceeded leading reasoning models such as OpenAI-o3-mini and Qwen3-235B-A22B.
    Its performance was comparable to Claude-3.7-Sonnet-Thinking, indicating competitive reasoning capabilities under extreme context lengths.
    Pass@K analysis revealed consistent improvements with increased sampling, achieving a Pass@2 average of 73.7, surpassing DeepSeek-R1 and OpenAI-o1-preview, even at low sampling rates.
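    For readers unfamiliar with the metric, Pass@K is commonly computed with the unbiased estimator below: given n sampled answers of which c are correct, it is the probability that at least one of k randomly chosen samples is correct. Whether the paper uses exactly this estimator is an assumption; the numbers in the example are made up.

```python
from math import comb


def pass_at_k(n, c, k):
    """Probability that at least one of k samples drawn without
    replacement from n attempts (c of them correct) is correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        return 1.0  # too few wrong answers to fill all k slots
    return 1.0 - comb(n - c, k) / comb(n, k)


print(pass_at_k(4, 1, 1))  # one correct answer out of four, single draw
print(pass_at_k(4, 2, 2))  # two correct out of four, two draws
```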

    Ablation studies further validated the individual contributions of SFT, phased RL, and retrospective sampling. Notably, RL played a decisive role in enabling emergent reasoning behaviors such as grounding, subgoal setting, verification, and backtracking—traits not effectively induced by supervised fine-tuning alone.
    Conclusion
    QwenLong-L1 represents a systematic approach to equipping LRMs with robust long-context reasoning capabilities through reinforcement learning. Its design effectively bridges the gap between short-context expertise and the demands of information-dense environments by combining supervised initialization, curriculum-driven context scaling, and hybrid evaluation strategies. The framework not only achieves state-of-the-art results across long-context benchmarks but also demonstrates the emergence of interpretable reasoning patterns during training.

    Asif Razzaq is the CEO of Marktechpost Media Inc.
    WWW.MARKTECHPOST.COM
  • Entropy: Zero VR is an awesome Half-Life 2 campaign that puts players in the shoes of the villainous Combine

    Have you ever wanted to be a Combine soldier in Half-Life 2? Well, there are probably countless mods for that. However, a new Half-Life 2: Episode 2 mod, Entropy: Zero VR, puts players straight into the shoes of a Combine soldier for a new cinematic campaign.
    Awesome Half-Life 2 mod goes VR
    Released in 2017, the original Entropy: Zero places players in the boots of a stranded Metrocop in the abandoned City 10. The well-received mod is set 11 months before Gordon Freeman fights through City 17 and tasks players with taking out Rebel forces.
    While the mod isn’t as polished as a true Valve experience—excitingly, we may actually be getting a new official Half-Life game soon—the mod is a great time. With a number of great set pieces and some fantastic combat, it’s a brilliant free addition to Half-Life 2: Episode 2.
    Now, eight years after the mod’s release, the team is releasing a full VR version of the mod, suitably titled Entropy: Zero VR. Built on the fan-made Half-Life 2 VR mod, the new virtual reality version adds “major gameplay and quality of life refinements”, new visual improvements, more content and side areas, and even new hands and voice lines.
    There’s no release date yet for the VR version of the Half-Life 2: Episode 2 mod, but it will run on any SteamVR-capable headset.
    “The current plan is that EZ1VR will be available as a free DLC on Steam for Entropy : Zero, and then can be launched as a launch option on Steam,” the developers announced in a ModDB post. “Please keep an eye out for future articles because this may change before release.”
    The new content being created for the VR version is also being added to the flat version of the game.

    WWW.VIDEOGAMER.COM
  • A Step-by-Step Coding Guide to Efficiently Fine-Tune Qwen3-14B Using Unsloth AI on Google Colab with Mixed Datasets and LoRA Optimization

    Fine-tuning LLMs often requires extensive resources, time, and memory, which can hinder rapid experimentation and deployment. Unsloth AI enables fast, efficient fine-tuning of state-of-the-art models like Qwen3-14B with minimal GPU memory by leveraging techniques such as 4-bit quantization and LoRA. In this tutorial, we walk through a practical implementation on Google Colab that fine-tunes Qwen3-14B on a combination of reasoning and instruction-following datasets. By combining Unsloth’s FastLanguageModel utilities with trl.SFTTrainer, users can achieve strong fine-tuning performance on consumer-grade hardware.
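    %%capture
    import os
    if "COLAB_" not in "".join(os.environ.keys()):
        # Outside Colab: let pip resolve Unsloth's dependencies.
        !pip install unsloth
    else:
        # On Colab: install the pinned packages with --no-deps so pip does not pull in other versions.
        !pip install --no-deps bitsandbytes accelerate xformers==0.0.29.post3 peft trl==0.15.2 triton cut_cross_entropy unsloth_zoo
        !pip install sentencepiece protobuf "datasets>=3.4.1" huggingface_hub hf_transfer
        !pip install --no-deps unsloth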
    We install all the essential libraries required for fine-tuning the Qwen3 model using Unsloth AI. It conditionally installs dependencies based on the environment, using a lightweight approach on Colab to ensure compatibility and reduce overhead. Key components like bitsandbytes, trl, xformers, and unsloth_zoo are included to enable 4-bit quantized training and LoRA-based optimization.
    from unsloth import FastLanguageModel
    import torch

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/Qwen3-14B",
        max_seq_length=2048,  # context length in tokens
        load_in_4bit=True, full_finetuning=False,  # 4-bit weights, no full fine-tuning
    )
    We load the Qwen3-14B model using FastLanguageModel from the Unsloth library, which is optimized for efficient fine-tuning. It initializes the model with a context length of 2048 tokens and loads it in 4-bit precision, significantly reducing memory usage. Full fine-tuning is disabled, making it suitable for lightweight parameter-efficient techniques like LoRA.
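    To put the memory saving in numbers, a back-of-the-envelope estimate helps. The sketch below counts weight storage only, assumes a nominal 14 billion parameters, and ignores quantization metadata, activations, and optimizer state, so real usage will be higher.

```python
def weight_memory_gb(n_params, bits_per_param):
    """Approximate weight storage in gigabytes (10**9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 14e9  # nominal parameter count, assumed for illustration

fp16_gb = weight_memory_gb(N_PARAMS, 16)
int4_gb = weight_memory_gb(N_PARAMS, 4)
print(f"16-bit weights: ~{fp16_gb:.0f} GB, 4-bit weights: ~{int4_gb:.0f} GB")
```

    Under these assumptions the 4-bit weights need roughly a quarter of the space of 16-bit weights, which is what brings a model of this size within reach of a single Colab GPU.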
    model = FastLanguageModel.get_peft_model(model, r=32, use_gradient_checkpointing="unsloth")
    We apply LoRA (Low-Rank Adaptation) to the Qwen3 model using FastLanguageModel.get_peft_model. It injects trainable adapters into specific transformer layers with a rank of 32, enabling efficient fine-tuning while keeping most model weights frozen. Using “unsloth” gradient checkpointing further optimizes memory usage, making it suitable for training large models on limited hardware.
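    The effect of the rank on trainable parameters can be worked out directly. For a weight matrix of shape d_out x d_in, a LoRA adapter adds two small matrices with r * (d_in + d_out) parameters in total. The 5120 x 5120 layer shape below is a hypothetical example, not a value taken from the tutorial.

```python
def lora_params(d_in, d_out, r):
    """Trainable parameters added by one LoRA adapter: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


def full_params(d_in, d_out):
    """Parameters in the frozen dense weight matrix."""
    return d_in * d_out


# Hypothetical square projection layer, adapted at rank 32.
d_in = d_out = 5120
added = lora_params(d_in, d_out, r=32)
fraction = added / full_params(d_in, d_out)
print(f"{added:,} trainable parameters ({fraction:.2%} of the dense layer)")
```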
    from datasets import load_dataset

    reasoning_dataset = load_dataset("unsloth/OpenMathReasoning-mini", split="cot")
    non_reasoning_dataset = load_dataset("mlabonne/FineTome-100k", split="train")
    We load two pre-curated datasets from the Hugging Face Hub using the datasets library. The reasoning_dataset contains chain-of-thought (CoT) problems from Unsloth’s OpenMathReasoning-mini, designed to enhance logical reasoning in the model. The non_reasoning_dataset pulls general instruction-following data from mlabonne’s FineTome-100k, which helps the model learn broader conversational and task-oriented skills. Together, these datasets support a well-rounded fine-tuning objective.
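    def generate_conversation(examples):
        problems = examples["problem"]
        solutions = examples["generated_solution"]
        conversations = []
        for problem, solution in zip(problems, solutions):
            conversations.append([{"role": "user", "content": problem},
                                  {"role": "assistant", "content": solution}])
        return {"conversations": conversations}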
    This function, generate_conversation, transforms raw question-answer pairs from the reasoning dataset into a chat-style format suitable for fine-tuning. For each problem and its corresponding generated solution, a conversation is constructed in which the user asks the question and the assistant provides the answer. The output is a list of conversations, each a list of role/content dictionaries in the structure expected by chat-based language models, ready for tokenization with a chat template.
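    To make the role of the chat template concrete, the sketch below renders one such conversation into a single training string using a simplified ChatML-style layout. The `render_chat` helper is hypothetical; the real string comes from the tokenizer's own template, which may add further tokens and differ in detail.

```python
def render_chat(conversation):
    """Render a list of {"role", "content"} messages into one string,
    wrapping each turn in ChatML-style start and end markers."""
    parts = []
    for message in conversation:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    return "".join(parts)


# Toy conversation in the structure produced by generate_conversation.
conversation = [
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "2 + 2 = 4."},
]
text = render_chat(conversation)
print(text)
```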
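    reasoning_conversations = tokenizer.apply_chat_template(
        reasoning_dataset.map(generate_conversation, batched=True)["conversations"],
        tokenize=False,
    )

    from unsloth.chat_templates import standardize_sharegpt
    dataset = standardize_sharegpt(non_reasoning_dataset)
    non_reasoning_conversations = tokenizer.apply_chat_template(dataset["conversations"], tokenize=False)

    import pandas as pd

    chat_percentage = 0.75
    non_reasoning_subset = pd.Series(non_reasoning_conversations).sample(
        int(len(reasoning_conversations) * (1.0 - chat_percentage)),
        random_state=2407,
    )

    data = pd.concat([pd.Series(reasoning_conversations),
                      pd.Series(non_reasoning_subset)])
    data.name = "text"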
    We prepare the fine-tuning dataset by converting the reasoning and instruction datasets into a consistent chat format and then combining them. The tokenizer’s apply_chat_template first converts the structured conversations into plain strings ready for tokenization. The standardize_sharegpt function normalizes the instruction dataset into a compatible structure. Then a 75-25 mix is created by sampling a subset of the non-reasoning (chat) conversations, controlled by chat_percentage = 0.75 and a fixed random_state of 2407, and combining it with the reasoning data. This blend exposes the model to both logical reasoning and general instruction-following tasks, improving its versatility. The final combined data is stored as a single Pandas Series named “text”.
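    When choosing a mixing ratio, it helps to work out how many chat examples are needed for a given target share. If all N reasoning examples are kept and chat examples should make up a fraction p of the final set, the number of chat examples is N * p / (1 - p). The sketch below applies that formula with the standard library only; the function names and toy data are assumptions, and this is not necessarily the computation used in the tutorial's code.

```python
import random


def chat_count_for_fraction(n_reasoning, chat_fraction):
    """Number of chat examples to add so that they make up
    `chat_fraction` of the combined dataset."""
    if not 0.0 <= chat_fraction < 1.0:
        raise ValueError("chat_fraction must be in [0, 1)")
    return int(n_reasoning * chat_fraction / (1.0 - chat_fraction))


def mix(reasoning, chat, chat_fraction, seed=2407):
    """Keep every reasoning example, sample the required number of chat
    examples, and shuffle the result with a fixed seed."""
    k = min(len(chat), chat_count_for_fraction(len(reasoning), chat_fraction))
    rng = random.Random(seed)
    combined = list(reasoning) + rng.sample(list(chat), k)
    rng.shuffle(combined)
    return combined


# Toy data: 1000 reasoning strings and 5000 chat strings.
reasoning = [f"reasoning-{i}" for i in range(1000)]
chat = [f"chat-{i}" for i in range(5000)]
combined = mix(reasoning, chat, chat_fraction=0.25)
n_chat = sum(item.startswith("chat-") for item in combined)
print(len(combined), n_chat, round(n_chat / len(combined), 3))
```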
    from datasets import Dataset

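    combined_dataset = Dataset.from_pandas(pd.DataFrame(data))
    combined_dataset = combined_dataset.shuffle(seed=3407)
    from trl import SFTTrainer, SFTConfig

    trainer = SFTTrainer(
        model=model, tokenizer=tokenizer, train_dataset=combined_dataset,
        # Illustrative hyperparameters; tune for your hardware and dataset.
        args=SFTConfig(dataset_text_field="text", per_device_train_batch_size=2, gradient_accumulation_steps=4,
                       warmup_steps=5, max_steps=30, learning_rate=2e-4, logging_steps=1, optim="adamw_8bit",
                       weight_decay=0.01, lr_scheduler_type="linear", seed=3407, report_to="none"),
    )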

    We take the preprocessed conversations, wrap them into a Hugging Face Dataset, and shuffle the dataset with a fixed seed for reproducibility. Then, the fine-tuning trainer is initialized using trl’s SFTTrainer and SFTConfig. The trainer is set up to use the combined dataset (read from its “text” column) and defines training hyperparameters such as batch size, gradient accumulation, number of warmup and training steps, learning rate, optimizer parameters, and a linear learning rate scheduler. This configuration is geared towards efficient fine-tuning while maintaining reproducibility and keeping logging to a minimum.
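    Two of these hyperparameters interact in ways that are easy to compute by hand: gradient accumulation multiplies the effective batch size, and a linear scheduler with warmup ramps the learning rate up and then decays it to zero. The sketch below uses made-up values (batch size 2, 4 accumulation steps, 5 warmup steps, 30 total steps, peak rate 2e-4) purely for illustration.

```python
def effective_batch_size(per_device, grad_accum, n_devices=1):
    """Number of examples contributing to each optimizer step."""
    return per_device * grad_accum * n_devices


def linear_lr(step, base_lr, warmup_steps, total_steps):
    """Linear warmup from 0 to base_lr, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size(per_device=2, grad_accum=4))
for step in (0, 5, 15, 30):
    print(step, linear_lr(step, base_lr=2e-4, warmup_steps=5, total_steps=30))
```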
    trainer.train()

    trainer.train() starts the fine-tuning process for the Qwen3-14B model using the SFTTrainer. It trains the model on the prepared mixed dataset of reasoning and instruction-following conversations, optimizing only the LoRA-adapted parameters thanks to the underlying Unsloth setup. Training proceeds according to the configuration specified earlier (e.g., max_steps=30, batch_size=2, lr=2e-4), and progress is printed at every logging step. This final command launches the actual model adaptation based on your custom data.
    model.save_pretrained("qwen3-finetuned-colab")
    tokenizer.save_pretrained("qwen3-finetuned-colab")

    We save the fine-tuned model and tokenizer locally to the "qwen3-finetuned-colab" directory. By calling save_pretrained(), the adapted weights and tokenizer configuration can be reloaded later for inference or further training, either locally or after uploading them to the Hugging Face Hub.
    In conclusion, with the help of Unsloth AI, fine-tuning large LLMs like Qwen3-14B becomes feasible on limited resources. This tutorial demonstrated how to load a 4-bit quantized version of the model, apply structured chat templates, mix multiple datasets for better generalization, and train using TRL’s SFTTrainer. Whether you’re building custom assistants or specialized domain models, Unsloth’s tools lower the barrier to fine-tuning at scale by making LLM training faster, cheaper, and more practical.

    Check out the Colab Notebook. All credit for this research goes to the researchers of this project.
    Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among audiences.

    WWW.MARKTECHPOST.COM
    A Step-by-Step Coding Guide to Efficiently Fine-Tune Qwen3-14B Using Unsloth AI on Google Colab with Mixed Datasets and LoRA Optimization
    Fine-tuning LLMs often requires extensive resources, time, and memory, challenges that can hinder rapid experimentation and deployment. Unsloth AI enables fast, efficient fine-tuning of state-of-the-art models like Qwen3-14B with minimal GPU memory, leveraging techniques such as 4-bit quantization and LoRA (Low-Rank Adaptation). In this tutorial, we walk through a practical implementation on Google Colab to fine-tune Qwen3-14B using a combination of reasoning and instruction-following datasets. By combining Unsloth’s FastLanguageModel utilities with trl.SFTTrainer, users can achieve strong fine-tuning performance on consumer-grade hardware.

    %%capture
    import os
    if "COLAB_" not in "".join(os.environ.keys()):
        !pip install unsloth
    else:
        !pip install --no-deps bitsandbytes accelerate xformers==0.0.29.post3 peft trl==0.15.2 triton cut_cross_entropy unsloth_zoo
        !pip install sentencepiece protobuf "datasets>=3.4.1" huggingface_hub hf_transfer
        !pip install --no-deps unsloth

    We install all the essential libraries required for fine-tuning the Qwen3 model using Unsloth AI. The cell conditionally installs dependencies based on the environment, using a lightweight approach on Colab to ensure compatibility and reduce overhead. Key components like bitsandbytes, trl, xformers, and unsloth_zoo are included to enable 4-bit quantized training and LoRA-based optimization.

    from unsloth import FastLanguageModel
    import torch

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name = "unsloth/Qwen3-14B",
        max_seq_length = 2048,
        load_in_4bit = True,
        load_in_8bit = False,
        full_finetuning = False,
    )

    We load the Qwen3-14B model using FastLanguageModel from the Unsloth library, which is optimized for efficient fine-tuning. It initializes the model with a context length of 2048 tokens and loads it in 4-bit precision, significantly reducing memory usage. Full fine-tuning is disabled, making it suitable for lightweight parameter-efficient techniques like LoRA.

    model = FastLanguageModel.get_peft_model(
        model,
        r = 32,
        target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                          "gate_proj", "up_proj", "down_proj"],
        lora_alpha = 32,
        lora_dropout = 0,
        bias = "none",
        use_gradient_checkpointing = "unsloth",
        random_state = 3407,
        use_rslora = False,
        loftq_config = None,
    )

    We apply LoRA (Low-Rank Adaptation) to the Qwen3 model using FastLanguageModel.get_peft_model. It injects trainable adapters into specific transformer layers (like q_proj, v_proj, etc.) with a rank of 32, enabling efficient fine-tuning while keeping most model weights frozen. Using "unsloth" gradient checkpointing further optimizes memory usage, making it suitable for training large models on limited hardware.

    from datasets import load_dataset

    reasoning_dataset = load_dataset("unsloth/OpenMathReasoning-mini", split="cot")
    non_reasoning_dataset = load_dataset("mlabonne/FineTome-100k", split="train")

    We load two pre-curated datasets from the Hugging Face Hub using the datasets library. The reasoning_dataset contains chain-of-thought (CoT) problems from Unsloth’s OpenMathReasoning-mini, designed to enhance logical reasoning in the model. The non_reasoning_dataset pulls general instruction-following data from mlabonne’s FineTome-100k, which helps the model learn broader conversational and task-oriented skills. Together, these datasets support a well-rounded fine-tuning objective.

    def generate_conversation(examples):
        problems = examples["problem"]
        solutions = examples["generated_solution"]
        conversations = []
        for problem, solution in zip(problems, solutions):
            conversations.append([
                {"role": "user", "content": problem},
                {"role": "assistant", "content": solution},
            ])
        return {"conversations": conversations}

    This function, generate_conversation, transforms raw question–answer pairs from the reasoning dataset into a chat-style format suitable for fine-tuning. For each problem and its corresponding generated solution, it builds a conversation in which the user asks the question and the assistant provides the answer. It returns a dictionary whose "conversations" entry is a list of such exchanges, following the structure expected by chat-based language models and preparing the data for tokenization with a chat template.

    # The "conversations" column only exists after generate_conversation has been mapped over the dataset.
    reasoning_dataset = reasoning_dataset.map(generate_conversation, batched=True)
    reasoning_conversations = tokenizer.apply_chat_template(
        reasoning_dataset["conversations"],
        tokenize=False,
    )

    from unsloth.chat_templates import standardize_sharegpt
    dataset = standardize_sharegpt(non_reasoning_dataset)

    non_reasoning_conversations = tokenizer.apply_chat_template(
        dataset["conversations"],
        tokenize=False,
    )

    import pandas as pd

    chat_percentage = 0.75
    # Sample a non-reasoning subset whose size is 25% of the reasoning set.
    non_reasoning_subset = pd.Series(non_reasoning_conversations).sample(
        int(len(reasoning_conversations) * (1.0 - chat_percentage)),
        random_state=2407,
    )

    data = pd.concat([
        pd.Series(reasoning_conversations),
        pd.Series(non_reasoning_subset)
    ])
    data.name = "text"

    We prepare the fine-tuning dataset by converting the reasoning and instruction datasets into a consistent chat format and then combining them. The code first applies the tokenizer’s apply_chat_template to convert structured conversations into tokenizable strings. The standardize_sharegpt function normalizes the instruction dataset into a compatible structure. Then a subset of the non-reasoning (instruction) conversations is sampled, sized at 25% of the number of reasoning conversations, and combined with the reasoning data, so the final mix is roughly four reasoning examples for every instruction example. This blend ensures the model is exposed to logical reasoning and general instruction-following tasks, improving its versatility during training. The final combined data is stored as a single-column Pandas Series named "text".

    from datasets import Dataset

    combined_dataset = Dataset.from_pandas(pd.DataFrame(data))
    combined_dataset = combined_dataset.shuffle(seed=3407)

    from trl import SFTTrainer, SFTConfig

    trainer = SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=combined_dataset,
        eval_dataset=None,
        args=SFTConfig(
            dataset_text_field="text",
            per_device_train_batch_size=2,
            gradient_accumulation_steps=4,
            warmup_steps=5,
            max_steps=30,
            learning_rate=2e-4,
            logging_steps=1,
            optim="adamw_8bit",
            weight_decay=0.01,
            lr_scheduler_type="linear",
            seed=3407,
            report_to="none",
        )
    )

    We take the preprocessed conversations, wrap them into a Hugging Face Dataset (ensuring the data is in a consistent format), and shuffle the dataset with a fixed seed for reproducibility. Then, the fine-tuning trainer is initialized using trl’s SFTTrainer and SFTConfig. The trainer is set up to use the combined dataset (with the text column named "text") and defines training hyperparameters like batch size, gradient accumulation, number of warmup and training steps, learning rate, optimizer parameters, and a linear learning rate scheduler. This configuration is geared towards efficient fine-tuning while maintaining reproducibility and keeping logging minimal (report_to="none").

    trainer.train()

    trainer.train() starts the fine-tuning process for the Qwen3-14B model using the SFTTrainer. It trains the model on the prepared mixed dataset of reasoning and instruction-following conversations, optimizing only the LoRA-adapted parameters thanks to the underlying Unsloth setup. Training proceeds according to the configuration specified earlier (e.g., max_steps=30, batch_size=2, lr=2e-4), and progress is printed at every logging step. This final command launches the actual model adaptation based on your custom data.

    model.save_pretrained("qwen3-finetuned-colab")
    tokenizer.save_pretrained("qwen3-finetuned-colab")

    We save the fine-tuned model and tokenizer locally to the "qwen3-finetuned-colab" directory. By calling save_pretrained(), the adapted weights and tokenizer configuration can be reloaded later for inference or further training, either locally or after uploading them to the Hugging Face Hub.

    In conclusion, with the help of Unsloth AI, fine-tuning large LLMs like Qwen3-14B becomes feasible on limited resources. This tutorial demonstrated how to load a 4-bit quantized version of the model, apply structured chat templates, mix multiple datasets for better generalization, and train using TRL’s SFTTrainer. Whether you’re building custom assistants or specialized domain models, Unsloth’s tools lower the barrier to fine-tuning at scale by making LLM training faster, cheaper, and more practical.

    Check out the Colab Notebook. All credit for this research goes to the researchers of this project.
  • Lessons in Decision Making from the Monty Hall Problem

    The Monty Hall Problem is a well-known brain teaser from which we can learn important lessons in Decision Making that are useful in general and in particular for data scientists.

    If you are not familiar with this problem, prepare to be perplexed. If you are, I hope to shine light on aspects that you might not have considered.

    I introduce the problem and solve it with three types of intuition:

    Common: The heart of this post focuses on applying our common sense to solve this problem. We’ll explore why it fails us and what we can do to intuitively overcome this to make the solution crystal clear. We’ll do this by using visuals, qualitative arguments and some basic probabilities.

    Bayesian: We will briefly discuss the importance of belief propagation.

    Causal: We will use a graph model to visualise the conditions required to use the Monty Hall problem in real-world settings. Spoiler alert: I haven’t been convinced that there are any, but the thought process is very useful.

    I summarise by discussing lessons learnt for better data decision making.

    Regarding the Bayesian and Causal intuitions, these will be presented in a gentle form. For the mathematically inclined I also provide supplementary sections with short Deep Dives into each approach after the summary. By examining different aspects of this probability puzzle, you will hopefully be able to improve your data decision making.

    Credit: Wikipedia

    First, some history. Let’s Make a Deal is a USA television game show that originated in 1963. As its premise, audience participants were considered traders making deals with the host, Monty Hall.

    At the heart of the matter is an apparently simple scenario:

    A trader is asked to choose one of three doors for the opportunity to win a luxurious prize, e.g., a car. Behind the other two are goats.

    The trader is shown three closed doors.

    The trader chooses one of the doors. Let’s call this door A.

    Keeping the chosen door closed, the host reveals one of the remaining doors showing a goat.

    The trader chooses door A and the host reveals door C, showing a goat.

    The host then asks the trader if they would like to stick with their first choice or switch to the other remaining one.

    If the trader guesses correctly, they win the prize. If not, they’ll be shown another goat.

    What is the probability of being Zonked? Credit: Wikipedia

    Should the trader stick with their original choice of door A or switch to B?

    Before reading further, give it a go. What would you do?

    Most people are likely to have a gut intuition that “it doesn’t matter”, arguing that in the first instance each door had a ⅓ chance of hiding the prize, and that after the host’s intervention, when only two doors remain closed, the chance of winning the prize is 50:50.

    There are various ways of explaining why the coin toss intuition is incorrect. Most of these involve maths equations or simulations. We will address these later, but first we’ll attempt to solve the problem by applying Occam’s razor:

    A principle that states that simpler explanations are preferable to more complex ones (William of Ockham).

    To do this it is instructive to slightly redefine the problem to a large number N of doors instead of the original three.

    The Large N-Door Problem

    Similar to before: you have to choose one of many doors. For illustration let’s say N=100. Behind one of the doors is the prize and behind the other 99 are goats.

    The 100 Door Monty Hall problem before the host intervention.

    You choose one door and the host reveals 98 of the other doors, all of which have goats, leaving yours and one more closed.

    The 100 Door Monty Hall Problem after the host intervention. Should you stick with your door or make the switch?

    Should you stick with your original choice or make the switch?

    I think you’ll agree with me that the remaining door, not chosen by you, is much more likely to conceal the prize … so you should definitely make the switch!

    It’s illustrative to compare both scenarios discussed so far. In the next figure we compare the post host intervention settings for the N=3 setup and that of N=100:

    Post intervention settings for the N=3 setup and N=100.

    In both cases we see two shut doors, one of which we’ve chosen. The main difference between these scenarios is that in the first we see one goat and in the second there are more than the eye would care to see.

    Why do most people consider the first case a “50:50” toss-up, while in the second it’s obvious to make the switch?

    We’ll soon address this question of why. First let’s put probabilities of success behind the different scenarios.

    What’s The Frequency, Kenneth?

    So far we learnt from the N=100 scenario that switching doors is obviously beneficial. Inferring the same for N=3 may be a leap of faith for most. Using some basic probability arguments, here we’ll quantify why it is favourable to make the switch for any number of doors N.

    We start with the standard Monty Hall problem. When it starts, the probability of the prize being behind each of the doors A, B and C is p=⅓. To be explicit, let’s define the Y parameter to be the door with the prize, i.e., p(Y=A)=p(Y=B)=p(Y=C)=⅓.

    The trick to solving this problem is that once the trader’s door A has been chosen, we should pay close attention to the set of the other doors {B,C}, which has the probability of p(Y∈{B,C})=p(Y=B)+p(Y=C)=⅔. This visual may help make sense of this:

    By being attentive to the {B,C} set the rest should follow. When the goat is revealed it is apparent that the probabilities post intervention change. (Note that for ease of reading I’ll drop the Y notation, where p(Y=A) will read p(A) and p(Y=B) will read p(B). Also, for completeness, the full terms after the intervention should be even longer because they are conditional, e.g., p(Y=A|Z=C), p(Y=B|Z=C), where Z is a parameter representing the choice of the host.)

    p(A) remains ⅓

    p({B,C})=p(B)+p(C) remains ⅔,

    p(C)=0; we just learnt that the goat is behind door C, not the prize.

    p(B)=p({B,C}) - p(C)=⅔

    For anyone with the information provided by the host, this means that it isn’t a toss of a fair coin! For them the fact that p(C) became zero does not “raise all other boats”, but rather p(A) remains the same and p(B) gets doubled.

    The bottom line is that the trader should consider p(A)=⅓ and p(B)=⅔, hence by switching they are doubling the odds of winning!
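    The ⅓ versus ⅔ split can be checked by brute-force enumeration. The sketch below assumes the standard rules (the host always opens a goat door that is not the trader's, and picks at random when two goat doors are available); the function name is illustrative.

```python
from fractions import Fraction

def outcome_probabilities(doors=("A", "B", "C"), choice="A"):
    """Exact probabilities of winning by staying vs switching in the 3-door game."""
    stay = Fraction(0)
    switch = Fraction(0)
    for prize in doors:
        p_prize = Fraction(1, len(doors))
        # The host may only open a door that is neither chosen nor hiding the prize.
        host_options = [d for d in doors if d not in (choice, prize)]
        for opened in host_options:
            p = p_prize * Fraction(1, len(host_options))
            # The single door left closed besides the trader's own.
            remaining = next(d for d in doors if d not in (choice, opened))
            if choice == prize:
                stay += p
            if remaining == prize:
                switch += p
    return stay, switch

stay, switch = outcome_probabilities()
print(stay, switch)
```

    Every branch in which the prize is not behind door A forces the host's hand, and that is where the extra ⅓ for the remaining door comes from.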

    Let’s generalise to N.

    When we start, all doors have odds of winning the prize p=1/N. After the trader chooses one door, which we’ll call D₁, meaning p(Y=D₁)=1/N, we should now pay attention to the remaining set of doors {D₂, …, Dₙ}, which has a chance of p(Y∈{D₂, …, Dₙ})=(N-1)/N.

    When the host reveals the (N-2) doors {D₃, …, Dₙ} with goats:

    p(D₁) remains 1/N

    p({D₂, …, Dₙ})=p(D₂)+p(D₃)+… + p(Dₙ) remains (N-1)/N

    p(D₃)=p(D₄)= …=p(Dₙ₋₁)=p(Dₙ)=0; we just learnt that they have goats, not the prize.

    p(D₂)=p({D₂, …, Dₙ}) - p(D₃) - … - p(Dₙ)=(N-1)/N

    The trader should now consider two door values p(D₁)=1/N and p(D₂)=(N-1)/N.

    Hence the probability of winning improves by a factor of N-1 when switching! In the case of N=100, this means a factor of 99.

    The improvement of odds ratios in all scenarios between N=3 to 100 may be seen in the following graph. The thin line is the probability of winning by choosing any door prior to the intervention, p=1/N. Note that it also represents the chance of winning after the intervention if they decide to stick to their guns and not switch, p(D₁). The thick line is the probability of winning the prize after the intervention if the door is switched, p(D₂)=(N-1)/N:

    Probability of winning as a function of N. p=p(D₁)=1/N is the thin line; p(D₂)=(N-1)/N is the thick one.

    Perhaps the most interesting aspect of this graph is that the N=3 case has the highest probability before the host intervention, but the lowest probability after, and vice versa for N=100.

    Another interesting feature is the quick climb in the probability of winning for the switchers:

    N=3: p=67%

    N=4: p=75%

    N=5: p=80%

    The switchers’ curve gradually approaches an asymptote of 100%; at N=99 it is 98.99% and at N=100 it equals 99%.
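    These values can also be reproduced with a quick Monte Carlo simulation. This is a minimal sketch under the standard rules, where the host opens every door except the trader's and one other, never revealing the prize; the function names, trial count and seed are arbitrary choices.

```python
import random

def play(n_doors: int, switch: bool, rng: random.Random) -> bool:
    """Play one round and report whether the trader wins the prize."""
    prize = rng.randrange(n_doors)
    choice = rng.randrange(n_doors)
    if choice == prize:
        # Host leaves one random goat door closed alongside the trader's door.
        remaining = rng.choice([d for d in range(n_doors) if d != choice])
    else:
        # Host must leave the prize door closed, since only goats are revealed.
        remaining = prize
    final = remaining if switch else choice
    return final == prize

def win_rate(n_doors: int, switch: bool, trials: int = 20000, seed: int = 0) -> float:
    rng = random.Random(seed)
    return sum(play(n_doors, switch, rng) for _ in range(trials)) / trials

for n in (3, 4, 5, 100):
    print(n, round(win_rate(n, switch=True), 3), round((n - 1) / n, 3))
```

    The simulated switching rate should sit close to (N-1)/N for each N, while calling win_rate with switch=False should hover around 1/N.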

    This starts to address an interesting question:

    Why Is Switching Obvious For Large N But Not N=3?

    The answer is the fact that this puzzle is slightly ambiguous. Only the highly attentive realise that by revealing the goat, the host is actually conveying a lot of information that should be incorporated into one’s calculation. Later we discuss the difference between doing this calculation in one’s mind based on intuition and slowing down by putting pen to paper or coding up the problem.

    How much information is conveyed by the host by intervening?

    A hand-wavy explanation is that this information may be visualised as the gap between the lines in the graph above. For N=3 we saw that the odds of winning doubled, but that doesn’t register as strongly with our common-sense intuition as the factor of 99 in the N=100 case.

    I have also considered describing stronger arguments from Information Theory that provide useful vocabulary to express communication of information. However, I feel that this fascinating field deserves a post of its own, which I’ve published.

    The main takeaway for the Monty Hall problem is that I have calculated the information gain to be a logarithmic function of the number of doors c using this formula:

    Information Gain due to the intervention of the host for a setup with c doors. Full details in my upcoming article.

    For the c=3 door case, e.g., the information gain is ⅔ bits. Full details are in this article on entropy.
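    The formula itself is not reproduced in this text, so the following is only a sketch under one assumption: that "information gain" means the drop in Shannon entropy about the prize location, from a uniform distribution over c doors to the post-intervention distribution (1/c for the chosen door, (c-1)/c for the other closed door). Under that assumption the gain simplifies to ((c-1)/c)·log₂(c-1), which matches the ⅔ bits quoted for c=3.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits, ignoring zero-probability outcomes."""
    return -sum(p * log2(p) for p in probs if p > 0)

def information_gain(c: int) -> float:
    """Assumed definition: entropy before the host intervenes minus entropy after."""
    before = entropy([1 / c] * c)            # prize equally likely behind any door
    after = entropy([1 / c, (c - 1) / c])    # chosen door vs the one other closed door
    return before - after

for c in (3, 10, 100):
    closed_form = (c - 1) / c * log2(c - 1)
    print(c, round(information_gain(c), 4), round(closed_form, 4))
```

    If the article's own formula differs, the c=3 value is the only point this sketch is checked against.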

    To summarise this section, we used basic probability arguments to quantify the probabilities of winning the prize, showing the benefit of switching for all N-door scenarios. For those interested in more formal solutions using Bayesian and causal approaches, I provide supplementary sections at the bottom.

    In the next three final sections we’ll discuss how this problem was received by the general public back in the 1990s, discuss lessons learnt and then summarise how we can apply them in real-world settings.

    Being Confused Is OK

    “No, that is impossible, it should make no difference.” — Paul Erdős

    If you still don’t feel comfortable with the solution of the N=3 Monty Hall problem, don’t worry, you are in good company! According to Vazsonyi¹ even Paul Erdős, who is considered one “of the greatest experts in probability theory”, was confounded until computer simulations were demonstrated to him.

    When the original solution by Steve Selvin² was popularised by Marilyn vos Savant in her column “Ask Marilyn” in Parade magazine in 1990, many readers wrote that Selvin and Savant were wrong³. According to Tierney’s 1991 article in the New York Times, this included about 10,000 readers, including nearly 1,000 with Ph.D. degrees⁴.

    On a personal note, over a decade ago I was exposed to the standard N=3 problem and since then managed to forget the solution numerous times. When I learnt about the large N approach I was quite excited about how intuitive it was. I then failed to explain it to my technical manager over lunch, so this is an attempt to compensate. I still have the same day job.

    While researching this piece I realised that there is a lot to learn about decision making in general, and for data science in particular.

    Lessons Learnt From Monty Hall Problem

    In his book Thinking, Fast and Slow, the late Daniel Kahneman, the co-creator of Behavioural Economics, suggested that we have two types of thought processes:

    System 1, fast thinking: based on intuition. This helps us react fast with confidence to familiar situations.

    System 2, slow thinking: based on deep thought. This helps figure out new complex situations that life throws at us.

    Assuming this premise, you might have noticed that in the above you were applying both.

    By examining the visual of N=100 doors your System 1 kicked in and you immediately knew the answer. I’m guessing that in the N=3 case you were straddling System 1 and System 2. Considering that you had to stop and think a bit when going through the probabilities exercise, that was definitely System 2.

    The decision maker’s struggle between System 1 and System 2. Generated using Gemini Imagen 3

    Beyond fast and slow thinking, I feel that there are a lot of data decision-making lessons that may be learnt.

    Assessing probabilities can be counter-intuitive …

    or

    Be comfortable with shifting to deep thought

    We’ve clearly shown that in the N=3 case. As previously mentioned it confounded many people including prominent statisticians.

    Another classic example is the Birthday Paradox, which shows how we underestimate the likelihood of coincidences. In this problem most people would think that one needs a large group of people before finding a pair sharing the same birthday. It turns out that all you need is 23 people for a 50% chance, and 70 for a 99.9% chance.
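
    The two figures quoted for the Birthday Paradox can be checked with a few lines of arithmetic. The sketch below assumes 365 equally likely birthdays and ignores leap years; the function name `shared_birthday_probability` is illustrative.

```python
def shared_birthday_probability(n: int, days: int = 365) -> float:
    """Probability that at least two of n people share a birthday."""
    p_all_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


for n in (10, 23, 70):
    print(n, round(shared_birthday_probability(n), 4))
```

    The probability first crosses 50% at n=23 and exceeds 99.9% at n=70.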

    One of the most confusing paradoxes in the realm of data analysis is Simpson’s, which I detailed in a previous article. This is a situation where trends of a population may be reversed in its subpopulations.

    What all these paradoxes have in common is that they require us to get comfortable with shifting gears from System 1 fast thinking to System 2 slow thinking. This is also the common theme for the lessons outlined below.

    A few more classical examples are the Gambler’s Fallacy, the Base Rate Fallacy and the Linda Problem. These are beyond the scope of this article, but I highly recommend looking them up to further sharpen ways of thinking about data.

    … especially when dealing with ambiguity

    or

    Search for clarity in ambiguity

    Let’s reread the problem, this time as stated in “Ask Marilyn”:

    Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say №1, and the host, who knows what’s behind the doors, opens another door, say №3, which has a goat. He then says to you, “Do you want to pick door №2?” Is it to your advantage to switch your choice?

    We discussed that the most important piece of information is not made explicit. The statement says that the host “knows what’s behind the doors”, but not whether they open a door at random or deliberately avoid the prize; it is only implicitly understood that the host will never open the door with the car.

    Many real-life problems in data science involve dealing with ambiguity, both in the demands and in the data provided by stakeholders.

    It is crucial for the researcher to track down any relevant piece of information that is likely to have an impact and incorporate it into the solution. Statisticians refer to this as “belief update”.

    With new information we should update our beliefs

    This is the main aspect separating the Bayesian stream of thought from the Frequentist one. The Frequentist approach takes data at face value. The Bayesian approach incorporates prior beliefs and updates them when new findings are introduced. This is especially useful when dealing with ambiguous situations.

    To drive this point home, let’s re-examine this figure comparing the post-intervention N=3 setup and the N=100 one.

    Copied from above. Post-intervention settings for the N=3 setup and N=100.

    In both cases we had a prior belief that all doors had an equal chance of winning the prize p=1/N.

    Once the host opened one door (98 doors in the N=100 case), a lot of valuable information was revealed, although this was much more apparent for N=100 than for N=3.

    In the Frequentist approach, however, most of this information would be ignored, as it only focuses on the two closed doors. The Frequentist conclusion, hence, is a 50% chance to win the prize regardless of what else is known about the situation. Hence the Frequentist takes Paul Erdős’ “no difference” point of view, which we now know to be incorrect.

    This would be reasonable if all that was presented were the two doors and not the intervention and the goats. However, if that information is presented, one should shift gears into System 2 thinking and update their beliefs about the system. This is what we have done by focusing not only on the shut doors, but also on what was learnt about the system at large.

    For the brave-hearted, in a supplementary section below called The Bayesian Point of View I solve the Monty Hall problem using the Bayesian formalism.

    Be one with subjectivity

    The Frequentist’s main reservation about “going Bayes” is that “Statistics should be objective”.

    The Bayesian response is that Frequentists also apply a prior without realising it: a flat one.

    Regardless of the Bayesian/Frequentist debate, as researchers we try our best to be as objective as possible in every step of the analysis.

    That said, it is inevitable that subjective decisions are made throughout.

    E.g., in a skewed distribution should one quote the mean or the median? It highly depends on the context, and hence a subjective decision needs to be made.
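
    To make the mean-versus-median choice concrete, here is a small sketch on made-up numbers. The `incomes` list is purely hypothetical and serves only to show how a single extreme value pulls the mean away from the median.

```python
import statistics

# Hypothetical, right-skewed sample: six typical values and one extreme one.
incomes = [30, 32, 35, 38, 40, 45, 400]

mean_income = statistics.mean(incomes)      # dragged upwards by the outlier
median_income = statistics.median(incomes)  # the middle value, robust to the outlier

print(f"mean = {mean_income:.1f}, median = {median_income}")
```

    Neither summary is wrong; which one to quote depends on whether the question is about the typical case or about the total.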

    The responsibility of the analyst is to provide justification for their choices, first to convince themselves and then their stakeholders.

    When confused, look for a useful analogy

    … but tread with caution

    We saw that by going from the N=3 setup to the N=100 the solution was apparent. This is a trick scientists frequently use — if the problem appears at first a bit too confusing/overwhelming, break it down and try to find a useful analogy.

    It is probably not a perfect comparison, but going from the N=3 setup to N=100 is like examining a picture from up close and zooming out to see the big picture. Think of having only a puzzle piece and then glancing at the jigsaw photo on the box.

    Monty Hall in 1976. Credit: Wikipedia and using Visual Paradigm Online for the puzzle effect

    Note: whereas analogies may be powerful, one should use them with caution so as not to oversimplify. Physicists refer to this situation as the spherical cow method, where models may oversimplify complex phenomena.

    I admit that even with years of experience in applied statistics at times I still get confused about which method to apply. A large part of my thought process is identifying analogies to known solved problems. Sometimes after making progress in a direction I will realise that my assumptions were wrong and seek a new direction. I used to quip with colleagues that they shouldn’t trust me before my third attempt …

    Simulations are powerful but not always necessary

    It’s interesting to learn that Paul Erdős and other mathematicians were convinced only after seeing simulations of the problem.

    I am in two minds about the use of simulations when it comes to problem solving.

    On the one hand, simulations are powerful tools for analysing complex and intractable problems, especially with real-life data where one wants a grasp not only of the underlying formulation, but also of the stochasticity.

    And here is the big BUT: if a problem can be solved analytically, like the Monty Hall one, simulations, as fun as they may be, may not be necessary.
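
    For readers who would still like to see what such a simulation looks like, here is a minimal Monte Carlo sketch. It assumes a host who always leaves exactly one other door closed and never reveals the prize; the names `play` and `win_rate`, the seed and the number of trials are illustrative choices, not part of the original analysis.

```python
import random


def play(switch: bool, n_doors: int, rng: random.Random) -> bool:
    """Play one game and return True if the trader wins the prize."""
    prize = rng.randrange(n_doors)
    choice = rng.randrange(n_doors)
    if choice == prize:
        # The trader is already on the prize: the host leaves a random goat door closed.
        other = rng.choice([d for d in range(n_doors) if d != choice])
    else:
        # Otherwise the host must leave the prize door closed.
        other = prize
    final = other if switch else choice
    return final == prize


def win_rate(switch: bool, n_doors: int = 3, trials: int = 20_000, seed: int = 0) -> float:
    """Fraction of games won over many independent trials."""
    rng = random.Random(seed)
    return sum(play(switch, n_doors, rng) for _ in range(trials)) / trials


print("N=3   stay:", win_rate(False), "switch:", win_rate(True))
print("N=100 stay:", win_rate(False, 100), "switch:", win_rate(True, 100))
```

    The estimated rates land close to the analytical values 1/N for staying and (N-1)/N for switching, which is all the simulation can add here.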

    According to Occam’s razor, all that is required is a brief intuition to explain the phenomenon. This is what I attempted to do here by applying common sense and some basic probability reasoning. For those who enjoy deep dives I provide below supplementary sections with two methods for analytical solutions: one using Bayesian statistics and another using Causality.

    After publishing the first version of this article there was a comment that Savant’s solution³ may be simpler than those presented here. I revisited her communications and agreed that it should be added. In the process I realised three more lessons may be learnt.

    A well designed visual goes a long way

    Continuing the principle of Occam’s razor, Savant explained³ quite convincingly in my opinion:

    You should switch. The first door has a 1/3 chance of winning, but the second door has a 2/3 chance. Here’s a good way to visualize what happened. Suppose there are a million doors, and you pick door #1. Then the host, who knows what’s behind the doors and will always avoid the one with the prize, opens them all except door #777,777. You’d switch to that door pretty fast, wouldn’t you?

    Hence she provided an abstract visual for the readers. I attempted to do the same with the 100 doors figures.

    Marilyn vos Savant who popularised the Monty Hall Problem. Credit: Ben David on Flickr under license

    As mentioned, many readers, especially those with backgrounds in maths and statistics, still weren’t convinced.

    She revised³ with another mental image:

    The benefits of switching are readily proven by playing through the six games that exhaust all the possibilities. For the first three games, you choose #1 and “switch” each time, for the second three games, you choose #1 and “stay” each time, and the host always opens a loser. Here are the results.

    She added a table with all the scenarios. I took some artistic liberty and created the following figure. As indicated, the top batch are the scenarios in which the trader switches and the bottom batch those in which they stay. Lines in green are games which the trader wins, and lines in red are those in which they get zonked. The marker symbolises the door chosen by the trader, and Monty Hall then chooses a different door that has a goat behind it.

    Adaptation of Savant’s table³ of six scenarios that shows the solution to the Monty Hall Problem

    We clearly see from this diagram that the switcher has a ⅔ chance of winning and those that stay only ⅓.

    This is yet another elegant visualisation that clearly explains the non-intuitive.

    It strengthens the claim that there is no real need for simulations in this case because all they would be doing is rerunning these six scenarios.
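
    The six games can also be enumerated exhaustively in a few lines. This is a sketch of Savant’s table under her stated rules (the trader always starts on door #1 and the host always opens a loser); the function name `wins` is illustrative.

```python
from fractions import Fraction

doors = [1, 2, 3]
choice = 1  # as in the table, the trader always starts with door #1


def wins(prize: int, switch: bool) -> bool:
    """Outcome of one game for a given prize location and strategy."""
    if not switch:
        return choice == prize
    # The host opens a goat door that is not the trader's door. When the trader
    # already holds the prize either goat door may be opened; switching loses both ways.
    opened = next(d for d in doors if d != choice and d != prize)
    final = next(d for d in doors if d not in (choice, opened))
    return final == prize


switch_wins = Fraction(sum(wins(p, True) for p in doors), len(doors))
stay_wins = Fraction(sum(wins(p, False) for p in doors), len(doors))
print("switch:", switch_wins, "stay:", stay_wins)  # switch: 2/3 stay: 1/3
```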

    One more popular solution is decision tree illustrations. You can find these on the Wikipedia page, but I find them a bit redundant given Savant’s table.

    The fact that we can solve this problem in so many ways yields another lesson:

    There are many ways to skin a … problem

    One of the many lessons that I have learnt from the writings of the late Richard Feynman, one of the best communicators of physics and ideas, is that a problem can be solved in many ways. Mathematicians and physicists do this all the time.

    A relevant quote that paraphrases Occam’s razor:

    If you can’t explain it simply, you don’t understand it well enough — attributed to Albert Einstein

    And finally:

    Embrace ignorance and be humble

    “You are utterly incorrect … How many irate mathematicians are needed to get you to change your mind?” — Ph.D from Georgetown University

    “May I suggest that you obtain and refer to a standard textbook on probability before you try to answer a question of this type again?” — Ph.D from University of Florida

    “You’re in error, but Albert Einstein earned a dearer place in the hearts of people after he admitted his errors.” — Ph.D. from University of Michigan

    Ouch!

    These are some of the responses from mathematicians to the Parade article.

    Such unnecessary viciousness.

    You can check the reference³ to see the writers’ names and others like them. To whet your appetite: “You blew it, and you blew it big!”, “You made a mistake, but look at the positive side. If all those Ph.D.’s were wrong, the country would be in some very serious trouble.”, “I am in shock that after being corrected by at least three mathematicians, you still do not see your mistake.”.

    And as expected from the 1990s perhaps the most embarrassing one was from a resident of Oregon:

    “Maybe women look at math problems differently than men.”

    These make me cringe and be embarrassed to be associated by gender and Ph.D. title with these graduates and professors.

    Hopefully in the 2020s most people are more humble about their ignorance. Yuval Noah Harari discusses the fact that the Scientific Revolution of Galileo Galilei et al. was not due to knowledge but rather to the admission of ignorance.

    “The great discovery that launched the Scientific Revolution was the discovery that humans do not know the answers to their most important questions” — Yuval Noah Harari

    Fortunately for mathematicians’ image, there were also quite a lot of more enlightened comments. I like this one from one Seth Kalson, Ph.D. of MIT:

    You are indeed correct. My colleagues at work had a ball with this problem, and I dare say that most of them, including me at first, thought you were wrong!

    We’ll summarise by examining how, and if, the Monty Hall problem may be applied in real-world settings, so you can try to relate to projects that you are working on.

    Application in Real World Settings

    Researching for this article I found that beyond artificial setups for entertainment⁶ ⁷ there aren’t practical settings for this problem to use as an analogy. Of course, I may be wrong⁸ and would be glad to hear if you know of one.

    One way of assessing the viability of an analogy is to use arguments from causality, which provide vocabulary that cannot be expressed with standard statistics.

    In a previous post I discussed the fact that the story behind the data is as important as the data itself. In particular Causal Graph Models visualise the story behind the data, which we will use as a framework for a reasonable analogy.

    For the Monty Hall problem we can build a Causal Graph Model like this:

    Reading:

    The door chosen by the trader is independent of the door with the prize, and vice versa. Just as importantly, there is no common cause between them that might generate a spurious correlation.

    The host’s choice depends on both the trader’s chosen door and the door with the prize.

    By comparing causal graphs of two systems one can get a sense for how analogous both are. A perfect analogy would require more details, but this is beyond the scope of this article. Briefly, one would want to ensure similar functions between the parameters.

    Those interested in learning further details about using Causal Graph Models to assess causality in real-world problems may be interested in this article.

    Anecdotally it is also worth mentioning that on Let’s Make a Deal, Monty himself admitted years later to playing mind games with the contestants and not always following the rules, e.g., not always doing the intervention as “it all depends on his mood”⁴.

    In our setup we assumed perfect conditions, i.e., a host that does not stray from the script and/or play on the trader’s emotions. Taking this into consideration would require updating the Graphical Model above, which is beyond the scope of this article.

    Some might be disheartened to realise at this stage of the post that there might not be real world applications for this problem.

    I argue that the lessons learnt from the Monty Hall problem definitely are applicable.

    Just to summarise them again:

    Assessing probabilities can be counter-intuitive …

    … especially when dealing with ambiguity

    With new information we should update our beliefs

    Be one with subjectivity

    When confused, look for a useful analogy … but tread with caution

    Simulations are powerful but not always necessary

    A well designed visual goes a long way

    There are many ways to skin a … problem

    Embrace ignorance and be humble

    While the Monty Hall Problem might seem like a simple puzzle, it offers valuable insights into decision-making, particularly for data scientists. The problem highlights the importance of going beyond intuition and embracing a more analytical, data-driven approach. By understanding the principles of Bayesian thinking and updating our beliefs based on new information, we can make more informed decisions in many aspects of our lives, including data science. The Monty Hall Problem serves as a reminder that even seemingly straightforward scenarios can contain hidden complexities and that by carefully examining available information, we can uncover hidden truths and make better decisions.

    At the bottom of the article I provide a list of resources that I found useful to learn about this topic.

    Credit: Wikipedia


    Credits

    Unless otherwise noted, all images were created by the author.

    Many thanks to Jim Parr, Will Reynolds, and Betty Kazin for their useful comments.

    In the following supplementary sections I derive solutions to the Monty Hall problem from two perspectives:

    Bayesian

    Causal

    Both are motivated by questions in the textbook Causal Inference in Statistics: A Primer by Judea Pearl, Madelyn Glymour, and Nicholas P. Jewell.

    Supplement 1: The Bayesian Point of View

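    This section assumes a basic understanding of Bayes’ theorem, in particular being comfortable with conditional probabilities. In other words, if this makes sense:

    $$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

    We set out to use Bayes’ theorem to prove that switching doors improves chances in the N=3 Monty Hall Problem.

    We define

    X: the chosen door

    Y: the door with the prize

    Z: the door opened by the host

    Labelling the doors as A, B and C, and assuming without loss of generality that the trader chooses X=A and the host opens Z=C, we need to show that:

    $$P(Y=B \mid X=A, Z=C) > P(Y=A \mid X=A, Z=C)$$

    Using Bayes’ theorem, and the fact that X and Y are independent, we write the left side as

    $$P(Y=B \mid X=A, Z=C) = \frac{P(Z=C \mid X=A, Y=B)\,P(Y=B)}{P(Z=C \mid X=A)}$$

    and the right one as:

    $$P(Y=A \mid X=A, Z=C) = \frac{P(Z=C \mid X=A, Y=A)\,P(Y=A)}{P(Z=C \mid X=A)}$$

    Most components are equal (the denominators are identical and P(Y=A) = P(Y=B) = ⅓), so we are left to prove:

    $$P(Z=C \mid X=A, Y=B) > P(Z=C \mid X=A, Y=A)$$

    In the case where Y=B, the host has only one choice, making P(Z=C | X=A, Y=B) = 1.

    In the case where Y=A, the host has two choices, making P(Z=C | X=A, Y=A) = 1/2.

    From here:

    $$P(Z=C \mid X=A, Y=B) = 1 > \tfrac{1}{2} = P(Z=C \mid X=A, Y=A)$$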

    Quod erat demonstrandum.

    Note: if the “host choices” arguments didn’t make sense look at the table below showing this explicitly. You will want to compare entries {X=A, Y=B, Z=C} and {X=A, Y=A, Z=C}.
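
    The Bayesian update can also be carried out numerically. The sketch below uses exact fractions and the host behaviour described in this section (never opens the chosen door or the prize door, and picks at random when two goat doors are available); the function names `host_likelihood` and `posterior` are illustrative.

```python
from fractions import Fraction

doors = ["A", "B", "C"]


def host_likelihood(z: str, x: str, y: str) -> Fraction:
    """P(Z=z | X=x, Y=y): probability that the host opens door z."""
    if z == x or z == y:
        return Fraction(0)  # never the chosen door, never the prize door
    return Fraction(1, 2) if x == y else Fraction(1)


def posterior(x: str, z: str) -> dict:
    """P(Y=y | X=x, Z=z) for every door y, starting from a flat prior on Y."""
    prior = Fraction(1, 3)
    joint = {y: host_likelihood(z, x, y) * prior for y in doors}
    evidence = sum(joint.values())  # P(Z=z | X=x)
    return {y: joint[y] / evidence for y in doors}


post = posterior(x="A", z="C")
print({y: str(p) for y, p in post.items()})  # {'A': '1/3', 'B': '2/3', 'C': '0'}
```

    The posterior for door B is twice that of door A, in line with the inequality derived above.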

    Supplement 2: The Causal Point of View

    A basic understanding of Directed Acyclic Graphs (DAGs) and Structural Causal Models (SCMs) is useful for this section, but not required. In brief:

    DAGs qualitatively visualise the causal relationships between the parameter nodes.

    SCMs quantitatively express the formula relationships between the parameters.

    Given the DAG X → Z ← Y,

    we are going to define the SCM that corresponds to the classic N=3 Monty Hall problem and use it to describe the joint distribution of all variables. We will later generalise to N doors.

    We define

    X — the chosen door

    Y — the door with the prize

    Z — the door opened by the host

    According to the DAG, the chain rule gives the joint distribution:

    $$P(X, Y, Z) = P(Z \mid X, Y)\,P(X)\,P(Y)$$

    The SCM is defined by exogenous variables U, endogenous variables V, and the functions between them F:

    U = {X,Y}, V={Z}, F= {f}

    where X, Y and Z have door values:

    D = {A, B, C}

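    The host’s choice is given by f, defined through the conditional probabilities:

    $$P(Z=z \mid X=x, Y=y) = \begin{cases} 0 & \text{if } z = x \text{ or } z = y \\ 1/2 & \text{if } x = y \text{ and } z \neq x \\ 1 & \text{if } x \neq y \text{ and } z \notin \{x, y\} \end{cases}$$

    In order to generalise to N doors, the DAG remains the same, but the SCM requires updating D to be a set of N doors: D = {D₁, D₂, …, Dₙ}.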

    Exploring Example Scenarios

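    To gain an intuition for this SCM, let’s examine 6 examples of 27:

    When X=Y (e.g., X=A, Y=A)

    P(Z=A | X=A, Y=A) = 0; the host cannot choose the participant’s door

    P(Z=B | X=A, Y=A) = 1/2; the prize is behind A → the host chooses B at 50%

    P(Z=C | X=A, Y=A) = 1/2; the prize is behind A → the host chooses C at 50%

    When X≠Y (e.g., X=A, Y=B)

    P(Z=A | X=A, Y=B) = 0; the host cannot choose the participant’s door

    P(Z=B | X=A, Y=B) = 0; the host cannot choose the prize door

    P(Z=C | X=A, Y=B) = 1; the host has no choice in the matter

    Calculating Joint Probabilities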

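    Using logic, let’s code up all 27 possibilities in Python:

    import pandas as pd

    doors = ["A", "B", "C"]

    # One row per (X, Y, Z) combination: 3 x 3 x 3 = 27 rows
    df = pd.DataFrame({"X": [d for d in doors for _ in range(9)],
                       "Y": [d for d in doors for _ in range(3)] * 3,
                       "Z": doors * 9})

    df["P(Z|X,Y)"] = None
    p_x = 1./3  # the trader picks each door with equal probability
    p_y = 1./3  # the prize is equally likely to be behind each door

    x, y, z = df["X"], df["Y"], df["Z"]
    df.loc[(x == y) & (z == x), "P(Z|X,Y)"] = 0    # X=Y: cannot open the participant's door
    df.loc[(x == y) & (z != x), "P(Z|X,Y)"] = 0.5  # X=Y: two goat doors, 50% each
    df.loc[(x != y) & (z == x), "P(Z|X,Y)"] = 0    # X!=Y: cannot open the participant's door
    df.loc[(x != y) & (z == y), "P(Z|X,Y)"] = 0    # X!=Y: cannot open the prize door
    df.loc[(x != y) & (z != x) & (z != y), "P(Z|X,Y)"] = 1  # X!=Y: only one door left

    # Joint distribution from the chain rule P(X,Y,Z) = P(Z|X,Y) P(X) P(Y)
    df["P(X,Y,Z)"] = df["P(Z|X,Y)"].astype(float) * p_x * p_y
    print(f"Sum of P(X,Y,Z) over all rows: {df['P(X,Y,Z)'].sum()}")

    df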

    yields

    Resources

    This Quora discussion by Joshua Engel helped me shape a few aspects of this article.

    Causal Inference in Statistics: A Primer / Pearl, Glymour & Jewell (excellent short textbook)

    I also very much enjoy Tim Harford’s podcast Cautionary Tales. He wrote about this topic on November 3rd 2017 for the Financial Times: Monty Hall and the game show stick-or-switch conundrum

    Footnotes

    ¹ Vazsonyi, Andrew. “Which Door Has the Cadillac?”. Decision Line: 17–19. Archived from the original on 13 April 2014. Retrieved 16 October 2012.

    ² Steve Selvin, letter to The American Statistician, 1975.

    ³ Game Show Problem by Marilyn vos Savant, “Ask Marilyn”, marilynvossavant.com: “This material in this article was originally published in PARADE magazine in 1990 and 1991”

    ⁴ Tierney, John. “Behind Monty Hall’s Doors: Puzzle, Debate and Answer?”. The New York Times. Retrieved 18 January 2008.

    ⁵ Kahneman, D. (2011). Thinking, Fast and Slow. Farrar, Straus and Giroux.

    ⁶ MythBusters Episode 177 “Pick a Door”: watch MythBusters’ approach

    ⁷ Monty Hall Problem on Survivor Season 41: watch Survivor’s take on the problem

    ⁸ Jingyi Jessica Li, How the Monty Hall problem is similar to the false discovery rate in high-throughput data analysis. Whereas the author points out “similarities” between hypothesis testing and the Monty Hall problem, I think that this is a bit misleading. The author is correct that both problems change by the order in which processes are done, but that is part of Bayesian statistics in general, not limited to the Monty Hall problem.
    The post Lessons in Decision Making from the Monty Hall Problem appeared first on Towards Data Science.
He then says to you, “Do you want to pick door №2?” Is it to your advantage to switch your choice? We discussed that the most important piece of information is not made explicit. It says that the host “knows what’s behind the doors”, but not that they open a door at random, although it’s implicitly understood that the host will never open the door with the car. Many real life problems in data science involve dealing with ambiguous demands as well as in data provided by stakeholders. It is crucial for the researcher to track down any relevant piece of information that is likely to have an impact and update that into the solution. Statisticians refer to this as “belief update”.With new information we should update our beliefs This is the main aspect separating the Bayesian stream of thought to the Frequentist. The Frequentist approach takes data at face value. The Bayesian approach incorporates prior beliefs and updates it when new findings are introduced. This is especially useful when dealing with ambiguous situations. To drive this point home, let’s re-examine this figure comparing between the post intervention N=3 setupsand the N=100 one. Copied from above. Post intervention settings for the N=3 setupand N=100. In both cases we had a prior belief that all doors had an equal chance of winning the prize p=1/N. Once the host opened one doora lot of valuable information was revealed whereas in the case of N=100 it was much more apparent than N=3. In the Frequentist approach, however, most of this information would be ignored, as it only focuses on the two closed doors. The Frequentist conclusion, hence is a 50% chance to win the prize regardless of what else is known about the situation. Hence the Frequentist takes Paul Erdős’ “no difference” point of view, which we now know to be incorrect. This would be reasonable if all that was presented were the two doors and not the intervention and the goats. 
However, if that information is presented, one should shift gears into System 2 thinking and update their beliefs in the system. This is what we have done by focusing not only on the shut door, but rather consider what was learnt about the system at large. For the brave hearted , in a supplementary section below called The Bayesian Point of View I solve for the Monty Hall problem using the Bayesian formalism.Be one with subjectivity The Frequentist main reservation about “going Bayes” is that — “Statistics should be objective”. The Bayesian response is — the Frequentist’s also apply a prior without realising it — a flat one. Regardless of the Bayesian/Frequentist debate, as researchers we try our best to be as objective as possible in every step of the analysis. That said, it is inevitable that subjective decisions are made throughout. E.g, in a skewed distribution should one quote the mean or median? It highly depends on the context and hence a subjective decision needs to be made. The responsibility of the analyst is to provide justification for their choices first to convince themselves and then their stakeholders.When confused — look for a useful analogy … but tread with caution We saw that by going from the N=3 setup to the N=100 the solution was apparent. This is a trick scientists frequently use — if the problem appears at first a bit too confusing/overwhelming, break it down and try to find a useful analogy. It is probably not a perfect comparison, but going from the N=3 setup to N=100 is like examining a picture from up close and zooming out to see the big picture. Think of having only a puzzle piece and then glancing at the jigsaw photo on the box. Monty Hall in 1976. Credit: Wikipedia and using Visual Paradigm Online for the puzzle effect Note: whereas analogies may be powerful, one should do so with caution, not to oversimplify. Physicists refer to this situation as the spherical cow method, where models may oversimplify complex phenomena. 
I admit that even with years of experience in applied statistics at times I still get confused at which method to apply. A large part of my thought process is identifying analogies to known solved problems. Sometimes after making progress in a direction I will realise that my assumptions were wrong and seek a new direction. I used to quip with colleagues that they shouldn’t trust me before my third attempt …Simulations are powerful but not always necessary It’s interesting to learn that Paul Erdős and other mathematicians were convinced only after seeing simulations of the problem. I am two-minded about usage of simulations when it comes to problem solving. On the one hand simulations are powerful tools to analyse complex and intractable problems. Especially in real life data in which one wants a grasp not only of the underlying formulation, but also stochasticity. And here is the big BUT — if a problem can be analytically solved like the Monty Hall one, simulations as fun as they may be, may not be necessary. According to Occam’s razor, all that is required is a brief intuition to explain the phenomena. This is what I attempted to do here by applying common sense and some basic probability reasoning. For those who enjoy deep dives I provide below supplementary sections with two methods for analytical solutions — one using Bayesian statistics and another using Causality.After publishing the first version of this article there was a comment that Savant’s solution³ may be simpler than those presented here. I revisited her communications and agreed that it should be added. In the process I realised three more lessons may be learnt.A well designed visual goes a long way Continuing the principle of Occam’s razor, Savant explained³ quite convincingly in my opinion: You should switch. The first door has a 1/3 chance of winning, but the second door has a 2/3 chance. Here’s a good way to visualize what happened. Suppose there are a million doors, and you pick door #1. 
Then the host, who knows what’s behind the doors and will always avoid the one with the prize, opens them all except door #777,777. You’d switch to that door pretty fast, wouldn’t you?

Hence she provided an abstract visual for the readers. I attempted to do the same with the 100-door figures.

Figure: Marilyn vos Savant, who popularised the Monty Hall Problem. Credit: Ben David on Flickr under license.

As mentioned, many readers, especially those with backgrounds in maths and statistics, still weren’t convinced. She followed up³ with another mental image:

The benefits of switching are readily proven by playing through the six games that exhaust all the possibilities. For the first three games, you choose #1 and “switch” each time, for the second three games, you choose #1 and “stay” each time, and the host always opens a loser. Here are the results.

She added a table with all the scenarios. I took some artistic liberty and created the following figure. As indicated, the top batch are the scenarios in which the trader switches and the bottom batch those in which they stay. Lines in green are games which the trader wins, and lines in red are games in which they get zonked. A marker symbolises the door chosen by the trader, and Monty Hall then chooses a different door that has a goat behind it.

Figure: Adaptation of Savant’s table³ of six scenarios that shows the solution to the Monty Hall Problem.

We clearly see from this diagram that the switcher has a ⅔ chance of winning and those who stay only ⅓. This is yet another elegant visualisation that clearly explains the non-intuitive. It strengthens the claim that there is no real need for simulations in this case, because all they would be doing is rerunning these six scenarios.

One more popular solution is decision tree illustrations. You can find these on the Wikipedia page, but I find them a bit redundant next to Savant’s table.
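The six games in Savant’s table are few enough to enumerate directly. The sketch below is an illustration rather than her original table: it assumes, as in her description, that the trader always starts with door #1, and the helper names are hypothetical.

```python
from fractions import Fraction

DOORS = (1, 2, 3)
PICK = 1  # as in the quoted description, the trader always starts with door #1


def trader_wins(prize: int, switch: bool) -> bool:
    # The host opens a goat door: neither the trader's pick nor the prize door.
    # When two doors qualify, either choice leads to the same win/lose outcome.
    opened = next(d for d in DOORS if d != PICK and d != prize)
    if switch:
        final = next(d for d in DOORS if d != PICK and d != opened)
    else:
        final = PICK
    return final == prize


def win_probability(switch: bool) -> Fraction:
    # Each prize location is equally likely, so count wins over the three games.
    wins = sum(trader_wins(prize, switch) for prize in DOORS)
    return Fraction(wins, len(DOORS))


for switch in (True, False):
    label = "switch" if switch else "stay"
    print(label, [trader_wins(prize, switch) for prize in DOORS], win_probability(switch))
```

Three prize locations times two strategies gives the six games; counting the wins in each batch recovers the ⅔ and ⅓ figures without any random sampling.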
The fact that we can solve this problem in so many ways yields another lesson:There are many ways to skin a … problem Of the many lessons that I have learnt from the writings of late Richard Feynman, one of the best physics and ideas communicators, is that a problem can be solved many ways. Mathematicians and Physicists do this all the time. A relevant quote that paraphrases Occam’s razor: If you can’t explain it simply, you don’t understand it well enough — attributed to Albert Einstein And finallyEmbrace ignorance and be humble ‍ “You are utterly incorrect … How many irate mathematicians are needed to get you to change your mind?” — Ph.D from Georgetown University “May I suggest that you obtain and refer to a standard textbook on probability before you try to answer a question of this type again?” — Ph.D from University of Florida “You’re in error, but Albert Einstein earned a dearer place in the hearts of people after he admitted his errors.” — Ph.D. from University of Michigan Ouch! These are some of the said responses from mathematicians to the Parade article. Such unnecessary viciousness. You can check the reference³ to see the writer’s names and other like it. To whet your appetite: “You blew it, and you blew it big!”, , “You made a mistake, but look at the positive side. If all those Ph.D.’s were wrong, the country would be in some very serious trouble.”, “I am in shock that after being corrected by at least three mathematicians, you still do not see your mistake.”. And as expected from the 1990s perhaps the most embarrassing one was from a resident of Oregon: “Maybe women look at math problems differently than men.” These make me cringe and be embarrassed to be associated by gender and Ph.D. title with these graduates and professors. Hopefully in the 2020s most people are more humble about their ignorance. Yuval Noah Harari discusses the fact that the Scientific Revolution of Galileo Galilei et al. 
was not due to knowledge but rather the admission of ignorance.

“The great discovery that launched the Scientific Revolution was the discovery that humans do not know the answers to their most important questions” (Yuval Noah Harari)

Fortunately for mathematicians’ image, there were also quite a lot of more enlightened comments. I like this one from one Seth Kalson, Ph.D. of MIT:

You are indeed correct. My colleagues at work had a ball with this problem, and I dare say that most of them, including me at first, thought you were wrong!

We’ll summarise by examining how, and if, the Monty Hall problem may be applied in real-world settings, so you can try to relate it to projects that you are working on.

Application in Real World Settings

Researching for this article I found that beyond artificial setups for entertainment⁶ ⁷ there aren’t practical settings for which this problem serves as an analogy. Of course, I may be wrong⁸ and would be glad to hear if you know of one.

One way of assessing the viability of an analogy is using arguments from causality, which provides vocabulary for ideas that cannot be expressed with standard statistics. In a previous post I discussed the fact that the story behind the data is as important as the data itself. In particular, Causal Graph Models visualise the story behind the data, which we will use as a framework for a reasonable analogy.

For the Monty Hall problem we can build a Causal Graph Model like this:

trader’s door → host’s door ← prize door

Reading: The door chosen by the trader is independent of the door with the prize, and vice versa. Just as important, there is no common cause between them that might generate a spurious correlation. The host’s choice depends on both the trader’s door and the prize door.

By comparing the causal graphs of two systems one can get a sense of how analogous they are. A perfect analogy would require more details, but this is beyond the scope of this article. Briefly, one would want to ensure similar functions between the parameters.
Those interested in learning further details about using Causal Graph Models to assess causality in real-world problems may be interested in this article.

Anecdotally, it is also worth mentioning that on Let’s Make a Deal, Monty himself admitted years later to playing mind games with the contestants and did not always follow the rules, e.g., not always doing the intervention, as “it all depends on his mood”⁴. In our setup we assumed perfect conditions, i.e., a host that does not stray from the script and/or play on the trader’s emotions. Taking this into consideration would require updating the Graphical Model above, which is beyond the scope of this article.

Some might be disheartened to realise at this stage of the post that there might not be real-world applications for this problem. I argue that the lessons learnt from the Monty Hall problem definitely are applicable. Just to summarise them again:

(1) Assessing probabilities can be counter-intuitive …
(2) … especially when dealing with ambiguity
(3) With new information we should update our beliefs
(4) Be one with subjectivity
(5) When confused, look for a useful analogy … but tread with caution
(6) Simulations are powerful but not always necessary
(7) A well designed visual goes a long way
(8) There are many ways to skin a … problem
(9) Embrace ignorance and be humble

While the Monty Hall Problem might seem like a simple puzzle, it offers valuable insights into decision-making, particularly for data scientists. The problem highlights the importance of going beyond intuition and embracing a more analytical, data-driven approach. By understanding the principles of Bayesian thinking and updating our beliefs based on new information, we can make more informed decisions in many aspects of our lives, including data science. The Monty Hall Problem serves as a reminder that even seemingly straightforward scenarios can contain hidden complexities and that by carefully examining the available information we can make better decisions.
At the bottom of the article I provide a list of resources that I found useful to learn about this topic.

Credits

Unless otherwise noted, all images were created by the author. Many thanks to Jim Parr, Will Reynolds, and Betty Kazin for their useful comments.

In the following supplementary sections I derive solutions to the Monty Hall problem from two perspectives: Bayesian and Causal. Both are motivated by questions in the textbook Causal Inference in Statistics: A Primer by Judea Pearl, Madelyn Glymour, and Nicholas P. Jewell.

Supplement 1: The Bayesian Point of View

This section assumes a basic understanding of Bayes’ Theorem, in particular being comfortable with conditional probabilities. In other words, if this makes sense:

P(A|B) = P(B|A) · P(A) / P(B)

We set out to use Bayes’ theorem to prove that switching doors improves chances in the N=3 Monty Hall Problem. We define:

X: the chosen door
Y: the door with the prize
Z: the door opened by the host

Labelling the doors as A, B and C, with the trader choosing A and the host opening C (without loss of generality), we need to show:

P(Y=B | X=A, Z=C) > P(Y=A | X=A, Z=C)

Using Bayes’ theorem we write the left side as

P(Y=B | X=A, Z=C) = P(Z=C | X=A, Y=B) · P(Y=B | X=A) / P(Z=C | X=A)

and the right one as:

P(Y=A | X=A, Z=C) = P(Z=C | X=A, Y=A) · P(Y=A | X=A) / P(Z=C | X=A)

Most components are equal (the denominators are identical and P(Y=A | X=A) = P(Y=B | X=A) = ⅓), so we are left to prove:

P(Z=C | X=A, Y=B) > P(Z=C | X=A, Y=A)

In the case where Y=B, the host has only one choice, making P(Z=C | X=A, Y=B) = 1. In the case where Y=A, the host has two choices, making P(Z=C | X=A, Y=A) = 1/2. From here:

P(Y=B | X=A, Z=C) = 2 · P(Y=A | X=A, Z=C), i.e. ⅔ versus ⅓.

Quod erat demonstrandum.

Note: if the “host choices” arguments didn’t make sense, look at the table below showing this explicitly. You will want to compare entries {X=A, Y=B, Z=C} and {X=A, Y=A, Z=C}.

Supplement 2: The Causal Point of View

For this section a basic understanding of Directed Acyclic Graphs (DAGs) and Structural Causal Models (SCMs) is useful, but not required. In brief:

DAGs qualitatively visualise the causal relationships between the parameter nodes.
SCMs quantitatively express the formula relationships between the parameters.
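Given the DAG we are going to define the SCM that corresponds to the classic N=3 Monty Hall problem and use it to describe the joint distribution of all variables. We will later expand generically to N. We define:

X: the chosen door
Y: the door with the prize
Z: the door opened by the host

According to the DAG, the chain rule gives:

P(X, Y, Z) = P(Z | X, Y) · P(X) · P(Y)

The SCM is defined by exogenous variables U, endogenous variables V, and the functions between them F:

U = {X, Y}, V = {Z}, F = {f}

where X, Y and Z take door values D = {A, B, C}. The host choice f is defined as follows: the host never opens the trader’s door X or the prize door Y, and chooses uniformly at random among the doors that remain.

In order to generalise to N doors, the DAG remains the same, but the SCM requires updating D to be a set of N doors Dᵢ: {D₁, D₂, … Dₙ}.

Exploring Example Scenarios

To gain an intuition for this SCM, let’s examine 6 examples of the 27.

When X=Y (e.g., both are A):
P(Z=A | X=A, Y=A) = 0; the host cannot choose the participant’s door
P(Z=B | X=A, Y=A) = 1/2; the prize is behind A → the host chooses B at 50%
P(Z=C | X=A, Y=A) = 1/2; the prize is behind A → the host chooses C at 50%

When X≠Y (e.g., X=A and Y=B):
P(Z=A | X=A, Y=B) = 0; the host cannot choose the participant’s door
P(Z=B | X=A, Y=B) = 0; the host cannot choose the prize door
P(Z=C | X=A, Y=B) = 1; the host has no choice in the matter

Calculating Joint Probabilities

Using logic, let’s code up all 27 possibilities in Python:

    import pandas as pd

    # All 27 (X, Y, Z) combinations of chosen, prize and opened doors.
    df = pd.DataFrame({
        "X": (["A"] * 9) + (["B"] * 9) + (["C"] * 9),
        "Y": ((["A"] * 3) + (["B"] * 3) + (["C"] * 3)) * 3,
        "Z": ["A", "B", "C"] * 9,
    })
    cond = "P(Z|X,Y)"
    df[cond] = float("nan")  # filled in case by case below
    p_x = 1. / 3
    p_y = 1. / 3
    same = df["X"] == df["Y"]
    z_is_x = df["Z"] == df["X"]
    z_is_y = df["Z"] == df["Y"]
    df.loc[same & z_is_x, cond] = 0                # X = Y = Z: cannot open the chosen door
    df.loc[same & ~z_is_x, cond] = 0.5             # X = Y: two goat doors, 50% each
    df.loc[~same & z_is_y, cond] = 0               # X != Y: cannot open the prize door
    df.loc[~same & z_is_x, cond] = 0               # X != Y: cannot open the chosen door
    df.loc[~same & ~z_is_x & ~z_is_y, cond] = 1    # X != Y: only one door is left
    df["P(X,Y,Z)"] = df[cond] * p_x * p_y
    print(f"Sum of joint probabilities: {df['P(X,Y,Z)'].sum()}")
    df

The resulting df is a table listing P(Z|X,Y) and P(X,Y,Z) for each of the 27 combinations.

Resources

This Quora discussion by Joshua Engel helped me shape a few aspects of this article.
Causal Inference in Statistics: A Primer / Pearl, Glymour & Jewell (an excellent short textbook).
I also very much enjoy Tim Harford’s podcast Cautionary Tales. He wrote about this topic on November 3rd 2017 for the Financial Times: Monty Hall and the game show stick-or-switch conundrum.

Footnotes

¹ Vazsonyi, Andrew. “Which Door Has the Cadillac?”. Decision Line: 17–19. Archived from the original on 13 April 2014. Retrieved 16 October 2012.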
² Steve Selvin to The American Statistician in 1975.
³ Game Show Problem by Marilyn vos Savant’s “Ask Marilyn” in marilynvossavant.com: “This material in this article was originally published in PARADE magazine in 1990 and 1991”.
⁴ Tierney, John. “Behind Monty Hall’s Doors: Puzzle, Debate and Answer?”. The New York Times. Retrieved 18 January 2008.
⁵ Kahneman, D. Thinking, Fast and Slow. Farrar, Straus and Giroux.
⁶ MythBusters Episode 177 “Pick a Door”.
⁷ Monty Hall Problem on Survivor Season 41.
⁸ Jingyi Jessica Li, “How the Monty Hall problem is similar to the false discovery rate in high-throughput data analysis”. Whereas the author points out “similarities” between hypothesis testing and the Monty Hall problem, I think that this is a bit misleading. The author is correct that both problems change with the order in which processes are done, but that is part of Bayesian statistics in general, not limited to the Monty Hall problem.

The post 🚪🚪🐐 Lessons in Decision Making from the Monty Hall Problem appeared first on Towards Data Science.
    TOWARDSDATASCIENCE.COM
    🚪🚪🐐 Lessons in Decision Making from the Monty Hall Problem
    The Monty Hall Problem is a well-known brain teaser from which we can learn important lessons in Decision Making that are useful in general and in particular for data scientists. If you are not familiar with this problem, prepare to be perplexed . If you are, I hope to shine light on aspects that you might not have considered . I introduce the problem and solve with three types of intuitions: Common — The heart of this post focuses on applying our common sense to solve this problem. We’ll explore why it fails us and what we can do to intuitively overcome this to make the solution crystal clear . We’ll do this by using visuals , qualitative arguments and some basic probabilities (not too deep, I promise). Bayesian — We will briefly discuss the importance of belief propagation. Causal — We will use a Graph Model to visualise conditions required to use the Monty Hall problem in real world settings.Spoiler alert I haven’t been convinced that there are any, but the thought process is very useful. I summarise by discussing lessons learnt for better data decision making. In regards to the Bayesian and Causal intuitions, these will be presented in a gentle form. For the mathematically inclined I also provide supplementary sections with short Deep Dives into each approach after the summary. (Note: These are not required to appreciate the main points of the article.) By examining different aspects of this puzzle in probability you will hopefully be able to improve your data decision making . Credit: Wikipedia First, some history. Let’s Make a Deal is a USA television game show that originated in 1963. As its premise, audience participants were considered traders making deals with the host, Monty Hall . At the heart of the matter is an apparently simple scenario: A trader is posed with the question of choosing one of three doors for the opportunity to win a luxurious prize, e.g, a car . Behind the other two were goats . The trader is shown three closed doors. 
The trader chooses one of the doors. Let’s call this (without loss of generalisability) door A and mark it with a . Keeping the chosen door closed, the host reveals one of the remaining doors showing a goat (let’s call this door C). The trader chooses door and the the host reveals door C showing a goat. The host then asks the trader if they would like to stick with their first choice or switch to the other remaining one (which we’ll call door B). If the trader guesses correct they win the prize . If not they’ll be shown another goat (also referred to as a zonk). What is the probability of being Zonked? Credit: Wikipedia Should the trader stick with their original choice of door A or switch to B? Before reading further, give it a go. What would you do? Most people are likely to have a gut intuition that “it doesn’t matter” arguing that in the first instance each door had a ⅓ chance of hiding the prize, and that after the host intervention , when only two doors remain closed, the winning of the prize is 50:50. There are various ways of explaining why the coin toss intuition is incorrect. Most of these involve maths equations, or simulations. Whereas we will address these later, we’ll attempt to solve by applying Occam’s razor: A principle that states that simpler explanations are preferable to more complex ones — William of Ockham (1287–1347) To do this it is instructive to slightly redefine the problem to a large N doors instead of the original three. The Large N-Door Problem Similar to before: you have to choose one of many doors. For illustration let’s say N=100. Behind one of the doors there is the prize and behind 99 (N-1) of the rest are goats . The 100 Door Monty Hall problem before the host intervention. You choose one door and the host reveals 98 (N-2) of the other doors that have goats leaving yours and one more closed . The 100 Door Monty Hall Problem after the host intervention. Should you stick with your door or make the switch? 
Should you stick with your original choice or make the switch? I think you’ll agree with me that the remaining door, not chosen by you, is much more likely to conceal the prize … so you should definitely make the switch! It’s illustrative to compare both scenarios discussed so far. In the next figure we compare the post host intervention for the N=3 setup (top panel) and that of N=100 (bottom): Post intervention settings for the N=3 setup (top) and N=100 (bottom). In both cases we see two shut doors, one of which we’ve chosen. The main difference between these scenarios is that in the first we see one goat and in the second there are more than the eye would care to see (unless you shepherd for a living). Why do most people consider the first case as a “50:50” toss up and in the second it’s obvious to make the switch? We’ll soon address this question of why. First let’s put probabilities of success behind the different scenarios. What’s The Frequency, Kenneth? So far we learnt from the N=100 scenario that switching doors is obviously beneficial. Inferring for the N=3 may be a leap of faith for most. Using some basic probability arguments here we’ll quantify why it is favourable to make the switch for any number door scenario N. We start with the standard Monty Hall problem (N=3). When it starts the probability of the prize being behind each of the doors A, B and C is p=⅓. To be explicit let’s define the Y parameter to be the door with the prize , i.e, p(Y=A)= p(Y=B)=p(Y=C)=⅓. The trick to solving this problem is that once the trader’s door A has been chosen , we should pay close attention to the set of the other doors {B,C}, which has the probability of p(Y∈{B,C})=p(Y=B)+p(Y=C)=⅔. This visual may help make sense of this: By being attentive to the {B,C} the rest should follow. When the goat is revealed it is apparent that the probabilities post intervention change. 
Note that for ease of reading I’ll drop the Y notation, so p(Y=A) will read p(A) and p(Y∈{B,C}) will read p({B,C}). Also, for completeness, the full terms after the intervention should be even longer because they are conditional, e.g., p(Y=A|Z=C) and p(Y∈{B,C}|Z=C), where Z is a parameter representing the choice of the host. (In the Bayesian supplement section below I use proper notation without this shortening.)

p(A) remains ⅓
p({B,C}) = p(B) + p(C) remains ⅔
p(C) = 0; we just learnt that the goat is behind door C, not the prize.
p(B) = p({B,C}) − p(C) = ⅔

For anyone with the information provided by the host (meaning the trader and the audience) this means that it isn’t a toss of a fair coin! For them the fact that p(C) became zero does not “raise all other boats” (the probabilities of doors A and B); rather, p(A) remains the same and p(B) doubles. The bottom line is that the trader should consider p(A) = ⅓ and p(B) = ⅔, hence by switching they double their chance of winning!

Let’s generalise to N (to make the visual simpler we’ll use N=100 again as an analogy). At the start every door has the same probability of hiding the prize, p = 1/N. After the trader chooses one door, which we’ll call D₁, meaning p(Y=D₁) = 1/N, we should pay attention to the remaining set of doors {D₂, …, Dₙ}, which has probability p(Y∈{D₂, …, Dₙ}) = (N−1)/N. When the host reveals the (N−2) doors {D₃, …, Dₙ} with goats (back to short notation):

p(D₁) remains 1/N
p({D₂, …, Dₙ}) = p(D₂) + p(D₃) + … + p(Dₙ) remains (N−1)/N
p(D₃) = p(D₄) = … = p(Dₙ₋₁) = p(Dₙ) = 0; we just learnt that they have goats, not the prize.
p(D₂) = p({D₂, …, Dₙ}) − p(D₃) − … − p(Dₙ) = (N−1)/N

The trader should now compare two values: p(D₁) = 1/N and p(D₂) = (N−1)/N. Hence the probability of winning improves by a factor of N−1. In the case of N=100, this is a factor of 99 (i.e., 99% likely to win the prize when switching vs. 1% if not). The improvement for all scenarios between N=3 and N=100 may be seen in the following graph.
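The argument does not need a simulation, but the N-door result above can be sanity-checked with one. This is a Monte Carlo sketch, assuming a host who always opens N−2 goat doors and never the trader’s door or the prize door; the function names, trial count and seed are illustrative choices, not from the article.

```python
import random


def play_once(n_doors: int, switch: bool, rng: random.Random) -> bool:
    """Play one N-door game and return True if the trader wins the prize."""
    prize = rng.randrange(n_doors)
    pick = rng.randrange(n_doors)
    # The host opens N-2 goat doors, never the trader's door or the prize door,
    # so exactly one other door stays closed.
    if pick != prize:
        other_closed = prize  # the host is forced to leave the prize door shut
    else:
        other_closed = rng.choice([d for d in range(n_doors) if d != pick])
    final = other_closed if switch else pick
    return final == prize


def win_rate(n_doors: int, switch: bool, trials: int = 20_000, seed: int = 0) -> float:
    """Estimate the win probability over many independent games."""
    rng = random.Random(seed)
    wins = sum(play_once(n_doors, switch, rng) for _ in range(trials))
    return wins / trials


for n in (3, 100):
    print(f"N={n}: stay ~ {win_rate(n, False):.3f}, switch ~ {win_rate(n, True):.3f}")
```

The estimates should land close to 1/N for staying and (N−1)/N for switching, with the gap between the two strategies widening as N grows.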
The thin line is the probability of winning by choosing any door prior to the intervention, p(Y) = 1/N. Note that it also represents the chance of winning after the intervention if the trader sticks to their guns and does not switch, p(Y=D₁|Z={D₃…Dₙ}). (Here I reintroduce the more rigorous conditional form mentioned earlier.) The thick line is the probability of winning the prize after the intervention if the door is switched, p(Y=D₂|Z={D₃…Dₙ}) = (N−1)/N.

Figure: Probability of winning as a function of N. p(Y) = p(Y=no switch|Z) = 1/N is the thin line; p(Y=switch|Z) = (N−1)/N is the thick one. (By definition the two lines sum to 1 for each N.)

Perhaps the most interesting aspect of this graph (albeit also by definition) is that the N=3 case has the highest probability before the host intervention but the lowest probability after, and vice versa for N=100. Another interesting feature is the quick climb in the probability of winning for the switchers:

N=3: p=67%
N=4: p=75%
N=5: p=80%

The switchers’ curve gradually approaches an asymptote of 100%: at N=99 it is 98.99% and at N=100 it is 99%. This starts to address an interesting question:

Why Is Switching Obvious For Large N But Not N=3?

The answer is that this puzzle is slightly ambiguous. Only the highly attentive realise that by revealing the goat (and never the prize!) the host is actually conveying a lot of information that should be incorporated into one’s calculation. Later we discuss the difference between doing this calculation in one’s mind based on intuition and slowing down by putting pen to paper or coding up the problem.

How much information does the host convey by intervening? A hand-wavy explanation is that this information may be visualised as the gap between the lines in the graph above. For N=3 we saw that the chance of winning doubled (nothing to sneeze at!), but that doesn’t register as strongly with our common-sense intuition as the factor of 99 in the N=100 case.
I have also considered describing stronger arguments from Information Theory that provide useful vocabulary to express communication of information. However, I feel that this fascinating field deserves a post of its own, which I’ve published. The main takeaway for the Monty Hall problem is that I have calculated the information gain to be a logarithmic function of the number of doors c using this formula:

[Formula image] Information gain due to the intervention of the host for a setup with c doors.

For the c=3 door case, e.g., the information gain is ⅔ bits (of a maximum possible 1.58 bits). Full details are in this article on entropy.

To summarise this section, we used basic probability arguments to quantify the probabilities of winning the prize, showing the benefit of switching for all N door scenarios. For those interested in more formal solutions using Bayesian and causal reasoning, I provide supplementary sections at the bottom.

In the final three sections we’ll discuss how this problem was received by the general public back in the 1990s, discuss lessons learnt and then summarise how we can apply them in real-world settings.

Being Confused Is OK

“No, that is impossible, it should make no difference.” (Paul Erdős)

If you still don’t feel comfortable with the solution of the N=3 Monty Hall problem, don’t worry, you are in good company! According to Vazsonyi (1999)¹ even Paul Erdős, who is considered one “of the greatest experts in probability theory”, was confounded until computer simulations were demonstrated to him.

When the original solution by Steve Selvin (1975)² was popularised by Marilyn vos Savant in her column “Ask Marilyn” in Parade magazine in 1990, many readers wrote that Selvin and Savant were wrong³. According to Tierney’s 1991 article in the New York Times, this included about 10,000 readers, nearly 1,000 of them with Ph.D. degrees⁴.
On a personal note, over a decade ago I was exposed to the standard N=3 problem and since then managed to forget the solution numerous times. When I learnt about the large N approach I was quite excited about how intuitive it was. I then failed to explain it to my technical manager over lunch, so this is an attempt to compensate. I still have the same day job.

While researching this piece I realised that there is a lot to learn in terms of decision making in general, and in particular lessons that are useful for data science.

Lessons Learnt From Monty Hall Problem

In his book Thinking, Fast and Slow⁵, the late Daniel Kahneman, the co-creator of behavioural economics, suggested that we have two types of thought processes:

System 1, fast thinking: based on intuition. This helps us react fast with confidence to familiar situations.
System 2, slow thinking: based on deep thought. This helps figure out new complex situations that life throws at us.

Assuming this premise, you might have noticed that in the above you were applying both. By examining the visual of N=100 doors your System 1 kicked in and you immediately knew the answer. I’m guessing that in the N=3 case you were straddling System 1 and System 2. Considering that you had to stop and think a bit when going through the probabilities exercise, it was definitely System 2.

Image: The decision maker’s struggle between System 1 and System 2. Generated using Gemini Imagen 3

Beyond fast and slow thinking, I feel that there are a lot of data decision-making lessons that may be learnt.

(1) Assessing probabilities can be counter-intuitive … or Be comfortable with shifting to deep thought

We’ve clearly shown that in the N=3 case. As previously mentioned, it confounded many people including prominent statisticians. Another classic example is The Birthday Paradox, which shows how we underestimate the likelihood of coincidences.
In this problem most people would think that one needs a large group of people before finding a pair that shares the same birthday. It turns out that all you need is 23 people to have a 50% chance, and 70 for a 99.9% chance.

One of the most confusing paradoxes in the realm of data analysis is Simpson’s, which I detailed in a previous article. This is a situation where trends of a population may be reversed in its subpopulations.

What all these paradoxes have in common is that they require us to get comfortable with shifting gears from System 1 fast thinking to System 2 slow thinking. This is also the common theme for the lessons outlined below.

A few more classical examples are: The Gambler’s Fallacy, the Base Rate Fallacy and The Linda [bank teller] Problem. These are beyond the scope of this article, but I highly recommend looking them up to further sharpen ways of thinking about data.

(2) … especially when dealing with ambiguity or Search for clarity in ambiguity

Let’s reread the problem, this time as stated in “Ask Marilyn”:

Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say №1, and the host, who knows what’s behind the doors, opens another door, say №3, which has a goat. He then says to you, “Do you want to pick door №2?” Is it to your advantage to switch your choice?

We discussed that the most important piece of information is not made explicit. It says that the host “knows what’s behind the doors”, but not whether they open a door at random, although it’s implicitly understood that the host will never open the door with the car.

Many real-life problems in data science involve dealing with ambiguity in the demands as well as in the data provided by stakeholders.

It is crucial for the researcher to track down any relevant piece of information that is likely to have an impact and incorporate it into the solution. Statisticians refer to this as “belief update”.
(3) With new information we should update our beliefs

This is the main aspect separating the Bayesian stream of thought from the Frequentist one. The Frequentist approach takes data at face value (referred to as flat priors). The Bayesian approach incorporates prior beliefs and updates them when new findings are introduced. This is especially useful when dealing with ambiguous situations.

To drive this point home, let’s re-examine this figure comparing the post-intervention N=3 setup (top panel) and the N=100 one (bottom panel).

Figure: Copied from above. Post-intervention settings for the N=3 setup (top) and N=100 (bottom).

In both cases we had a prior belief that all doors had an equal chance of winning the prize, p=1/N.

Once the host opened one door (N=3; or 98 doors when N=100) a lot of valuable information was revealed, although in the case of N=100 this was much more apparent than for N=3.

In the Frequentist approach, however, most of this information would be ignored, as it only focuses on the two closed doors. The Frequentist conclusion is hence a 50% chance to win the prize regardless of what else is known about the situation. The Frequentist thus takes Paul Erdős’ “no difference” point of view, which we now know to be incorrect.

This would be reasonable if all that was presented were the two doors and not the intervention and the goats. However, if that information is presented, one should shift gears into System 2 thinking and update their beliefs about the system. This is what we have done by focusing not only on the shut doors but also on what was learnt about the system at large.

For the brave-hearted, in a supplementary section below called The Bayesian Point of View I solve the Monty Hall problem using the Bayesian formalism.

(4) Be one with subjectivity

The Frequentist’s main reservation about “going Bayes” is that “Statistics should be objective”.

The Bayesian response is that Frequentists also apply a prior without realising it: a flat one.
Regardless of the Bayesian/Frequentist debate, as researchers we try our best to be as objective as possible in every step of the analysis.

That said, it is inevitable that subjective decisions are made throughout. E.g., in a skewed distribution should one quote the mean or the median? It highly depends on the context and hence a subjective decision needs to be made.

The responsibility of the analyst is to provide justification for their choices, first to convince themselves and then their stakeholders.

(5) When confused, look for a useful analogy … but tread with caution

We saw that by going from the N=3 setup to the N=100 one the solution was apparent. This is a trick scientists frequently use: if the problem appears at first a bit too confusing or overwhelming, break it down and try to find a useful analogy.

It is probably not a perfect comparison, but going from the N=3 setup to N=100 is like examining a picture from up close and zooming out to see the big picture. Think of having only a puzzle piece and then glancing at the jigsaw photo on the box.

Image: Monty Hall in 1976. Credit: Wikipedia and using Visual Paradigm Online for the puzzle effect

Note: whereas analogies may be powerful, one should use them with caution so as not to oversimplify. Physicists refer to this situation as the spherical cow method, where models may oversimplify complex phenomena.

I admit that even with years of experience in applied statistics, at times I still get confused about which method to apply. A large part of my thought process is identifying analogies to known solved problems. Sometimes after making progress in a direction I will realise that my assumptions were wrong and seek a new direction. I used to quip with colleagues that they shouldn’t trust me before my third attempt …

(6) Simulations are powerful but not always necessary

It’s interesting to learn that Paul Erdős and other mathematicians were convinced only after seeing simulations of the problem.
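For readers curious what such a simulation might look like, here is a minimal sketch in Python. It assumes the idealised host who always opens N-2 goat doors and never reveals the prize; the function names (`play_once`, `simulate`), the seed and the number of games are illustrative choices, not taken from any of the simulations cited in this article.

```python
import random


def play_once(rng: random.Random, n_doors: int) -> tuple[bool, bool]:
    """Play one game; return (stay_wins, switch_wins)."""
    prize = rng.randrange(n_doors)
    choice = rng.randrange(n_doors)
    if choice == prize:
        # Every other door hides a goat, so the host may leave any one of them closed.
        others = [d for d in range(n_doors) if d != choice]
        left_closed = rng.choice(others)
    else:
        # The host never reveals the prize, so the prize door must stay closed.
        left_closed = prize
    return choice == prize, left_closed == prize


def simulate(n_doors: int = 3, n_games: int = 100_000, seed: int = 0) -> tuple[float, float]:
    """Estimate (p_stay, p_switch) over n_games independent games."""
    rng = random.Random(seed)
    stay_wins = switch_wins = 0
    for _ in range(n_games):
        stay, switch = play_once(rng, n_doors)
        stay_wins += stay
        switch_wins += switch
    return stay_wins / n_games, switch_wins / n_games


for n in (3, 100):
    p_stay, p_switch = simulate(n_doors=n)
    print(f"N={n}: stay={p_stay:.3f}, switch={p_switch:.3f}")
```

With these settings the estimates should land close to ⅓ and ⅔ for N=3, and close to 1% and 99% for N=100, in line with the analytical values derived earlier.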
I am in two minds about the use of simulations when it comes to problem solving.

On the one hand, simulations are powerful tools for analysing complex and intractable problems, especially with real-life data where one wants a grasp not only of the underlying formulation but also of the stochasticity.

And here is the big BUT: if a problem can be solved analytically, like the Monty Hall one, simulations, as fun as they may be (such as the MythBusters have done⁶), may not be necessary.

According to Occam’s razor, all that is required is a brief intuition to explain the phenomena. This is what I attempted to do here by applying common sense and some basic probability reasoning. For those who enjoy deep dives I provide below supplementary sections with two methods for analytical solutions: one using Bayesian statistics and another using Causality.

[Update] After publishing the first version of this article there was a comment that Savant’s solution³ may be simpler than those presented here. I revisited her communications and agreed that it should be added. In the process I realised three more lessons may be learnt.

(7) A well designed visual goes a long way

Continuing the principle of Occam’s razor, Savant explained³ quite convincingly in my opinion:

You should switch. The first door has a 1/3 chance of winning, but the second door has a 2/3 chance. Here’s a good way to visualize what happened. Suppose there are a million doors, and you pick door #1. Then the host, who knows what’s behind the doors and will always avoid the one with the prize, opens them all except door #777,777. You’d switch to that door pretty fast, wouldn’t you?

Hence she provided an abstract visual for the readers. I attempted to do the same with the 100 doors figures.

Image: Marilyn vos Savant, who popularised the Monty Hall Problem. Credit: Ben David on Flickr under license

As mentioned, many readers, especially those with backgrounds in maths and statistics, still weren’t convinced.
She revised³ with another mental image:

The benefits of switching are readily proven by playing through the six games that exhaust all the possibilities. For the first three games, you choose #1 and “switch” each time, for the second three games, you choose #1 and “stay” each time, and the host always opens a loser. Here are the results.

She added a table with all the scenarios. I took some artistic liberty and created the following figure. As indicated, the top batch are the scenarios in which the trader switches and the bottom batch those in which they stay. Lines in green are games which the trader wins, and lines in red are those in which they get zonked. A marker symbolises the door chosen by the trader, and Monty Hall then chooses a different door that has a goat behind it.

Figure: Adaptation of Savant’s table³ of six scenarios that shows the solution to the Monty Hall Problem

We clearly see from this diagram that the switcher has a ⅔ chance of winning and those who stay only ⅓.

This is yet another elegant visualisation that clearly explains the non-intuitive.

It strengthens the claim that there is no real need for simulations in this case, because all they would be doing is rerunning these six scenarios.

One more popular solution is decision tree illustrations. You can find these on the Wikipedia page, but I find them a bit redundant given Savant’s table.

The fact that we can solve this problem in so many ways yields another lesson:

(8) There are many ways to skin a … problem

One of the many lessons that I have learnt from the writings of the late Richard Feynman, one of the best communicators of physics and ideas, is that a problem can be solved in many ways. Mathematicians and Physicists do this all the time.
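As one more route to the same answer, the six games in Savant’s table can be enumerated exhaustively in a few lines. This is a minimal sketch rather than a reproduction of her table: the helper names (`host_opens`, `play`, `win_rate`) are hypothetical, the trader always picks door 1 as in her description, and when the host has two goat doors to choose from the code simply takes the lowest-numbered one, an arbitrary tie-break that does not affect the win counts.

```python
from fractions import Fraction

DOORS = (1, 2, 3)


def host_opens(choice: int, prize: int) -> int:
    """Goat door opened by the host (lowest-numbered if two are available)."""
    return min(d for d in DOORS if d != choice and d != prize)


def play(prize: int, switch: bool, choice: int = 1) -> bool:
    """Return True if the trader wins the game with the given strategy."""
    opened = host_opens(choice, prize)
    if switch:
        # Move to the only door that is neither the first pick nor the opened one.
        choice = next(d for d in DOORS if d not in (choice, opened))
    return choice == prize


def win_rate(switch: bool) -> Fraction:
    """Fraction of the three equally likely prize positions that are won."""
    wins = sum(play(prize, switch) for prize in DOORS)
    return Fraction(wins, len(DOORS))


print("switch:", win_rate(True))
print("stay:  ", win_rate(False))
```

Because the three prize positions are equally likely, counting wins over them gives the exact probabilities, which is why no random sampling is needed here.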
A relevant quote that paraphrases Occam’s razor:

“If you can’t explain it simply, you don’t understand it well enough” (attributed to Albert Einstein)

And finally

(9) Embrace ignorance and be humble

“You are utterly incorrect … How many irate mathematicians are needed to get you to change your mind?” (Ph.D. from Georgetown University)

“May I suggest that you obtain and refer to a standard textbook on probability before you try to answer a question of this type again?” (Ph.D. from University of Florida)

“You’re in error, but Albert Einstein earned a dearer place in the hearts of people after he admitted his errors.” (Ph.D. from University of Michigan)

Ouch! These are some of the responses from mathematicians to the Parade article.

Such unnecessary viciousness.

You can check the reference³ to see the writers’ names and others like them. To whet your appetite: “You blew it, and you blew it big!”, “You made a mistake, but look at the positive side. If all those Ph.D.’s were wrong, the country would be in some very serious trouble.”, “I am in shock that after being corrected by at least three mathematicians, you still do not see your mistake.”

And, as expected from the 1990s, perhaps the most embarrassing one was from a resident of Oregon:

“Maybe women look at math problems differently than men.”

These make me cringe and feel embarrassed to be associated by gender and Ph.D. title with these graduates and professors.

Hopefully in the 2020s most people are more humble about their ignorance. Yuval Noah Harari discusses the fact that the Scientific Revolution of Galileo Galilei et al. was not due to knowledge but rather to the admission of ignorance.

“The great discovery that launched the Scientific Revolution was the discovery that humans do not know the answers to their most important questions” (Yuval Noah Harari)

Fortunately for mathematicians’ image, there were also quite a lot of more enlightened comments. I like this one from one Seth Kalson, Ph.D. of MIT:

You are indeed correct. My colleagues at work had a ball with this problem, and I dare say that most of them, including me at first, thought you were wrong!

We’ll summarise by examining how, and if, the Monty Hall problem may be applied in real-world settings, so you can try to relate it to projects that you are working on.

Application in Real World Settings

Researching for this article I found that beyond artificial setups for entertainment⁶ ⁷ there aren’t practical settings for this problem to use as an analogy. Of course, I may be wrong⁸ and would be glad to hear if you know of one.

One way of assessing the viability of an analogy is using arguments from causality, which provides vocabulary that cannot be expressed with standard statistics.

In a previous post I discussed the fact that the story behind the data is as important as the data itself. In particular, Causal Graph Models visualise the story behind the data, which we will use as a framework for a reasonable analogy.

For the Monty Hall problem we can build a Causal Graph Model like this:

Reading:

The door chosen by the trader (X) is independent of the door with the prize (Y), and vice versa. As important, there is no common cause between them that might generate a spurious correlation.
The host’s choice (Z) depends on both X and Y.

By comparing causal graphs of two systems one can get a sense of how analogous they are. A perfect analogy would require more details, but this is beyond the scope of this article. Briefly, one would want to ensure similar functions between the parameters (referred to as the Structural Causal Model; for details see the supplementary section below called The Causal Point of View).

Those interested in learning further details about using Causal Graph Models to assess causality in real-world problems may be interested in this article.
Anecdotally it is also worth mentioning that on Let’s Make a Deal, Monty himself admitted years later to playing mind games with the contestants and not always following the rules, e.g., not always doing the intervention, as “it all depends on his mood”⁴.

In our setup we assumed perfect conditions, i.e., a host that does not stray from the script or play on the trader’s emotions. Taking this into consideration would require updating the Graphical Model above, which is beyond the scope of this article.

Some might be disheartened to realise at this stage of the post that there might not be real-world applications for this problem.

I argue that the lessons learnt from the Monty Hall problem definitely do have real-world applications.

Just to summarise them again:

(1) Assessing probabilities can be counter-intuitive … (Be comfortable with shifting to deep thought)
(2) … especially when dealing with ambiguity (Search for clarity)
(3) With new information we should update our beliefs
(4) Be one with subjectivity
(5) When confused, look for a useful analogy … but tread with caution
(6) Simulations are powerful but not always necessary
(7) A well designed visual goes a long way
(8) There are many ways to skin a … problem
(9) Embrace ignorance and be humble

While the Monty Hall Problem might seem like a simple puzzle, it offers valuable insights into decision-making, particularly for data scientists. The problem highlights the importance of going beyond intuition and embracing a more analytical, data-driven approach. By understanding the principles of Bayesian thinking and updating our beliefs based on new information, we can make more informed decisions in many aspects of our lives, including data science. The Monty Hall Problem serves as a reminder that even seemingly straightforward scenarios can contain hidden complexities and that, by carefully examining the available information, we can uncover hidden truths and make better decisions.
At the bottom of the article I provide a list of resources that I found useful to learn about this topic.

Credit: Wikipedia

Credits

Unless otherwise noted, all images were created by the author.

Many thanks to Jim Parr, Will Reynolds, and Betty Kazin for their useful comments.

In the following supplementary sections I derive solutions to the Monty Hall problem from two perspectives:

Bayesian
Causal

Both are motivated by questions in the textbook Causal Inference in Statistics: A Primer by Judea Pearl, Madelyn Glymour, and Nicholas P. Jewell (2016).

Supplement 1: The Bayesian Point of View

This section assumes a basic understanding of Bayes’ Theorem, in particular being comfortable with conditional probabilities. In other words, if this makes sense:

P(A|B) = P(B|A) P(A) / P(B)

We set out to use Bayes’ theorem to prove that switching doors improves chances in the N=3 Monty Hall Problem. (Problem 1.3.3 of the Primer textbook.)

We define

X: the chosen door
Y: the door with the prize
Z: the door opened by the host

Labelling the doors as A, B and C, without loss of generality, we need to solve for:

P(Y=B|X=A, Z=C) > P(Y=A|X=A, Z=C)

Using Bayes’ theorem we equate the left side as

P(Y=B|X=A, Z=C) = P(X=A, Z=C|Y=B) P(Y=B) / P(X=A, Z=C)

and the right one as:

P(Y=A|X=A, Z=C) = P(X=A, Z=C|Y=A) P(Y=A) / P(X=A, Z=C)

Most components are equal (remember that P(Y=A)=P(Y=B)=⅓), so we are left to prove:

P(X=A, Z=C|Y=B) > P(X=A, Z=C|Y=A)

Because the trader’s choice X is independent of Y, each side factorises as P(X=A)·P(Z=C|X=A, Y), and the common factor P(X=A)=⅓ cancels as well.

In the case where Y=B (the prize is behind door B), the host has only one choice (can only select door C), making P(Z=C|X=A, Y=B)=1.

In the case where Y=A (the prize is behind door A), the host has two choices (doors B and C), making P(Z=C|X=A, Y=A)=1/2.

From here:

P(X=A, Z=C|Y=B) = ⅓·1 > ⅓·½ = P(X=A, Z=C|Y=A), and therefore P(Y=B|X=A, Z=C) > P(Y=A|X=A, Z=C).

Quod erat demonstrandum.

Note: if the “host choices” arguments didn’t make sense, look at the table below showing this explicitly. You will want to compare entries {X=A, Y=B, Z=C} and {X=A, Y=A, Z=C}.

Supplement 2: The Causal Point of View

A basic understanding of Directed Acyclic Graphs (DAGs) and Structural Causal Models (SCMs) is useful for this section, but not required.
In brief:

DAGs qualitatively visualise the causal relationships between the parameter nodes.
SCMs quantitatively express the formula relationships between the parameters.

Given the DAG we are going to define the SCM that corresponds to the classic N=3 Monty Hall problem and use it to describe the joint distribution of all variables. We will later expand generically to N. (Inspired by problem 1.5.4 of the Primer textbook as well as its brief mention of the N door problem.)

We define

X: the chosen door
Y: the door with the prize
Z: the door opened by the host

From the DAG, the chain rule gives:

P(X, Y, Z) = P(Z|X, Y) P(X) P(Y)

The SCM is defined by exogenous variables U, endogenous variables V, and the functions between them F:

U = {X, Y}, V = {Z}, F = {f(Z)}

where X, Y and Z have door values:

D = {A, B, C}

The host choice is f(Z), defined as follows: if X=Y, Z is chosen with equal probability from the two doors other than X; if X≠Y, Z is the single remaining door that is neither X nor Y.

In order to generalise to N doors, the DAG remains the same, but the SCM requires updating D to be a set of N doors Dᵢ: {D₁, D₂, … Dₙ}.

Exploring Example Scenarios

To gain an intuition for this SCM, let’s examine 6 examples of 27 (=3³):

When X=Y (i.e., the prize is behind the chosen door)

P(Z=A|X=A, Y=A) = 0; the host cannot choose the participant’s door
P(Z=B|X=A, Y=A) = 1/2; the prize is behind the chosen door → the host chooses B at 50%
P(Z=C|X=A, Y=A) = 1/2; the prize is behind the chosen door → the host chooses C at 50% (complementary to the above)

When X≠Y (i.e., the prize is not behind the chosen door)

P(Z=A|X=A, Y=B) = 0; the host cannot choose the participant’s door
P(Z=B|X=A, Y=B) = 0; the host cannot choose the prize door
P(Z=C|X=A, Y=B) = 1; the host has no choice in the matter (complementary to the above)

Calculating Joint Probabilities

Using logic, let’s code up all 27 possibilities in Python:

    import pandas as pd

    # One row per (X, Y, Z) combination: 3 x 3 x 3 = 27 rows.
    df = pd.DataFrame({"X": (["A"] * 9) + (["B"] * 9) + (["C"] * 9),
                       "Y": ((["A"] * 3) + (["B"] * 3) + (["C"] * 3)) * 3,
                       "Z": ["A", "B", "C"] * 9})

    df["P(Z|X,Y)"] = float("nan")  # filled in case by case below
    p_x = 1. / 3  # the trader picks a door uniformly at random
    p_y = 1. / 3  # the prize is placed uniformly at random

    # Host cannot open the chosen door (which here also hides the prize).
    df.loc[df.query("X == Y == Z").index, "P(Z|X,Y)"] = 0
    # Prize behind the chosen door: host picks one of the two others at 50%.
    df.loc[df.query("X == Y != Z").index, "P(Z|X,Y)"] = 0.5
    # Host cannot open the prize door.
    df.loc[df.query("X != Y == Z").index, "P(Z|X,Y)"] = 0
    # Host cannot open the chosen door.
    df.loc[df.query("Z == X != Y").index, "P(Z|X,Y)"] = 0
    # Chosen and prize doors differ: host must open the one remaining door.
    df.loc[df.query("X != Y").query("Z != Y").query("Z != X").index, "P(Z|X,Y)"] = 1

    # Chain rule: P(X, Y, Z) = P(Z|X, Y) * P(X) * P(Y)
    df["P(X, Y, Z)"] = df["P(Z|X,Y)"] * p_x * p_y

    print(f"Testing normalisation of P(X,Y,Z) {df['P(X, Y, Z)'].sum()}")

    df

yields the table of all 27 combinations with their P(Z|X,Y) and P(X, Y, Z) values.

Resources

This Quora discussion by Joshua Engel helped me shape a few aspects of this article.
Causal Inference in Statistics: A Primer / Pearl, Glymour & Jewell (2016), an excellent short textbook (site)
I also very much enjoy Tim Harford’s podcast Cautionary Tales. He wrote about this topic on November 3rd 2017 for the Financial Times: Monty Hall and the game show stick-or-switch conundrum

Footnotes

¹ Vazsonyi, Andrew (December 1998 - January 1999). “Which Door Has the Cadillac?” (PDF). Decision Line: 17-19. Archived from the original (PDF) on 13 April 2014. Retrieved 16 October 2012.
² Steve Selvin to the American Statistician in 1975.
³ Game Show Problem by Marilyn vos Savant’s “Ask Marilyn” in marilynvossavant.com (web archive): “This material in this article was originally published in PARADE magazine in 1990 and 1991”
⁴ Tierney, John (21 July 1991). “Behind Monty Hall’s Doors: Puzzle, Debate and Answer?”. The New York Times. Retrieved 18 January 2008.
⁵ Kahneman, D. (2011). Thinking, fast and slow. Farrar, Straus and Giroux.
⁶ MythBusters Episode 177 “Pick a Door” (Wikipedia)
⁷ Monty Hall Problem on Survivor Season 41 (LinkedIn, YouTube)
⁸ Jingyi Jessica Li (2024) How the Monty Hall problem is similar to the false discovery rate in high-throughput data analysis. Whereas the author points out “similarities” between hypothesis testing and the Monty Hall problem, I think that this is a bit misleading. The author is correct that both problems change by the order in which processes are done, but that is part of Bayesian statistics in general, not limited to the Monty Hall problem.
The post 🚪🚪🐐 Lessons in Decision Making from the Monty Hall Problem appeared first on Towards Data Science.
  • How close is quantum computing to commercial reality?

    Quantum computing may still be regarded by many IT leaders as a very niche technology, but broader business use cases may be just a few years away.
    While only a handful of companies have machines with logical qubits today, delegates at the Commercialising Quantum Computing conference in London were told that a machine with 100 logical qubits would offer quantum advantage in material science by 2028.
    This means that, by then, a sufficiently powerful and stable quantum computer would start delivering business value better than what would be possible using high performance computing.

    Mark Jackson, senior quantum evangelist at Quantinuum, said the company was already using generative quantum artificial intelligence. In a fireside chat at the conference, Jackson spoke about the interaction between quantum computing and AI.
    It is largely acknowledged that a quantum computer is not good at providing a precise answer, such as if applied to big data analysis. But, according to Jackson, it shines when used for machine learning, which can be applied to identify a correct answer. Quantum-enhanced machine learning can process large datasets far quicker than conventional computers, especially when applied to detecting patterns.
    “Quantum computers can detect patterns that would be missed by other conventional computing methods,” said Jackson.
    This ability to detect patterns in massive datasets could revolutionise cyber security. Becky Pickard, managing director of global cyber operations at Barclays, pointed out during a panel discussion that a lot of progress has been made with machine learning and how to apply it on a day-to-day basis: “We’re working with massive volumes of data – 12Tbytes – on a daily basis.”
    She suggested that quantum machine learning could help. From an optimisation perspective, she is keen to see the development of quantum computing applied in a way that reshapes cyber defence. 

    HSBC is one of the organisations that has been working on quantum computing for several years.
    Discussing the return on investment opportunity, and how quantum computing can be used to build more optimised financial models, Phil Intallura, global head of quantum technologies at HSBC, said: “When you break down the opportunities, financial services is one of the biggest beneficiaries.”
    As Intallura points out, banks are always looking for a better financial model: “There’s one thing that catalyses commercial organisations more than anything else, and that’s confidence. If you can show a solution using quantum technology that can get a better output than using a supercomputer, [that] will give you much more runway than you need.”
    Another application area is the ability to generate a true random number, which can feed into financial model simulations.
    In March, a team of researchers from JPMorganChase, Quantinuum, Argonne National Laboratory, Oak Ridge National Laboratory, and the University of Texas at Austin published a paper in Nature discussing a technique known as Random Circuit Sampling.
    RCS is used to perform a certified-randomness-expansion protocol, which outputs more randomness than it takes as input. It is a task that is often used to demonstrate quantum supremacy since it cannot be achieved on a classical computer.
    Speaking of the usefulness of a quantum number generator at HSBC, Intallura said: “Using quantum random numbers as your entropy source to classical simulation does not change any of the underlying model practices in classical models. You’re just injecting a different source of entropy than what we would use.”

    For Intallura, regulatory pressure and the need to ensure financial transactions are secure is helping to inform quantum computing plans at financial institutions. 
    The US National Institute of Standards and Technology has ratified a number of post-quantum cryptography standards. Banks face pressure from regulators to replace RSA-2048 encryption by 2035 and migrate fully over to quantum-safe encryption standards to protect banking transactions. But, as Mark Carney, lead of quantum cyber security research at Santander Global, noted, post-quantum cryptography needs both software and hardware acceleration.
    “We want to be able to have PQC at speed in our devices and on our payment cards,” he said. “We want to give our customers the very best cryptography that we possibly can – not just for regulatory purposes, but also because it gives a sense of assurance.”
    Among the promises of quantum computing is that it can be applied to solve complex optimisation problems. As and when they become commercially viable, such systems will need to work alongside traditional enterprise IT.
    This is something that Gerard Mullery, interim CEO of Oxford Quantum Circuits, recognised during his presentation at the event. Mullery sees a need for quantum computing to be embedded in enterprise workflows.
     “As AI agents autonomously orchestrate enterprise workflows, quantum compute platforms must be designed to integrate with them,” he added.
    What is clear from the experts who spoke at the Commercialising Quantum Computing conference is that a useful machine is perhaps only a few years away. This will have enough logical qubits to solve real-world problems.
    As such devices evolve, it is likely more organisations will draw on quantum computing for certain combinatorial optimisation problems, which will need to integrate with classical computing in the datacentre. As quantum computing becomes more accessible, there will also be a need to bolster cryptography with PQC.

    Read more about quantum developments

    Cisco lays out plans for networking in era of quantum computing: The network equipment provider has opened a new lab and developed a prototype chip as it fleshes out its quantum networking strategy.
    Quantum datacentre deployments: How they are supporting evolving compute projects: Quantum datacentre deployments are emerging worldwide, so what are they and where are the benefits?
Quantum datacentre deployments: How they are supporting evolving compute projects: Quantum datacentre deployments are emerging worldwide, so what are they and where are the benefits? #how #close #quantum #computing #commercial
    WWW.COMPUTERWEEKLY.COM
    How close is quantum computing to commercial reality?
    Quantum computing may still be regarded by many IT leaders as a very niche technology, but broader business use cases may be just a few years away. While only a handful of companies have machines with logical qubits today, delegates at the Commercialising Quantum Computing conference in London were told that a machine with 100 logical qubits would offer quantum advantage in materials science by 2028. This means that, by then, a sufficiently powerful and stable quantum computer would start delivering more business value than would be possible using high-performance computing.
    Mark Jackson, senior quantum evangelist at Quantinuum, said the company was already using generative quantum artificial intelligence (AI). In a fireside chat at the conference, Jackson spoke about the interaction between quantum computing and AI. It is largely acknowledged that a quantum computer is not good at providing a precise answer, such as when applied to big data analysis. But, according to Jackson, it shines when used for machine learning, which can be applied to identify a correct answer. Quantum-enhanced machine learning can process large datasets far more quickly than conventional computers, especially when applied to detecting patterns. “Quantum computers can detect patterns that would be missed by other conventional computing methods,” said Jackson.
    This ability to detect patterns in massive datasets could revolutionise cyber security. Becky Pickard, managing director of global cyber operations at Barclays, pointed out during a panel discussion that a lot of progress has been made with machine learning and how to apply it on a day-to-day basis: “We’re working with massive volumes of data – 12Tbytes – on a daily basis.” She suggested that quantum machine learning could help. From an optimisation perspective, she is keen to see quantum computing developed and applied in a way that reshapes cyber defence.
    HSBC is one of the organisations that has been working on quantum computing for several years. Discussing the return on investment opportunity, and how quantum computing can be used to build more optimised financial models, Phil Intallura, global head of quantum technologies at HSBC, said: “When you break down the opportunities, financial services is one of the biggest beneficiaries.” As Intallura points out, banks are always looking for a better financial model: “There’s one thing that catalyses commercial organisations more than anything else, and that’s confidence. If you can show a solution using quantum technology that can get a better output than using a supercomputer, [business decision-makers] will give you much more runway than you need.”
    Another application area is the ability to generate a true random number, which can feed into financial model simulations. In March, a team of researchers from JPMorganChase, Quantinuum, Argonne National Laboratory, Oak Ridge National Laboratory, and the University of Texas at Austin published a paper in Nature discussing a technique known as Random Circuit Sampling (RCS). RCS is used to perform a certified-randomness-expansion protocol, which outputs more randomness than it takes as input. It is a task that is often used to demonstrate quantum supremacy since it cannot be achieved on a classical computer. Speaking of the usefulness of a quantum random number generator at HSBC, Intallura said: “Using quantum random numbers as your entropy source to classical simulation does not change any of the underlying model practices in classical models. You’re just injecting a different source of entropy than what we would [normally] use.”
    For Intallura, regulatory pressure and the need to ensure financial transactions are secure are helping to inform quantum computing plans at financial institutions. The US National Institute of Standards and Technology has ratified a number of post-quantum cryptography (PQC) standards. Banks face pressure from regulators to replace RSA-2048 encryption by 2035 and migrate fully to quantum-safe encryption standards to protect banking transactions. But, as Mark Carney, lead of quantum cyber security research at Santander Global, noted, post-quantum cryptography needs both software and hardware acceleration. “We want to be able to have PQC at speed in our devices and on our payment cards,” he said. “We want to give our customers the very best cryptography that we possibly can – not just for regulatory purposes, but also because it gives a sense of assurance.”
    Among the promises of quantum computing is that it can be applied to solve complex optimisation problems. As and when quantum systems become commercially viable, they will need to work alongside traditional enterprise IT. This is something that Gerard Mullery, interim CEO of Oxford Quantum Circuits, recognised during his presentation at the event. Mullery sees a need for quantum computing to be embedded in enterprise workflows. “As AI agents autonomously orchestrate enterprise workflows, quantum compute platforms must be designed to integrate with them,” he added.
    What is clear from the experts who spoke at the Commercialising Quantum Computing conference is that a useful machine, one with enough logical qubits to solve real-world problems, is perhaps only a few years away. As such devices evolve, it is likely that more organisations will draw on quantum computing for certain combinatorial optimisation problems, and those systems will need to integrate with classical computing in the datacentre. As quantum computing becomes more accessible, there will also be a need to bolster cryptography with PQC.
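    Intallura's remark about swapping the entropy source without touching the model can be illustrated with a small sketch. This is not HSBC's implementation: the `EntropySource` interface, the function names and the toy price model (geometric Brownian motion with made-up parameters) are all assumptions for illustration, and the operating system's random bytes stand in for a quantum random number generator, for which no real API is assumed here.

```python
import math
import random
import secrets
from typing import Callable

# Hypothetical interface: an entropy source is any callable returning n random bytes.
# A quantum random number service could be wrapped to match it (assumption).
EntropySource = Callable[[int], bytes]


def os_entropy(n: int) -> bytes:
    """Stand-in entropy source that draws bytes from the operating system."""
    return secrets.token_bytes(n)


def seeded_rng(entropy: EntropySource) -> random.Random:
    """Seed a classical generator from the supplied entropy source."""
    seed = int.from_bytes(entropy(32), "big")
    return random.Random(seed)


def simulate_terminal_price(rng: random.Random, s0: float = 100.0,
                            mu: float = 0.05, sigma: float = 0.2,
                            steps: int = 252) -> float:
    """Toy classical model: one geometric Brownian motion path over one year."""
    dt = 1.0 / steps
    price = s0
    for _ in range(steps):
        z = rng.gauss(0.0, 1.0)
        price *= math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z)
    return price


def mean_terminal_price(entropy: EntropySource, paths: int = 1000) -> float:
    """Monte Carlo estimate; the model code never sees where the entropy came from."""
    rng = seeded_rng(entropy)
    return sum(simulate_terminal_price(rng) for _ in range(paths)) / paths


if __name__ == "__main__":
    print(mean_terminal_price(os_entropy))
```

    Only the callable passed to `mean_terminal_price` would change if a different entropy source were used; the simulation itself stays the same, which is the point of the quote.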
CGShares https://cgshares.com