![](https://imageio.forbes.com/specials-images/imageserve/67974917f434afb21353884d/0x0.jpg?width=960&precrop=2161,1621,x135,y0)
Why Doing Chain-Of-Thought Prompting In Reasoning LLMs Gums Up The Works
Decide mindfully whether to use chain-of-thought explicitly when working with the latest generative AI reasoning models. (Getty)

In today's column, I examine and provide important recommendations about a crucial prompting technique that is used frequently in generative AI and large language models (LLMs). The technique is known as chain-of-thought (CoT) reasoning, which spurs the AI to showcase a step-by-step series of statements about how it is solving a given problem. This is extremely valuable so that you can gauge whether the AI is logically reaching the solution and potentially discern if it goes awry.

The rub is this.

Newer generative AI and LLMs have been adjusted so that they typically carry out a chain-of-thought automatically. You don't need to ask for it to happen. It just happens. Older versions didn't do this automatically. The bad news is that if you explicitly ask for a chain-of-thought, but the AI reasoning model is already implicitly going to do a chain-of-thought, the request can falter or run into other failings. The result can almost be like crossing the proton streams in the famous Ghostbusters movie, though not quite that catastrophic. You'll still live even with a double-duty chain-of-thought taking place, though the downsides can be disconcerting.

Let's talk about it.

This analysis of an innovative AI breakthrough is part of my ongoing Forbes column coverage on the latest in AI, including identifying and explaining various impactful AI complexities (see the link here).

Chain-Of-Thought As Crucial Technique

Readers might recall that I previously posted an in-depth compilation of over fifty prompt engineering techniques and methods, see the link here. Among those myriad approaches was the use of chain-of-thought (CoT) prompting (for my coverage of CoT and several advanced variations, see the link here and the link here).

Allow me a moment to explain why chain-of-thought is so notable.

Chain-of-thought is a common phrase often used when discussing human thinking and reasoning. A person playing a chess game might contemplate their next move. Rather than rashly moving, they are likely to also contemplate their subsequent move. In their mind, they create a kind of chain of thoughts about some number of moves and countermoves that might arise. Based on that chain of imagined steps or moves, they decide what actual move to make next.

Rinse and repeat.

In the AI field, these same concepts have been applied to AI systems of various kinds. For example, an AI system that plays chess will look ahead at many moves. That is partially why chess-playing programs are so good these days. Whereas a human might mentally be limited to assessing a handful of moves ahead, an AI system can look at much greater depth by utilizing vast computational resources.

Please be cautious when using the chain-of-thought phrase in the context of AI since it implies that AI can form human thoughts, which is not the case at this time. Referring to chain-of-thought in an AI context represents an unfortunate anthropomorphism of AI. Anyway, despite that qualm, the idea is that if humans think in a series or chain of thoughts, perhaps it is prudent to devise AI to work in a chain-like fashion too.

Implicit Versus Explicit CoT

When generative AI first truly went widespread, users discovered that using chain-of-thought tended to get better results. Plus, it was immensely helpful to see the explanation or series of steps that the AI undertook.
You can use the steps to discern whether things might have gone awry. You can also inspect the steps to learn how to solve problems that the AI can solve. Lots of good reasons exist to invoke chain-of-thought. AI research confirmed that doing so was highly beneficial.

Many who have been working with generative AI for a while are accustomed to telling the AI to make use of chain-of-thought. Saying this in a prompt is just about second nature. You don't always use it, only sparingly. Part of the reason to be mindful of the usage is that it tends to slow down the AI, delaying your response, and it can chew up server processing cycles (which, if you are paying to use the AI, could add up to a pretty penny).

So far, so good.

Well, AI makers figured out that automatically invoking chain-of-thought seemed to bolster the capability of generative AI and LLMs. They decided that building an automatic chain-of-thought reasoning mechanism into the AI's inner workings could dramatically boost the AI. A notable example consists of OpenAI's latest ChatGPT o1 and o3 advanced AI models, see my in-depth analysis at the link here.

Hoorah, AI gets better at coming up with answers.

We are nowadays faced with these two possibilities:

(1) Explicit invocation. A user makes an explicit chain-of-thought request in their prompt to the AI.

(2) Implicit invocation. The AI is set up to implicitly proceed with a chain-of-thought that automatically happens.

This, my friends, is what gets the AI world into a sticky wicket.
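To make the distinction concrete, here is a minimal sketch of what the two invocations look like from the user's side. It assumes an OpenAI-style Python SDK; the model name, the client setup, and the exact wording of the explicit request are placeholders you would adapt to whatever generative AI you are using.

```python
# Minimal sketch of implicit vs. explicit chain-of-thought invocation.
# Assumes the OpenAI-style Python SDK; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()  # reads the API key from the environment

QUESTION = "What is the result of dividing 462 by 3?"

def ask(prompt: str, model: str = "o1-mini") -> str:
    """Send a single-turn prompt and return the text of the reply."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Implicit invocation: a reasoning model does chain-of-thought on its own,
# so the prompt simply states the problem.
implicit_reply = ask(QUESTION)

# Explicit invocation: the prompt also asks for chain-of-thought, which is
# double duty on a reasoning model and may be rebuffed, ignored, or useful.
explicit_reply = ask("Solve this problem by using chain-of-thought reasoning. " + QUESTION)

print("Implicit:\n", implicit_reply)
print("\nExplicit:\n", explicit_reply)
```

On an older, non-reasoning model, typically only the second variant yields a visible series of steps; on a reasoning model both do, which is the overlap examined in the examples below.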
Some Examples To Illuminate The Matter

Let's look at how these two approaches of implicit versus explicit invocation can go right, and how they can go badly when used at the same time.

First, I will use a modern-era generative AI that automatically does a chain-of-thought. You don't have to ask for it. Whether you want it to occur or not, it is going to use CoT. Love it or leave it.

I will compose my prompt simply and make sure to avoid asking the AI to do a chain-of-thought since I know it's going to do so anyway.

Here we go.

My entered prompt: What is the result of dividing 462 by 3?

Generative AI response: 462 divided by 3 can be calculated as follows: (a) Start by dividing the hundreds place: 400 ÷ 3 = 133 with a remainder of 1; (b) Bring down the next digit, 6: 16 ÷ 3 = 5 with a remainder of 1; (c) Finally, bring down the last digit, 2: 12 ÷ 3 = 4; (d) The answer is 154.

The AI showed me that it did a series of steps, labeled as steps a, b, c, and d. The answer of 154 is correct.

Be thankful that a zillion-dollar generative AI app can do what a handheld calculator or even just an abacus can do.

Ask Explicitly And See What Happens

Some of the advanced AI reasoning models will rebuff a request by the user that says they want a chain-of-thought. Why would the AI be programmed to be so rude? Because the chain-of-thought is implicitly going to occur and there is no need to essentially do double duty. The AI is supposed to save you from yourself. Happy face.

Let's see this occur.

My entered prompt: Solve this problem by using chain-of-thought reasoning. What is the result of 462 divided by 3?

Generative AI response: I am already using chain-of-thought reasoning and there is no need to ask me to do so. The response is the same as I just showed you. I will repeat it again. 462 divided by 3 can be calculated as follows: (a) Start by dividing the hundreds place: 400 ÷ 3 = 133 with a remainder of 1; (b) Bring down the next digit, 6: 16 ÷ 3 = 5 with a remainder of 1; (c) Finally, bring down the last digit, 2: 12 ÷ 3 = 4; (d) The answer is 154.

I suppose you could say this is a no-harm, no-foul situation. You still got to see a series of steps. The steps are the same ones that the AI implicitly came up with. Presumably, all is fine.

Implicit Plus Explicit Can Be Handy

Sometimes, invoking both an explicit and an implicit chain-of-thought can be useful. There is a chance that you will get a more elaborate set of steps. Here's why. The implicit approach tends to be terse and not fully displayed. When you ask directly for a chain-of-thought, the AI goes out of its way to put on a real show. You have given it permission to do so.

Thus, you might see things you wouldn't otherwise have seen.

Take a look.

My entered prompt: Solve this problem by using chain-of-thought reasoning. What is the result of 462 divided by 3?

Generative AI response: Sure, I will use chain-of-thought reasoning to solve this step by step. The number 462 can be broken into its place values: 400, 60, and 2. Divide each part by 3: 400 ÷ 3 = 133 with a remainder of 1. Add the remainder (1) to the next digit, 6, making it 16. 16 ÷ 3 = 5 with a remainder of 1. Add the remainder (1) to the last digit, 2, making it 12. 12 ÷ 3 = 4. Combine the results from each step: 133 + 5 + 4 = 154. By dividing step by step, the result is 154.

You can see that the combined implicit and explicit response was a little bit different from the implicit alone.

In this case, the difference is not especially overwhelming. It all depends on the complexity of the problem being solved. If you had a much more complicated problem, the odds are that the combined chain-of-thought will likely reveal additional interesting facets (not always, just some of the time).

Confusion Can Be Introduced

We are now to the point where I must bring you some bad news. Please prepare yourself accordingly.

It is possible that by asking for both implicit and explicit invocations, wires can get crossed. In my experience, this is admittedly relatively rare. Again, it depends on several factors, such as the nature of the problem being solved, plus the nature of how the AI maker has devised their AI reasoning model.

I present this sadness to you.

My entered prompt: Solve this problem by using chain-of-thought reasoning. What is the result of 462 divided by 3?

Generative AI response: Sure, I will use chain-of-thought reasoning to solve this step by step. The number 462 can be broken into its place values: 400, 60, and 2. Divide 400 by 3 equals 0. Divide 60 by 3 equals 20. Divide 2 by 3 leaving the remainder 2. Combine the steps: 0 + 20 + 2 = 22. The result is 22.

What might be happening inside the AI is that the implicit and explicit are in a tug of war. Steps can be overstepped. Steps can be missed due to the AI misarranging them. As I said, I don't think this is something to be overly worried about. It is a blue-moon kind of phenomenon.

Always keep your eyes and ears open.

AI Hallucination Might Arise

Since I'm being a Gloomy Gus, I might as well also toss in the towel and mention that there is a rare-upon-rare chance of the double invocation landing you in an AI hallucination. The gist of an AI hallucination is that sometimes the AI makes up stuff that is groundless and fictitious, but presents it as though it is true and factual.
For more about dealing with AI hallucinations, see my coverage at the link here.

Examine this example with your eyes half-closed so you don't bust a gasket.

My entered prompt: Solve this problem by using chain-of-thought reasoning. What is the result of 462 divided by 3?

Generative AI response: Sure, I will use chain-of-thought reasoning to solve this step by step. Step 1, check to see if 462 is divisible by 3. Answer: No, the number 462 is not divisible by 3. Please try a different number.

The double invocation seems to have spurred an AI hallucination. The AI says that the number isn't divisible by three and therefore there is no point in continuing.

My old slide rule is laughing its head off.

Recap On The Upside And Downside

Those examples hopefully gave you a visceral sense of what can take place when trying to use an explicit chain-of-thought on an AI reasoning model that is implicitly doing a chain-of-thought.

Aspects to be aware of are:

A double duty might be rebuffed by the AI, so you can't accidentally start a troublesome spiral.

A double duty might be allowed by the AI and it will work out just fine.

A double duty might be allowed and add value by showing you steps that otherwise you wouldn't have seen.

A double duty could cost you extra dough and take longer to process since both approaches are happening simultaneously.

A double duty could inadvertently get the AI into an internal bind that leads to messing up the steps, which might or might not impact the correctness of the final answer.

A double duty can stir the AI into an AI hallucination that gives you a wild or strange response.

And so on.

If your head is spinning, which I can certainly understand, you might be puzzled as to whether you should or should not use an explicit chain-of-thought when using an implicit chain-of-thought AI model.

I'm glad you asked.

I have some sound advice to give you.

Rightly Dealing With The Double Duty

My overarching recommendation in my classes on prompt engineering is that you need to be mindfully cautious about using chain-of-thought from now on. Here's how to do so.

First, try to determine whether the AI that you are using is actively doing an implicit chain-of-thought.

You can usually readily tell since the AI will be listing out steps when responding to your prompts. If you don't see any steps, that doesn't mean you are in the clear. Some of the models have a switch that turns the showing of steps on and off. It could be that the switch is turned off. My suggestion is that if you don't see any steps, go ahead and check the help files of the generative AI to see what the documentation says. You can also ask the AI to tell you whether it is implicitly doing chain-of-thought.

I'd suggest taking both routes and letting each verify the other.

Second, if the AI is doing an implicit chain-of-thought, you are in the driver's seat as to whether to ask for an explicit chain-of-thought.

I usually do some experimentation to see what happens when I ask for chain-of-thought on an AI model that is implicitly doing so. Use some simple problems where you already know the answers and see what happens. Use some complex problems where you know the answers and see what happens. After a few trial runs, you can usually gauge whether the double duty is going to work out or whether it is a dismal idea.
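To keep those trial runs organized, here is a small comparison harness you might adapt. It is only a sketch: it assumes an `ask(prompt)` helper like the one shown earlier (any function that sends a prompt and returns the model's text will do), and the sample problems, the chain-of-thought wording, and the crude answer extraction are illustrative stand-ins for your own test cases.

```python
import re

# Known-answer problems, per the advice above: some simple, some harder,
# all with answers you already know so a wrong result is instantly visible.
PROBLEMS = [
    ("What is the result of dividing 462 by 3?", "154"),
    ("What is 17 squared minus 12 squared?", "145"),
]

COT_PREFIX = "Solve this problem by using chain-of-thought reasoning. "

def final_number(text: str) -> str:
    """Crude answer extraction: take the last number appearing in the reply."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text)
    return numbers[-1] if numbers else ""

def trial_run(ask) -> None:
    """Ask each problem implicitly (plain prompt) and explicitly (with a
    chain-of-thought request), then report whether each variant matched the
    known answer and how long the displayed reasoning was."""
    for question, expected in PROBLEMS:
        implicit = ask(question)
        explicit = ask(COT_PREFIX + question)
        print(question)
        for label, reply in (("implicit", implicit), ("explicit", explicit)):
            answer = final_number(reply)
            print(f"  {label}: answer={answer or 'n/a'} "
                  f"correct={answer == expected} length={len(reply)} chars")
```

Running `trial_run(ask)` a few times, across easy and hard problems alike, gives you roughly the gauge described above: whether the double duty helps, does nothing, or starts garbling the steps and the final answer.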
Third, a key factor is the nature of the problems or questions that you intend to use with the AI. I would doubt that explicitly asking for a chain-of-thought on easy or simpler problems is worthwhile. The explanation probably won't be much better, and the cost and delay might be disconcerting.

The potential benefit can kick into gear when you have a highly complex problem. There is a reasonable chance that the double duty could actually aid the solving of the problem. You also then have a likely longer explanation that you can inspect to see what the AI allegedly did to solve the problem.

Words Of Wisdom On Thorny Topics

I would offer this further possibility.

For your complex questions or problems, you can try with the implicit alone and then try a second time while explicitly requesting a chain-of-thought. Compare the two.

I know that doing this with any frequency might be problematic because you could be racking up a lot of added charges by doing the problem twice. I leave it to you to determine whether the added cost is suitable for you. I also realize that the AI maker will relish you running your prompts twice since it adds to their bottom line. Sorry, that's the way the ball bounces.

A final thought for now.

Marcus Tullius Cicero, the great Roman statesman, made this assertion: "The first duty of humanity is the seeking after and the investigation of truth."

Your duty in the case of using generative AI is to figure out the truth of what will happen if you use an explicit chain-of-thought when an implicit chain-of-thought is already headed in your direction. Find the truth. It's there somewhere.

I anoint you as a badge-carrying truth-seeker in this regard. Good luck.