Apple is the latest company to get pwned by AI
www.computerworld.com
It's happened yet again, this time to Apple. Apple recently had to disable AI-generated news summaries in its News app in iOS 18.3. You can guess why: the AI-driven Notification Summaries for the news and entertainment categories in the app occasionally hallucinated, lied, and spread misinformation.

Sound familiar?

Users complained about the summaries, but Apple acted only after a complaint from BBC News, which told Apple that several of its notifications were improperly summarized.

These were major errors in some cases. The generative AI (genAI) tool incorrectly summarized a BBC headline, falsely claiming that Luigi Mangione, who was charged with murdering UnitedHealthcare CEO Brian Thompson, had shot himself. It inaccurately reported that Luke Littler had won the PDC World Darts Championship hours before the competition had even begun, and it falsely claimed that Spanish tennis star Rafael Nadal had come out as gay.

Apple summarized other real stories with false information: The tool claimed that Israeli Prime Minister Benjamin Netanyahu had been arrested, that Pete Hegseth had been fired, and that Trump tariffs had triggered inflation (before Donald Trump had re-assumed office), and it spewed dozens of other falsehoods.

Apple rolled out the feature not knowing it would embarrass the company and force a retreat, which is amazing when you consider that this happens to every other company that tries to automate genAI information delivery of any kind on a large scale.

Microsoft Start's travel section, for example, published an AI-generated guide to Ottawa that listed the Ottawa Food Bank as a tourist hotspot, encouraging visitors to come on an empty stomach.

In September 2023, Microsoft's news portal MSN ran an AI-generated obituary for former NBA player Brandon Hunter, who had passed away at the age of 42. The obituary headline called Hunter "useless at 42," while the body of the text said that Hunter had performed in 67 video games over two seasons.

Microsoft's news aggregator, MSN, also attached an inappropriate AI-generated poll to a Guardian article about a woman's death. The poll asked readers to guess the cause of death, offering options like murder, accident, or suicide.

During its first public demo in February 2023, Google's Bard AI incorrectly claimed that the James Webb Space Telescope had taken the first pictures of a planet outside our solar system, nearly two decades after the first extrasolar planets were photographed.

These are just a few examples out of many.

The problem: AI isn't human

The Brandon Hunter example is instructive. The AI knows enough about language to know that a person who does something is useful, that death means they can no longer do that thing, and that the opposite of useful is useless. But the AI does not have a clue that saying in an obituary that a person's death makes them useless is problematic in the extreme.

Chatbots based on large language models (LLMs) are inherently tone-deaf, ignorant of human context, and unable to tell the difference between fact and fiction, between truth and lies. They are, for lack of a better term, sociopaths, unable to distinguish the emotional weight of an obituary from that of a corporate earnings report.

There are several reasons for these errors. LLMs are trained on massive datasets that contain errors, biases, and inconsistencies. Even if the data is mostly reliable, it may not cover every topic a model is expected to generate content about, leading to gaps in knowledge.

Beyond that, LLMs generate responses based on statistical patterns, using probability to choose words rather than understanding or thinking. (They've been described as next-word prediction machines.)
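To make the "next-word prediction machine" point concrete, here is a minimal, purely illustrative sketch in Python. It is not Apple's, Google's, or anyone's actual system: the probability table and the example words are invented for this column, and a real model computes a distribution like this over its entire vocabulary using a neural network. But the mechanics are the relevant ones: the program picks whichever word is statistically likely, and nothing in the loop checks whether the resulting sentence is true.

```python
import random

# Toy stand-in for an LLM's next-word step. The "model" here is just a
# hand-written probability table (a deliberately simplified assumption);
# a real LLM computes a distribution like this over its whole vocabulary
# at every step, but the principle is the same: it picks a statistically
# plausible word, not a verified fact.
NEXT_WORD_PROBS = {
    ("has", "been"): {"arrested": 0.4, "charged": 0.3, "cleared": 0.2, "promoted": 0.1},
}

def predict_next(context, probs=NEXT_WORD_PROBS):
    """Sample the next word, conditioned on the last two words of the context."""
    dist = probs[tuple(context[-2:])]
    words, weights = zip(*dist.items())
    return random.choices(words, weights=weights, k=1)[0]

headline = ["Netanyahu", "has", "been"]
print(" ".join(headline + [predict_next(headline)]))
# No step here verifies the claim; "arrested" can come out simply because
# it is likely under this (made-up) distribution.
```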
The biggest problem, however, is that AI isn't human, sentient, or capable of thought.

Another problem: People aren't AI

Most people don't pay attention to the fact that we don't actually communicate with complete information. Here's a simple example: If I say to my neighbor, "Hey, what's up?" my neighbor is likely to reply, "Not much. You?"

A logic machine would likely respond to that question by describing the layers of the atmosphere, the satellites, and the planets and stars beyond. It would have answered the question factually, as asked, but the literal content of the question did not contain the actual information sought by the asker.

To answer that simple question in the manner expected, the responder has to be a human who is part of a culture and understands its verbal conventions, or has to be specifically programmed to respond to such conventions with the correct canned response.

When we communicate, we rely on shared understanding, context, intonation, facial expression, body language, situational awareness, cultural references, past interactions, and many other things. This varies by language: English is one of the most literally specific languages in the world, so speakers of a great many other languages will likely face even bigger problems with human-machine communication.

Our human conventions for communication are very unlikely to align with genAI tools for a very long time. That's why frequent AI chatbot users often feel like the software willfully evades their questions.

The biggest problem: Tech companies can be hubristic

What's really astonishing to me is that companies keep doing this. And by "this," I mean rolling out unsupervised, automated content-generating systems that deliver one-to-many content on a large scale.

And scale is precisely the difference.

If a single user prompts ChatGPT and gets a false or ridiculous answer, they are likely to shrug and try again, sometimes chastising the bot for its error, for which the chatbot is programmed to apologize and try again. No harm, no foul.

But when an LLM spits out a wrong answer for a million people, that's a problem, especially in Apple's case, where no doubt many users read only the summary instead of the whole story. "Wow, Israeli Prime Minister Benjamin Netanyahu was arrested. Didn't see that coming." And now some two-digit percentage of those users are walking around believing misinformation.

Each tech company believes it has better technology than the others.

Google thought: "Sure, that happened to Microsoft, but our tech is better."

Apple thought: "Sure, it happened to Google, but our tech is better."

Tech companies: No, your technology is not better. The current state of LLM technology is what it is, and we have definitely not reached the point where genAI chatbots can reliably handle a job like this.

What Apple's error teaches us

There's a right way and a wrong way to use LLM-based chatbots. The right way is to query with intelligent prompts, ask the question in several ways, and always fact-check the responses before using or believing the information.

Chatbots are great for brainstorming, for quick information on low-stakes questions, or as a mere starting point for research that leads you to legitimate sources.

But using LLM-based chatbots to write content unsupervised, at scale?
It's very clear that this is the road to embarrassment and failure.

The moral of the story is that genAI is still too unpredictable to reliably represent a company in one-to-many communications of any kind at scale.

So make sure this doesn't happen with any project under your purview. Setting up any public-facing, content-producing project meant to communicate information to large numbers of people should be a hard, categorical no until further notice.

AI is not human, can't think, and will confuse your customers and embarrass your company if you give it a public-facing role.