Siri failed super-easy Super Bowl test, getting 38 out of 58 wrong
9to5mac.com
Apple commentator John Gruber yesterday described Siri's current performance as "an unfunny joke," citing its inability to correctly name the winner of Super Bowl 13 as an example, and noting that this is a basic query that any US chatbot ought to be able to answer.

It turns out that wasn't an entirely random example: it was prompted by his friend Paul Kafasis, who decided to test Siri on Super Bowls 1 to 60 inclusive, and the results were not good. Kafasis shared the results in a blog post.

"So, how did Siri do? With the absolute most charitable interpretation, Siri correctly provided the winner of just 20 of the 58 Super Bowls that have been played. That's an absolutely abysmal 34% completion percentage. If Siri were a quarterback, it would be drummed out of the NFL.

"Siri did once manage to get four years in a row correct (Super Bowls IX through XII), but only if we give it credit for providing the right answer for the wrong reason. More realistically, it thrice correctly answered three in a row (Super Bowls V through VII, XXXV through XXXVII, and LVII through LIX). At its worst, it got an amazing 15 in a row wrong (Super Bowls XVII through XXXII)."

Siri's a big Eagles fan, it seems.

"Most amusingly, it credited the Philadelphia Eagles with an astonishing 33 Super Bowl wins they haven't earned, to go with the 1 they have."

The "right answer for the wrong reason" part refers to Siri being asked to name the winner of Super Bowl X. For unknown reasons, Siri decided to respond with a lengthy reply about Super Bowl IX, and coincidentally the winner was the same both times.

Sometimes Siri went completely off-piste and ignored the question entirely, quoting unrelated Wikipedia entries.

"Who won Super Bowl 23?"

"Bill Belichick owns the record for the most Super Bowl wins (eight) and appearances (twelve: nine times as head coach, once as assistant head coach, and twice as defensive coordinator) by an individual."

But maybe the Roman numerals cause confusion, and other AI systems struggle just as much?
Gruber decided to carry out a few spot checks.

"I haven't run a comprehensive test from Super Bowls 1 through 60 because I'm lazy, but a spot-check of a few random numbers in that range indicates that every other ask-a-question-get-an-answer agent I personally use gets them all correct.

"I tried ChatGPT, Kagi, DuckDuckGo, and Google. Those four all even fare well on the arguably trick questions regarding the winners of Super Bowls 59 and 60, which haven't yet been played. E.g., asked the winner of Super Bowl 59, Kagi's 'Quick Answer' starts: 'Super Bowl 59 is scheduled to take place on February 9, 2025. As of now, the game has not yet occurred, so there is no winner to report.'

"Super Bowl winners aren't some obscure topic, like, say, asking 'Who won the 2004 North Dakota high school boys' state basketball championship?', a question I just completely pulled out of my ass, but which, amazingly, Kagi answered correctly for Class A, and ChatGPT answered correctly for both Class A and Class B, and provided a link to this video of the Class A championship game on YouTube.

"That's amazing! I picked an obscure state (no offense to Dakotans, North or South), a year pretty far in the past, and the high school sport that I personally played best and care most about. And both Kagi and ChatGPT got it right. (I'd give Kagi an A, and ChatGPT an A+ for naming the champions of both classes, and extra credit atop the A+ for the YouTube links.)"

Gruber notes that the old Siri on macOS 15.1.1 actually does better. Sure, it seems less capable, as it gave its classic "Here's what I found on the web" response, but at least that gives links to the correct answer. New Siri doesn't.

"New Siri, powered by Apple Intelligence with ChatGPT integration enabled, gets the answer completely but plausibly wrong, which is the worst way to get it wrong. It's also inconsistently wrong: I tried the same question four times, and got a different answer, all of them wrong, each time."
It's a complete failure.

Photo by Caleb Woods on Unsplash