How many P’s are in “Google”? According to Google, there are two.
According to Google’s AI Overview, “The word ‘poop’ has exactly one ‘r’ in it,” and the word ‘journalism’ has two ‘d’s but is spelled ‘journadism.’ Google also identified at least one P in the U.S. president’s last name, which it spelled ‘trpum.’
You didn’t need to be a prophet to predict that Google’s AI-powered search overhaul would stumble. We’ve seen this before. When Google first added AI Overview to search, the feature ended up citing satirical posts from The Onion and Reddit, advising people to eat rocks and put glue on pizza.
It’s no surprise that Google is stumbling again as it doubles down on making generative AI the centerpiece of its 29-year-old flagship product.
“Counting within words is a known challenge for LLMs, and we are working on solving this specific problem,” Google told TechCrunch in an emailed statement.
These basic misspellings may seem familiar. LLMs, the type of artificial intelligence that powers chatbots and other text generation tools, are not designed to understand spelling. It has been a running joke for years that whenever a company announces a new AI model, someone should ask it how many “r”s there are in the word “strawberry.” These AI models can code apps in seconds and solve problems that have puzzled mathematicians for decades, yet they spell almost as well as a kindergartener.
The problems with Google’s AI Overview go beyond silly spelling mistakes, however. Google has already fixed an issue from last week in which searching for the word “ignore” brought up what looked like a dictionary definition, only for that definition to read, “Okay, let me know if you have any new prompts or questions!” Still, the spelling mistakes remain amusing because they are so difficult to fix.
As researchers previously explained when we asked about these spelling challenges, AI does not recognize sentences as units of language made up of words and letters. Many LLMs are built on the transformer architecture, which breaks text into tokens. Depending on the model, a token can be a complete word, a syllable, or a single letter. Instead of “reading” like a human, the AI converts the text into a numerical representation and contextualizes it to produce a logical response.

“LLMs are based on this transformer architecture, but it’s not actually reading the text. What happens when you type a prompt is it converts it into an encoding,” Matthew Guzdial, an AI researcher and assistant professor at the University of Alberta, told TechCrunch. “When it sees the word ‘the,’ it encodes what ‘the’ means, but it doesn’t know about ‘T,’ ‘H,’ or ‘E.’”
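To make the point about encodings concrete, here is a minimal sketch in Python. It is not how Google’s systems or any real tokenizer work: the vocabulary, the integer IDs, and the greedy longest-match rule are all assumptions made up for illustration (production tokenizers learn their vocabularies from data). It only shows why a model that receives token IDs has no direct view of the letters inside a word.

```python
# Hypothetical toy vocabulary: text pieces mapped to made-up integer IDs.
VOCAB = {
    "straw": 101, "berry": 102, "the": 103,
    "s": 1, "t": 2, "r": 3, "a": 4, "w": 5,
    "b": 6, "e": 7, "y": 8, "h": 9,
}

def tokenize(word: str) -> list[int]:
    """Split `word` into token IDs using greedy longest-match (an illustrative rule)."""
    ids = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {word[i]!r}")
    return ids

# What the model "sees": two opaque IDs, with no letters attached.
print(tokenize("strawberry"))   # [101, 102]

# What ordinary string code sees: individual characters, so counting is trivial.
print("strawberry".count("r"))  # 3
```

In this toy setup the three “r”s are split across the IDs 101 and 102, and nothing in those numbers records how many “r”s each piece contains.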
The token-based architectures behind LLMs like the one powering Google’s AI Overview are inherently limited, and researchers weren’t optimistic that the spelling problem could be solved.
“It’s hard to get around the question of what exactly a ‘word’ should be for a language model,” Sheridan Feucht, a PhD student at Northeastern University who studies the interpretability of large language models, told TechCrunch. “My guess is that because of this kind of ambiguity, there is no such thing as a perfect tokenizer.”
This is not necessarily a pressing issue for researchers, as the usefulness of LLMs does not hinge on their ability to spell. But these glaring failures serve as a reminder that AI is not perfect, even though it can seem like an omniscient force beyond our understanding. You cannot blindly trust AI output without double-checking its accuracy.
