
Gen AI fails at this very common ability of human thought

OpenAI’s GPT-3.5 and GPT-4 both fell down when challenged with greater and greater levels of …

The famous “matryoshka” dolls, where every doll has another doll nested inside of it, and another and another, are a classic example of the principle of recursion, where the thing contains itself. People instantly recognize such examples of recursion, but generative artificial intelligence has no concept of it.

Tiernan Ray for ZDNET

Among the sublime concepts humans grasp intuitively is the concept of recursion, where something contains a kind of repetition of itself stretching out to infinity. Any “movie within a movie” could be an example of recursion. The famous Matryoshka “nesting dolls” from Russia, where each wooden doll opens to reveal a smaller doll, are an endearing example of recursion.

A picture of yourself holding a picture of yourself in which you are holding a picture of yourself, ad infinitum, is a classic recursive visual fantasy that creates delight in anyone.

But artificial intelligence programs, including generative AI programs such as ChatGPT, cannot deal with recursion, at least, not consistently, according to scholars at the University of Illinois Urbana-Champaign. The inability to handle recursion hampers AI’s performance in tasks. In particular, AI can fail in programming tasks where some bit of code repeats a smaller, atomic version of the same code.

“For example, models trained for programming tasks may misunderstand programs with nested blocks or produce imbalanced brackets,” write lead author Shizhuo Dylan Zhang of U of Illinois and colleagues, in the paper, “Transformer-Based Models Are Not Yet Perfect At Learning to Emulate Structural Recursion,” posted on the arXiv pre-print server this month.

To examine how Gen AI gets stuck on recursion, the authors devise a simple test for large language models. Given a piece of code such as a function that adds numbers recursively, can a large language model produce a description of the pattern of recursion in the code?
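As a rough illustration of the kind of code the test involves (not the paper's exact benchmark, which is described below in terms of tree traversal), here is a minimal recursive-addition function in Python:

```python
def add(m, n):
    """Recursively add two non-negative integers.

    The recursive pattern an LLM would be asked to describe:
    add(m, n) reduces to add(m - 1, n + 1) until m hits zero.
    """
    if m == 0:                    # base case: nothing left to move over
        return n
    return add(m - 1, n + 1)      # recursive step: shrink m, grow n

print(add(3, 4))  # prints 7
```

The human-readable description of the pattern, “move one unit from the first argument to the second until the first is zero,” is the kind of rule the researchers ask the models to extract.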

Zhang and team choose a classic programming task, called a tree traversal. A data type in computing is a tree type if it has branching parts that lead to further branches, a kind of tree that contains more trees. To “traverse” the tree means to inspect each part of the tree in a certain order.
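A tree traversal can be sketched in a few lines of Python. This is a generic illustration, not the authors' benchmark code: a binary tree visited recursively in pre-order, where each branch is itself a smaller tree handled by the same function.

```python
class Node:
    """A binary tree node: a value plus two optional subtrees."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def preorder(node):
    """Visit the node first, then recursively traverse each subtree."""
    if node is None:              # base case: an empty subtree
        return []
    return [node.value] + preorder(node.left) + preorder(node.right)

# A tree of depth three; deeper trees require more recursive steps,
# which is where the paper finds the models start to fail.
tree = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(preorder(tree))  # prints [1, 2, 4, 5, 3]
```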

Into the language models, in this case GPT-3.5 “Turbo” and GPT-4, OpenAI’s most powerful models, they enter a prompt containing the problem statement, an example of a tree traversal, and an instruction to perform a tree traversal. The output from the GPTs should be the solution to the traversal along with an explanation of the rule used to perform it.

What they find is the programs start to fail in solving the traversal as the tree gets bigger, or, in computing terms, has greater depth. “The performance of large language models (LLMs) exhibits a significant decline as the tree depth increases, correlating with a rise in the complexity and number of reduction steps required,” they write.

A qualitative analysis, they write, reveals that the language models cannot carry out the right “reduction,” meaning, replacing some element in the tree with the recursive element it contains, in order to successfully continue the traversal. “This type of error underscores the challenges LLMs face in maintaining algorithmic consistency over extended sequences, particularly when the task requires adherence to a precise order of operations,” they write.
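The idea of step-wise reduction can be made concrete with a small sketch (a hypothetical illustration, not the paper's evaluation code): each step rewrites one recursive call into the next until the stopping condition is reached, and a model that “gets lost in the middle” produces a trace that breaks this chain.

```python
def reduce_trace(m, n):
    """Trace the step-wise reduction of add(m, n) = add(m - 1, n + 1)."""
    steps = [f"add({m}, {n})"]
    while m > 0:                  # apply exactly one reduction per step
        m, n = m - 1, n + 1
        steps.append(f"add({m}, {n})")
    steps.append(str(n))          # stopping condition: m == 0
    return steps

print(reduce_trace(2, 3))
# prints ['add(2, 3)', 'add(1, 4)', 'add(0, 5)', '5']
```

Producing such a trace requires applying the same rule in a precise order at every step, which is exactly the algorithmic consistency the researchers found the models lacked.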

Not only do the large language models fail to properly solve some traversals, but the programs fail to explain, given an example in the prompt, what the rule is that might be used for a successful traversal.

“LLMs struggle to find the correct rules from in-context examples,” write Zhang and team, “and fail in interesting ways when performing in-context step-wise reduction.”

What does it mean that the programs can’t handle recursion?

At one level, the inability to work with recursion speaks to a lack of logic, write Zhang and team. “Performing each reduction step in a tree traversal requires more sophisticated logical reasoning,” and that’s just not in the programs.

However, even humans who don’t have a very good grasp of formal logic can instantly appreciate the kind of recursion that occurs in the “movie within a movie” type and the other examples.

On a much simpler level, the design of the GPTs needs to be reconsidered by their creators, the authors suggest.

“We hypothesize the fundamental cause [of failures] is that the model has not been optimized to represent recursive patterns well enough,” they write. “Our in-context learning experiments revealed the lack of ability for LLMs to ‘think recursively’ as it tends to mine non-recursive rules from data.

“Also, the models lack accuracy when performing step-wise reduction, where they may get lost in the middle or fail to apply the stopping condition.”

Left out of the study are all the ways that the inability to handle recursion may hamper the performance of Gen AI on various tasks, from essay writing to computer programming. It’s true that, as Zhang and team note, the programs are able to compensate, up to a point, with little tricks they pick up.

And yet, recursion is such a fundamental phenomenon that at some point, its absence is bound to have consequences for the programs.

Last modified: April 20, 2024