When Luna was seven months old, she began wearing a bright pink helmet with a camera mounted on top, at her father’s request. The camera recorded everything she saw, heard, and said for about an hour at a time.
Luna’s father, Brenden Lake, is a cognitive scientist at New York University who works on improving how artificial intelligence is trained. One recent Sunday morning, while Luna played with her wooden toys, Lake held up a robot hand puppet and asked, in a playful Muppet voice, “Is this for the robot?” Luna glanced up with only mild interest, the way small children do when absorbed in their own world. A few minutes later, though, she came back, picked up the puppet, and said, clearly, “Robot.” Her father was taken aback: he had never heard her use the word before. Had she just learned it?
At eighteen months old, Luna has already mastered a skill that current AI models struggle with. Humans can learn from remarkably few examples, sometimes forming an association after a single encounter, such as linking a silver hand puppet to the word “robot.” Artificial intelligence, by contrast, typically needs far more: the large language models behind systems like ChatGPT are trained on hundreds of billions, if not trillions, of words. Hearing that many words, Lake notes, would take a person a millennium. If humans learn language with so much less time and input, the question follows: Could AI be trained more the way toddlers learn?
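To get a rough sense of that scale, here is a quick back-of-the-envelope sketch. The speaking rate and the assumption of round-the-clock listening are mine, not the researchers’; the point is only the order of magnitude.

```python
# Rough estimate: how long would a person need to listen in order to hear
# as many words as a large language model is trained on?
# Assumption (not from the article): continuous listening at roughly
# 150 words per minute, a typical conversational speaking rate.

WORDS_PER_MINUTE = 150
MINUTES_PER_YEAR = 60 * 24 * 365  # listening nonstop, with no sleep

def years_to_hear(total_words: int) -> float:
    """Years of round-the-clock listening needed to hear `total_words`."""
    return total_words / (WORDS_PER_MINUTE * MINUTES_PER_YEAR)

print(f"{years_to_hear(300_000_000_000):,.0f} years for 300 billion words")
print(f"{years_to_hear(1_000_000_000_000):,.0f} years for 1 trillion words")
# Even under these generous assumptions, the answer is measured in
# millennia -- which is Lake's point.
```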
These questions motivated Lake to document his daughter’s early experiences. Luna is one of roughly 25 infants in the BabyView study at Stanford, a project that records the sights and sounds children are exposed to during the period of rapid language learning in early childhood. The goal is to use Luna’s data, and that of the other participants, both to improve how AI is trained and to better understand how children learn language.
Recent advances in artificial intelligence and in hardware have opened new avenues for developmental psychologists. Cameras and microphones are now small enough that infants can comfortably wear them for extended stretches, including at home. The BabyView study, led by Michael Frank of Stanford, builds on earlier efforts such as SAYCam, which put head-mounted cameras on babies to track their development and produced valuable research data sets.
Lake’s involvement in BabyView grew out of his interest in using the SAYCam data to train AI models. In a paper published in Science, his team at NYU showed that models trained on just a fraction of one baby’s visual and auditory experience could learn to classify objects such as balls, cats, and cars. These models do not capture how toddlers actually learn, given how limited their sensory input is, but they serve as a proof of concept for AI training inspired by child development.
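The team’s broad recipe, pairing what the child saw with the words the child heard, is in the spirit of contrastive vision-language training. The sketch below is a minimal, hypothetical illustration of that idea, with made-up model sizes and stand-in data; it is not the paper’s actual code or architecture.

```python
# Minimal sketch of contrastive learning on (video frame, transcribed word)
# pairs, in the spirit of training a model on a child's sights and sounds.
# Hypothetical illustration only: model sizes, names, and data are made up.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyImageEncoder(nn.Module):
    """Maps a small image to a normalized embedding vector."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, dim),
        )
    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)

class TinyWordEncoder(nn.Module):
    """Maps a word index (e.g. 'ball', 'cat', 'car') to a normalized embedding."""
    def __init__(self, vocab_size=1000, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
    def forward(self, w):
        return F.normalize(self.emb(w), dim=-1)

def contrastive_loss(img_emb, word_emb, temperature=0.07):
    """Pull matching frame/word pairs together, push mismatched pairs apart."""
    logits = img_emb @ word_emb.t() / temperature
    targets = torch.arange(len(img_emb))
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

# Toy training step on random stand-in data.
images = torch.randn(8, 3, 64, 64)    # eight frames from a head camera
words = torch.randint(0, 1000, (8,))  # the word heard with each frame
img_enc, word_enc = TinyImageEncoder(), TinyWordEncoder()
opt = torch.optim.Adam(list(img_enc.parameters()) + list(word_enc.parameters()), lr=1e-3)
loss = contrastive_loss(img_enc(images), word_enc(words))
loss.backward()
opt.step()
print(f"contrastive loss: {loss.item():.3f}")
```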
Lake hopes to give artificial intelligence some of the learning strategies that come naturally to children, such as the ability to infer the meaning of a new word from context. By comparing how AI models and children learn, he and other researchers hope to gain new insight into both artificial and human intelligence.
Sarah Zhang is a staff writer at The Atlantic.