
AI Agents that “Self-Reflect” Perform Better in Changing Environments - Stanford HAI
Who would you pick to win in a head-to-head competition: a state-of-the-art AI agent or a mouse? Isaac Kauvar, a Wu Tsai Neurosciences Institute interdisciplinary postdoctoral scholar, and Chris Doyle, a machine learning researcher at Stanford, decided to pit them against each other to find out. Working in the lab of Nick Haber, an assistant professor in the Stanford Graduate School of Education, Kauvar and Doyle designed a simple task based on their longtime interest in a skill set that animals naturally excel at: exploring and adapting to their surroundings.

Kauvar put a mouse in a small empty box and, similarly, placed a simulated AI agent in an empty 3D virtual arena. Then he introduced a red ball into both environments and measured which would be quicker to explore the new object. The mouse quickly approached the ball and repeatedly interacted with it over the next several minutes. The AI agent, however, didn't seem to notice it.

"That wasn't expected," said Kauvar. "Already, we were realizing that even with a state-of-the-art algorithm, there were gaps in performance."

The scholars wondered: Could such seemingly simple animal behaviors serve as inspiration for improving AI systems? That question led Kauvar, Doyle, graduate student Linqi Zhou, and Haber to design a new training method called curious replay, which prompts AI agents to self-reflect on the most novel and interesting things they recently encountered. Adding curious replay was all that was needed for the AI agent to approach and engage with the red ball much faster. It also dramatically improved performance on Crafter, a benchmark game based on Minecraft. The results of this project, currently published on the preprint service arXiv, will be presented at the International Conference on Machine Learning on July 25.
Learning Through Curiosity

It may seem like curiosity offers only intellectual benefits, but it is crucial to survival, both for avoiding dangerous situations and for finding necessities like food and shelter. That red ball in the experiment could be leaking a deadly poison or covering a nourishing meal, and it would be difficult to find out which if we ignored it. That's why labs like Haber's have recently been adding a curiosity signal to drive the behavior of AI agents, in particular model-based deep reinforcement learning agents. This signal tells them to select the action that will lead to a more interesting outcome, like opening a door rather than disregarding it.

Read the full study: Curious Replay for Model-based Adaptation

But this time, the team used curiosity for AI in a new way: to help the agent learn about its world, not just make decisions. "Instead of choosing what to do, we want to choose what to think about, more or less — what experiences from our past do we want to learn from," said Kauvar. In other words, they wanted to encourage the AI agent to self-reflect, in a sense, on its most interesting or peculiar (and thus curiosity-related) experiences. That way, the agent may be prompted to interact with the object in different ways to learn more, which would build its understanding of the environment and perhaps encourage curiosity toward additional objects, too.

To accomplish this kind of self-reflection, the researchers modified a common method used to train AI agents, called experience replay. Here, an agent stores memories of all its interactions and then replays some of them at random to learn from them again. The method was inspired by research on sleep: neuroscientists have found that a brain region called the hippocampus "replays" events of the day (by reactivating certain neurons) to strengthen memories.
In AI agents, experience replay has led to high performance in scenarios where the environment rarely changes and clear rewards are given for the right behaviors. But to be successful in a changing environment, the researchers reasoned, it would make more sense for AI agents to prioritize replaying the most interesting experiences, like the appearance of a new red ball, rather than replaying the empty virtual room over and over. They named their new method curious replay and found that it worked immediately. "Now, all of a sudden, the agent interacts with the ball much more quickly."
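The core idea described above, replaying high-interest experiences more often than routine ones, can be illustrated with a toy sketch. This is not the authors' implementation; it assumes each stored experience carries a scalar "surprise" score (for example, a world model's prediction error when the experience was first seen) and samples replay batches in proportion to that score:

```python
import random

class CuriousReplayBuffer:
    """Toy sketch of a curiosity-prioritized replay buffer.

    Assumption (not from the paper): each experience comes with a
    scalar `surprise` score, and replay sampling is weighted by it,
    so novel events (a new red ball) are revisited far more often
    than routine ones (an empty room).
    """

    def __init__(self, capacity=10_000):
        self.capacity = capacity
        self.experiences = []   # stored transitions
        self.priorities = []    # matching surprise scores

    def add(self, experience, surprise):
        # Evict the oldest entry once the buffer is full.
        if len(self.experiences) >= self.capacity:
            self.experiences.pop(0)
            self.priorities.pop(0)
        self.experiences.append(experience)
        self.priorities.append(max(surprise, 1e-6))  # keep weights positive

    def sample(self, batch_size):
        # Weighted sampling: high-surprise experiences are replayed more often.
        return random.choices(self.experiences,
                              weights=self.priorities,
                              k=batch_size)


buf = CuriousReplayBuffer()
for step in range(100):
    buf.add(("empty room", step), surprise=0.01)   # routine experience
buf.add(("red ball appears", 100), surprise=5.0)  # novel, surprising event

batch = buf.sample(32)
# The novel event dominates the batch despite being 1 of 101 stored memories.
```

Plain uniform experience replay corresponds to giving every entry the same weight; the curiosity-weighted variant sketched here is what lets a rare but surprising event, like the red ball, keep shaping the agent's model instead of being drowned out by thousands of empty-room memories.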