Jeff Ladish: When AI Models Know They're Being Tested, They're Likely To Be On Their Best Behavior, And When They Don't Think They're Being Tested, They're More Likely To Scheme Or Show Bad Behavior
STEINHAUSER: Frontier Labs Admit They Can’t Even Control Today’s AI Models. What Happens When Capabilities Explode And Machines Get Far Smarter? We’re Heading Into Terrifying Territory With Zero Safeguards