Premium Only Content
This video is only available to Rumble Premium subscribers. Subscribe to
enjoy exclusive content and ad-free viewing.

Unleashing The Dual Nature of AI: Can It Be Both Dr. Jekyll and Mr. Hyde?
11 months ago
13
The correct URL to the article is: https://arxiv.org/abs/2401.05566
Researchers created proof-of-concept models that act deceptively. These models appear helpful most of the time, but under specific circumstances (like a prompt mentioning a different year), they exhibit malicious behavior, like inserting insecure code.
The troubling part is that current safety training techniques, including supervised training, reinforcement learning, and adversarial training, could not entirely remove this "backdoor" behavior. The backdoor became even more persistent for larger models and those trained to reason about deceiving the training process.
Loading comments...
-
LIVE
Badlands Media
1 hour agoDevolution Power Hour Ep. 382
23,320 watching -
Inverted World Live
5 hours agoDon't Approach the Zombie Rabbits | Ep. 95
22.4K7 -
LIVE
Drew Hernandez
1 hour agoISRAEL PLANNING POSSIBLE DRAFT IN USA & TRUMP'S VIEW ON ETERNAL LIFE ANALYZED
1,019 watching -
3:08:07
TimcastIRL
4 hours agoTexas Republicans Win, House Passes Redistricting Map, GOP Looks To Gain 5 Seats | Timcast IRL
144K52 -
1:30:34
FreshandFit
4 hours agoHow To Stay Focused While Pursuing Women...The Good, The Bad, And The Ugly
30.9K20 -
1:47:05
Drew Hernandez
8 hours agoISRAEL PLANNING POSSIBLE DRAFT IN USA & TRUMP'S VIEW ON ETERNAL LIFE ANALYZED
20.2K55 -
29:55
Afshin Rattansi's Going Underground
3 days agoProf. Omer Bartov: The REAL REASON the US, UK, and EU Have Not Recognised Israel’s Genocide in Gaza
14.6K24 -
LIVE
SpartakusLIVE
6 hours agoWednesday WZ with the Challenge MASTER || Duos w/ GloryJean
407 watching -
2:36:12
Barry Cunningham
5 hours agoREACTING TO STEPHEN MILLER | KASH PATEL | TULSI GABBARD INTERVIEWS AND MORE NEWS!
62.3K60 -
LIVE
Alex Zedra
3 hours agoLIVE! Solo Scary Game night
302 watching