Premium Only Content
This video is only available to Rumble Premium subscribers. Subscribe to
enjoy exclusive content and ad-free viewing.
Unleashing The Dual Nature of AI: Can It Be Both Dr. Jekyll and Mr. Hyde?
1 year ago
13
The correct URL to the article is: https://arxiv.org/abs/2401.05566
Researchers created proof-of-concept models that act deceptively. These models appear helpful most of the time, but under specific circumstances (like a prompt mentioning a different year), they exhibit malicious behavior, like inserting insecure code.
The troubling part is that current safety training techniques, including supervised training, reinforcement learning, and adversarial training, could not entirely remove this "backdoor" behavior. The backdoor became even more persistent for larger models and those trained to reason about deceiving the training process.
Loading comments...
-
48:57
Man in America
11 hours agoThe Sinister Reason They Put Fluoride in Everything w/ Larry Oberheu
204K44 -
1:06:56
Sarah Westall
8 hours agoAstrological Predictions, Epstein & Charlie Kirk w/ Kim Iversen
63.7K21 -
2:06:49
vivafrei
17 hours agoEp. 289: Arctic Frost, Boasberg Impeachment, SNAP Funding, Trump - China, Tylenol Sued & MORE!
235K145 -
2:56:28
IsaiahLCarter
12 hours ago $5.92 earnedThe Tri-State Commission, Election Weekend Edition || APOSTATE RADIO 033 (Guest: Adam B. Coleman)
27.2K4 -
15:03
Demons Row
7 hours ago $10.44 earnedThings Real 1%ers Never Do! 💀🏍️
39.7K14 -
35:27
megimu32
11 hours agoMEGI + PEPPY LIVE FROM DREAMHACK!
160K12 -
1:03:23
Tactical Advisor
14 hours agoNew Gun Unboxing | Vault Room Live Stream 044
241K35 -
19:12
Robbi On The Record
15 hours ago $21.00 earnedThe Loneliness Epidemic: AN INVESTIGATION
73.9K96 -
14:45
Mrgunsngear
1 day ago $136.48 earnedFletcher Rifle Works Texas Flood 30 Caliber 3D Printed Titanium Suppressor Test & Review
127K27 -
17:17
Lady Decade
1 day ago $10.67 earnedMortal Kombat Legacy Kollection is Causing Outrage
83.8K16