Author Interview - ACCEL: Evolving Curricula with Regret-Based Environment Design
#ai #accel #evolution
This is an interview with the authors Jack Parker-Holder and Minqi Jiang.
Original Paper Review Video: https://www.youtube.com/watch?v=povBD...
Automatic curriculum generation is one of the most promising avenues for Reinforcement Learning today. Multiple approaches have been proposed, each with its own set of advantages and drawbacks. This paper presents ACCEL, which takes the next step toward constructing curricula for multi-capable agents. ACCEL combines the adversarial adaptiveness of regret-based sampling methods with the level-editing capabilities usually found in evolutionary methods.
OUTLINE:
0:00 - Intro
1:00 - Start of interview
4:45 - How did you get into this field?
8:10 - What is minimax regret?
11:45 - What levels does the regret objective select?
14:20 - Positive value loss (correcting my mistakes)
21:05 - Why is the teacher not learned?
24:45 - How much domain-specific knowledge is needed?
29:30 - What problems is this applicable to?
33:15 - Single agent vs population of agents
37:25 - Measuring and balancing level difficulty
40:35 - How does generalization emerge?
42:50 - Diving deeper into the experimental results
47:00 - What are the unsolved challenges in the field?
50:00 - Where do we go from here?
Website: https://accelagent.github.io
Paper: https://arxiv.org/abs/2203.01302
ICLR Workshop: https://sites.google.com/view/aloe2022
Book on topic: https://www.oreilly.com/radar/open-en...
Abstract:
It remains a significant challenge to train generally capable agents with reinforcement learning (RL). A promising avenue for improving the robustness of RL agents is through the use of curricula. One such class of methods frames environment design as a game between a student and a teacher, using regret-based objectives to produce environment instantiations (or levels) at the frontier of the student agent's capabilities. These methods benefit from their generality, with theoretical guarantees at equilibrium, yet they often struggle to find effective levels in challenging design spaces. By contrast, evolutionary approaches seek to incrementally alter environment complexity, resulting in potentially open-ended learning, but often rely on domain-specific heuristics and vast amounts of computational resources. In this paper we propose to harness the power of evolution in a principled, regret-based curriculum. Our approach, which we call Adversarially Compounding Complexity by Editing Levels (ACCEL), seeks to constantly produce levels at the frontier of an agent's capabilities, resulting in curricula that start simple but become increasingly complex. ACCEL maintains the theoretical benefits of prior regret-based methods, while providing significant empirical gains in a diverse set of environments. An interactive version of the paper is available at https://accelagent.github.io.
Authors: Jack Parker-Holder, Minqi Jiang, Michael Dennis, Mikayel Samvelyan, Jakob Foerster, Edward Grefenstette, Tim Rocktäschel
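Sketch (not from the paper): the loop described in the abstract, keeping a buffer of high-regret levels, training on them, and editing them into slightly different levels, might look roughly like the toy Python below. Everything in it is an illustrative assumption: a "level" is just an integer difficulty, the student is a single skill number, regret_proxy is a made-up stand-in for a real regret estimate, and the names run_accel_sketch, edit_level, replay_prob and buffer_size are hypothetical. It is not the authors' implementation.

```python
import random


def regret_proxy(level: int, skill: float) -> float:
    """Hypothetical regret estimate (an assumption for this toy):
    zero for levels that are already solved or far too hard,
    highest for levels about one unit above the student's skill."""
    gap = level - skill
    if gap <= 0 or gap > 3:
        return 0.0
    return 1.0 - abs(gap - 1.0) / 2.0


def edit_level(level: int, rng: random.Random) -> int:
    """Small random edit of a level: nudge the difficulty by one step."""
    return max(0, level + rng.choice([-1, 1]))


def run_accel_sketch(steps=200, buffer_size=8, replay_prob=0.8, seed=0):
    rng = random.Random(seed)
    skill = 0.0      # toy student: one scalar
    buffer = {}      # level -> last regret score seen for it
    history = []     # skill after every step

    for _ in range(steps):
        if buffer and rng.random() < replay_prob:
            # Replay: train on the level currently scored as highest regret.
            level = max(buffer, key=buffer.get)
            skill += 0.1 * regret_proxy(level, skill)
            # Edit the replayed level and evaluate the child (no training).
            candidates = [level, edit_level(level, rng)]
        else:
            # Otherwise evaluate a fresh, simple random level (no training).
            candidates = [rng.randint(0, 2)]

        # (Re)score the candidates against the current student.
        for lvl in candidates:
            buffer[lvl] = regret_proxy(lvl, skill)

        # Keep only the highest-scoring levels in the buffer.
        if len(buffer) > buffer_size:
            keep = sorted(buffer, key=buffer.get, reverse=True)[:buffer_size]
            buffer = {lvl: buffer[lvl] for lvl in keep}

        history.append(skill)

    return skill, buffer, history


if __name__ == "__main__":
    final_skill, final_buffer, _ = run_accel_sketch()
    print("final skill:", round(final_skill, 2))
    print("buffer (level -> regret score):", final_buffer)
```

The point of the toy is the control flow: new levels start simple, and harder levels only enter the buffer as edits of levels the student was already replaying, which is how complexity is meant to compound. A real setup would replace the scalar student with an RL agent and the proxy with an actual regret estimate.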
Links:
TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yann...
LinkedIn: https://www.linkedin.com/in/ykilcher
BiliBili: https://space.bilibili.com/2017636191
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannick...
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n