Spurious normativity enhances learning of compliance and enforcement behavior in artificial agents

2 years ago

#deepmind #rl #society

This is an in-depth paper review, followed by an interview with the paper's authors!
Society is ruled by norms, and most of these norms are very useful, such as washing your hands before cooking. However, there are also plenty of social norms that are essentially arbitrary, such as which hairstyles are acceptable or which words are rude. These are called "silly rules". This paper uses multi-agent reinforcement learning to investigate why such silly rules exist. The results point to a plausible mechanism: silly rules drastically speed up the agents' acquisition of the skill of enforcing rules. That skill generalizes well, so a society with silly rules is better at enforcing rules in general and adapts faster when genuinely useful norms come along.

OUTLINE:
0:00 - Intro
3:00 - Paper Overview
5:20 - Why are some social norms arbitrary?
11:50 - Reinforcement learning environment setup
20:00 - What happens if we introduce a "silly" rule?
25:00 - Experimental Results: how silly rules help society
30:10 - Isolated probing experiments
34:30 - Discussion of the results
37:30 - Start of Interview
39:30 - Where does the research idea come from?
44:00 - What is the purpose behind this research?
49:20 - Short recap of the mechanics of the environment
53:00 - How much does such a closed system tell us about the real world?
56:00 - What do the results tell us about silly rules?
1:01:00 - What are these agents really learning?
1:08:00 - How many silly rules are optimal?
1:11:30 - Why do you have separate weights for each agent?
1:13:45 - What features could be added next?
1:16:00 - How sensitive is the system to hyperparameters?
1:17:20 - How to avoid confirmation bias?
1:23:15 - How does this play into progress towards AGI?
1:29:30 - Can we make real-world recommendations based on this?
1:32:50 - Where do we go from here?

Paper: https://www.pnas.org/doi/10.1073/pnas...
Blog: https://deepmind.com/research/publica...

Abstract:
The fact that humans enforce and comply with norms is an important reason why humans enjoy higher levels of cooperation and welfare than other animals. Some norms are relatively easy to explain; they may prohibit obviously harmful or uncooperative actions. But many norms are not easy to explain. For example, most cultures prohibit eating certain kinds of foods and almost all societies have rules about what constitutes appropriate clothing, language, and gestures. A computational model focused on learning shows that apparently pointless rules can have an indirect effect on welfare. They can help agents learn how to enforce and comply with norms in general, improving the group’s ability to enforce norms that have a direct effect on welfare.

Authors: Raphael Köster, Dylan Hadfield-Menell, Richard Everett, Laura Weidinger, Gillian K. Hadfield, Joel Z. Leibo

Links:
TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yann...
LinkedIn: https://www.linkedin.com/in/ykilcher
BiliBili: https://space.bilibili.com/2017636191

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannick...
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
