RWKV: Reinventing RNNs for the Transformer Era (Paper Explained)
#gpt4 #rwkv #transformer
We take a look at RWKV, a highly scalable architecture between Transformers and RNNs.
Fully Connected (June 7th in SF) Promo Link: https://www.fullyconnected.com/?promo=ynnc
OUTLINE:
0:00 - Introduction
1:50 - Fully Connected In-Person Conference in SF June 7th
3:00 - Transformers vs RNNs
8:00 - RWKV: Best of both worlds
12:30 - LSTMs
17:15 - Evolution of RWKV's Linear Attention
30:40 - RWKV's Layer Structure
49:15 - Time-Parallel vs Sequence Mode
53:55 - Experimental Results & Limitations
58:00 - Visualizations
1:01:40 - Conclusion
Paper: https://arxiv.org/abs/2305.13048
Code: https://github.com/BlinkDL/RWKV-LM
Abstract:
Transformers have revolutionized almost all natural language processing (NLP) tasks but suffer from memory and computational complexity that scales quadratically with sequence length. In contrast, recurrent neural networks (RNNs) exhibit linear scaling in memory and computational requirements but struggle to match the same performance as Transformers due to limitations in parallelization and scalability. We propose a novel model architecture, Receptance Weighted Key Value (RWKV), that combines the efficient parallelizable training of Transformers with the efficient inference of RNNs. Our approach leverages a linear attention mechanism and allows us to formulate the model as either a Transformer or an RNN, which parallelizes computations during training and maintains constant computational and memory complexity during inference, leading to the first non-transformer architecture to be scaled to tens of billions of parameters. Our experiments reveal that RWKV performs on par with similarly sized Transformers, suggesting that future work can leverage this architecture to create more efficient models. This work presents a significant step towards reconciling the trade-offs between computational efficiency and model performance in sequence processing tasks.
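The "either a Transformer or an RNN" claim in the abstract can be illustrated with a small sketch. The code below assumes the single-channel form of the WKV operator as given in the paper, with a decay `w`, a current-token bonus `u`, and per-token keys `k` and values `v`. The function names `wkv_direct` and `wkv_recurrent` are made up for this illustration and are not taken from the official code. It leaves out the numerical stabilization that a practical implementation would need, as well as the receptance gate and token shift.

```python
import math

def wkv_direct(k, v, w, u):
    """Attention-like form: every output is an explicit weighted sum
    over all earlier positions, so all outputs are independent of each other."""
    out = []
    for t in range(len(k)):
        # The current token gets the bonus u instead of a decay term.
        num = math.exp(u + k[t]) * v[t]
        den = math.exp(u + k[t])
        for i in range(t):
            # Earlier tokens are down-weighted by their distance to t.
            weight = math.exp(-(t - 1 - i) * w + k[i])
            num += weight * v[i]
            den += weight
        out.append(num / den)
    return out

def wkv_recurrent(k, v, w, u):
    """RNN form: the same outputs, computed from a fixed-size state (a, b)."""
    a, b = 0.0, 0.0  # running numerator and denominator over past tokens
    decay = math.exp(-w)
    out = []
    for k_t, v_t in zip(k, v):
        bonus = math.exp(u + k_t)
        out.append((a + bonus * v_t) / (b + bonus))
        # Fold the current token into the state for the next step.
        a = decay * a + math.exp(k_t) * v_t
        b = decay * b + math.exp(k_t)
    return out

# Illustrative inputs (arbitrary values, not from the paper).
keys = [0.1, -0.3, 0.5, 0.0]
values = [1.0, 2.0, -1.0, 0.5]
direct = wkv_direct(keys, values, w=0.5, u=0.2)
recurrent = wkv_recurrent(keys, values, w=0.5, u=0.2)
print(all(math.isclose(x, y) for x, y in zip(direct, recurrent)))
```

The direct form needs the whole sequence at once, which is what makes training parallelizable. The recurrent form only carries two numbers per channel from step to step, which is where the constant memory cost at inference comes from.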
Authors: Bo Peng, Eric Alcaide, Quentin Anthony, Alon Albalak, Samuel Arcadinho, Huanqi Cao, Xin Cheng, Michael Chung, Matteo Grella, Kranthi Kiran GV, Xuzheng He, Haowen Hou, Przemyslaw Kazienko, Jan Kocon, Jiaming Kong, Bartlomiej Koptyra, Hayden Lau, Krishna Sri Ipsit Mantri, Ferdinand Mom, Atsushi Saito, Xiangru Tang, Bolun Wang, Johan S. Wind, Stanislaw Wozniak, Ruichong Zhang, Zhenyuan Zhang, Qihang Zhao, Peng Zhou, Jian Zhu, Rui-Jie Zhu
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www.linkedin.com/in/ykilcher
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n