- DeBERTa: Decoding-enhanced BERT with Disentangled Attention (Machine Learning Paper Explained) | ykilcher | 3 years ago
- Sparse is Enough in Scaling Transformers (aka Terraformer) | ML Research Paper Explained | ykilcher | 3 years ago
- Pretrained Transformers as Universal Computation Engines (Machine Learning Research Paper Explained) | ykilcher | 3 years ago
- Fastformer: Additive Attention Can Be All You Need (Machine Learning Research Paper Explained) | ykilcher | 3 years ago
- HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning (w/ Author) | ykilcher | 2 years ago
- Decision Transformer: Reinforcement Learning via Sequence Modeling (Research Paper Explained) | ykilcher | 3 years ago
- ALiBi - Train Short, Test Long: Attention with linear biases enables input length extrapolation | ykilcher | 3 years ago
- 🐐 OpenAI Tutorial - Learn Text Completion with OpenAI, ChatGPT, Next.Js, React & TailwindCSS | thecodinggoat | 1 year ago