ChatGPT - Explained!

2 years ago

We are going to talk about the internal workings of ChatGPT and the fundamental concepts it relies on: language models, Transformer neural networks, GPT models, and reinforcement learning.

RESOURCES
[1] ChatGPT blog: https://openai.com/blog/chatgpt/
[2] InstructGPT, the model ChatGPT was modeled after: https://arxiv.org/pdf/2203.02155.pdf
[3] Proximal Policy Optimization (PPO) is the reinforcement learning algorithm that updates the model parameters using a reward signal learned from human rankings, with the goal of making ChatGPT more "safe" and "truthful": https://openai.com/blog/openai-baseli...
[4] A paper showing that reinforcement learning from human feedback actually helps: https://arxiv.org/pdf/2009.01325.pdf
[5] At every timestep, the model generates one subword token. More information on how subword tokens are built with byte-pair encoding (BPE): https://towardsdatascience.com/byte-p...
[6] Basic Concepts in Reinforcement Learning: https://www.baeldung.com/cs/ml-policy...
[7] Why does GPT-3 write nonsensical stuff that sounds legit? https://www.alignmentforum.org/posts/...
[8] What is GPT-3.5? https://beta.openai.com/docs/model-in...
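
To make the subword idea in [5] concrete, here is a minimal sketch of how byte-pair encoding learns its merges. It is a toy under stated assumptions, not the tokenizer ChatGPT actually uses: the corpus, its word counts, and the function names (get_pair_counts, merge_pair, learn_bpe) are all made up for illustration, and it works on characters without the byte-level handling or end-of-word markers that production tokenizers typically add.

```python
from collections import Counter


def get_pair_counts(words):
    """Count adjacent symbol pairs across all words, weighted by word frequency."""
    counts = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            counts[(a, b)] += freq
    return counts


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out = []
        i = 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(corpus, num_merges):
    """Greedily merge the most frequent adjacent pair, num_merges times."""
    # Start with every word split into single characters.
    words = {tuple(word): freq for word, freq in corpus.items()}
    merges = []
    for _ in range(num_merges):
        counts = get_pair_counts(words)
        if not counts:
            break
        # Ties go to the pair that was seen first.
        best = max(counts, key=counts.get)
        merges.append(best)
        words = merge_pair(words, best)
    return merges, words


# Hypothetical toy corpus: word -> how often it appears.
toy_corpus = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges, segmented = learn_bpe(toy_corpus, num_merges=3)
print(merges)
print(segmented)
```

Each merge adds one new subword to the vocabulary, so frequent character sequences end up as single tokens while rare words stay split into smaller pieces.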
