![The Illustrated GPT-2 (Visualizing Transformer Language Models) – Jay Alammar – Visualizing machine learning one concept at a time.](https://jalammar.github.io/images/xlnet/transformer-decoder-intro.png)
The Illustrated GPT-2 (Visualizing Transformer Language Models) – Jay Alammar – Visualizing machine learning one concept at a time.
![Megatron-LM: Training Multi-Billion Parameter Language Models Using GPU Model Parallelism – arXiv Vanity](https://media.arxiv-vanity.com/render-output/6472056/transformer_2.png)
Megatron-LM: Training Multi-Billion Parameter Language Models Using GPU Model Parallelism – arXiv Vanity
![NVIDIA Clocks World's Fastest BERT Training Time and Largest Transformer Based Model, Paving Path For Advanced Conversational AI | NVIDIA Technical Blog](https://developer.nvidia.com/blog/wp-content/uploads/2019/08/Figure-3-Training.jpg)
NVIDIA Clocks World's Fastest BERT Training Time and Largest Transformer Based Model, Paving Path For Advanced Conversational AI | NVIDIA Technical Blog
![The Illustrated GPT-2 (Visualizing Transformer Language Models) – Jay Alammar – Visualizing machine learning one concept at a time.](https://jalammar.github.io/images/gpt2/gpt2-transformer-block-weights-2.png)
The Illustrated GPT-2 (Visualizing Transformer Language Models) – Jay Alammar – Visualizing machine learning one concept at a time.
![OpenAI's GPT-2 (Generative Pre-Trained Transformer-2) : "AI that is too Dangerous to Handle" | Analytics Steps](https://www.analyticssteps.com/backend/media/thumbnail/9428941/4997004_1579092390_9700112_1570345059_Banner_Image.jpg)
OpenAI's GPT-2 (Generative Pre-Trained Transformer-2) : "AI that is too Dangerous to Handle" | Analytics Steps
![Too powerful NLP model (GPT-2). What is Generative Pre-Training | by Edward Ma | Towards Data Science](https://miro.medium.com/proxy/1*jbcwhhB8PEpJRk781rML_g.png)
Too powerful NLP model (GPT-2). What is Generative Pre-Training | by Edward Ma | Towards Data Science
![The Illustrated GPT-2 (Visualizing Transformer Language Models) – Jay Alammar – Visualizing machine learning one concept at a time.](https://jalammar.github.io/images/gpt2/gpt-2-transformer-xl-bert-3.png)
The Illustrated GPT-2 (Visualizing Transformer Language Models) – Jay Alammar – Visualizing machine learning one concept at a time.
![OpenAI's GPT-2 Explained | Visualizing Transformer Language Models | Generative Pre-Training | GPT 3 - YouTube](https://i.ytimg.com/vi/XynJ-gM6aD0/maxresdefault.jpg)
OpenAI's GPT-2 Explained | Visualizing Transformer Language Models | Generative Pre-Training | GPT 3 - YouTube
![The Illustrated GPT-2 (Visualizing Transformer Language Models) – Jay Alammar – Visualizing machine learning one concept at a time.](https://jalammar.github.io/images/gpt2/gpt2-self-attention-scoring-2.png)
The Illustrated GPT-2 (Visualizing Transformer Language Models) – Jay Alammar – Visualizing machine learning one concept at a time.
![The Illustrated GPT-2 (Visualizing Transformer Language Models) – Jay Alammar – Visualizing machine learning one concept at a time.](https://jalammar.github.io/images/gpt2/gpt2-self-attention-example-2.png)
The Illustrated GPT-2 (Visualizing Transformer Language Models) – Jay Alammar – Visualizing machine learning one concept at a time.
![The Illustrated GPT-2 (Visualizing Transformer Language Models) – Jay Alammar – Visualizing machine learning one concept at a time.](https://jalammar.github.io/images/gpt2/openAI-GPT-2-3.png)