Transformer
Neural network architecture that underlies most modern AI language models, serving as the foundation for GPT, BERT, Claude, and Gemini.
The Transformer is a neural network architecture introduced by Google researchers in 2017 in the paper 'Attention Is All You Need'. It uses a self-attention mechanism to process all positions of a sequence in parallel, overcoming the step-by-step processing that limits RNNs and LSTMs on long sequences. It is the foundation of nearly all modern LLMs and many vision models.
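To make the self-attention idea concrete, here is a minimal sketch of single-head scaled dot-product attention in plain Python. The function names (`softmax`, `matmul`, `self_attention`), the toy input, and the identity projection matrices are illustrative assumptions, not taken from any specific model; real implementations use tensor libraries, multiple heads, and learned weights.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (illustrative sketch).

    x: token vectors, shape (seq_len, d_model)
    w_q, w_k, w_v: projection matrices, shape (d_model, d_k)
    Returns (output, weights).
    """
    q = matmul(x, w_q)  # queries
    k = matmul(x, w_k)  # keys
    v = matmul(x, w_v)  # values
    d_k = len(q[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of the keys
    scores = matmul(q, k_t)               # every token scored against every token
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights

# Toy sequence of three 2-dimensional tokens; identity projections are an
# assumption chosen to keep the arithmetic easy to follow.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
output, weights = self_attention(x, identity, identity, identity)
```

Each row of `weights` sums to 1 and describes how strongly one token attends to every token in the sequence. Because every row is computed independently of the others, the whole sequence can be processed at once, unlike an RNN, which must step through the tokens one by one.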
Practical examples
- GPT (OpenAI)
- BERT (Google)
- Claude (Anthropic)
- Vision Transformer (ViT)