Kubnal Bridge


Attention Mechanism

The attention mechanism computes each output as a weighted sum of input representations, where the weights reflect how relevant each input position is to the current output. It was originally introduced to improve machine translation by letting the decoder focus on the relevant source words, and was later generalized to self-attention in transformers, where each position in a sequence attends to the positions of that same sequence.
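The weighted sum described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: it assumes the scaled dot-product scoring used in transformers (the entry itself does not name a scoring function), and the names `softmax`, `attention`, `queries`, `keys`, and `values` are chosen here for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum before exponentiating for numerical stability.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def attention(queries, keys, values):
    # Assumption: scaled dot-product scoring; other scoring functions exist.
    d_k = queries.shape[-1]
    # scores[i, j] measures how relevant input position j is to output i.
    scores = queries @ keys.T / np.sqrt(d_k)
    # Each row of weights is non-negative and sums to 1.
    weights = softmax(scores, axis=-1)
    # Each output row is a weighted sum of the value vectors.
    return weights @ values, weights

# Self-attention: queries, keys, and values all come from the same sequence.
# Real transformers first apply learned projections, omitted here for brevity.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))  # 4 positions, 8 features each (made-up sizes)
output, weights = attention(x, x, x)
```

Here `weights` has one row per output position and one column per input position, so inspecting a row shows which inputs that output drew on most.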

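Because every position can attend directly to every other position, attention lets transformers capture long-range dependencies that RNNs, which must carry information step by step through a recurrent state, struggled with. This makes transformers considerably more effective on long documents, complex reasoning, and multilingual tasks.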
