A technique used in transformer models that limits the attention span of each token to a fixed-size window around it, reducing the cost of self-attention from quadratic to roughly linear in sequence length and making the model more efficient on long inputs.
For example, in a machine translation task, SWA can restrict attention to tokens within a fixed distance of the word currently being translated, rather than attending over the entire input sequence.
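As a minimal sketch of the idea (not an optimized implementation), the window can be expressed as a band mask over the attention scores. The function name `sliding_window_attention`, the use of PyTorch, and the single-head layout are illustrative assumptions, not taken from any particular model:

```python
import torch

def sliding_window_attention(q, k, v, window_size):
    # q, k, v: (batch, seq_len, d_model); single attention head for simplicity
    d_model = q.size(-1)
    seq_len = q.size(1)

    # Standard scaled dot-product scores
    scores = q @ k.transpose(-2, -1) / (d_model ** 0.5)

    # Band mask: token i may only attend to tokens j with |i - j| <= window_size
    idx = torch.arange(seq_len, device=q.device)
    band = (idx[None, :] - idx[:, None]).abs() <= window_size
    scores = scores.masked_fill(~band, float("-inf"))

    weights = torch.softmax(scores, dim=-1)
    return weights @ v

# Illustrative usage: each of the 16 tokens attends only to its 4 nearest
# neighbors on either side.
q = k = v = torch.randn(1, 16, 32)
out = sliding_window_attention(q, k, v, window_size=4)
```

Note that this sketch still materializes the full score matrix for clarity; the efficiency gain in practice comes from computing only the scores inside the band, so that the cost scales with sequence length times window size rather than sequence length squared.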