Attention is not all you need: huge inductive biases in self-attention based models
Mar 21, 2021
Interesting. I was getting tired of the “XXX is all you need” paper titles.
Raymond de Lacaze shared a link to the Montreal.AI group:
Attention Is Not All You Need: Google & EPFL Study Reveals Huge Inductive Biases in Self-Attention Architectures
“The 2017 paper Attention is All You Need introduced transformer architectures based on attention mechanisms, marking one of the biggest machine learning (ML) breakthroughs ever. A recent study proposes a new way to study self-attention, its biases, and the problem of rank collapse.
Attention-based architectures have proven effective for improving ML applications in natural language processing (NLP), speech recognition, and most recently in computer vision. Research aimed at understanding the inner workings of transformers and attention in general, however, has been limited.
In the paper Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth, a research team from Google and EPFL (École polytechnique fédérale de Lausanne) proposes a novel approach that sheds light on the operation and inductive biases of self-attention networks (SANs) and finds that pure attention decays in rank doubly exponentially with respect to depth.”
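The core claim, that attention-only networks (no skip connections, no MLPs) converge toward a rank-1, "token-uniform" output, is easy to poke at numerically. Below is a minimal sketch, not the authors' code: random single-head softmax attention layers stacked with no residual path, measuring a Frobenius-norm variant of the paper's rank-1 residual at each depth. The dimensions, depth, and weight scales are arbitrary illustrative choices.

```python
# Minimal sketch (assumptions: random Gaussian weights, single head, Frobenius residual).
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention_only_layer(X, d_k=16):
    """One single-head softmax self-attention layer: no skip connection, no MLP."""
    n, d = X.shape
    Wq = rng.normal(size=(d, d_k)) / np.sqrt(d)  # hypothetical random projections
    Wk = rng.normal(size=(d, d_k)) / np.sqrt(d)
    Wv = rng.normal(size=(d, d)) / np.sqrt(d)
    A = softmax((X @ Wq) @ (X @ Wk).T / np.sqrt(d_k), axis=-1)
    return A @ X @ Wv

def rank1_residual(X):
    """Frobenius distance from X to the nearest rank-1 matrix of the form 1 x^T
    (the row mean minimizes it), i.e. how far the tokens are from being identical."""
    x_bar = X.mean(axis=0, keepdims=True)
    return np.linalg.norm(X - np.ones((X.shape[0], 1)) @ x_bar)

X = rng.normal(size=(32, 64))  # 32 tokens, 64-dim embeddings (arbitrary)
for depth in range(1, 9):
    X = attention_only_layer(X)
    rel = rank1_residual(X) / np.linalg.norm(X)
    print(f"depth {depth}: relative rank-1 residual = {rel:.3e}")
```

The paper argues that skip connections (and, to a lesser extent, the MLP blocks) are what counteract this collapse; adding `+ X` to the layer output in the sketch is a quick way to check that for yourself.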