Today is the sixth anniversary of the publication of the Transformer paper "Attention is All You Need"!
Interesting facts:
⭐️ The Transformer did not invent the attention mechanism, but it took it to the extreme. The first paper on attention mechanisms was published three years earlier, in 2014, under a less catchy title: "Neural Machine Translation by Jointly Learning to Align and Translate," from Yoshua Bengio's lab. That paper combined RNNs with "context vectors" (i.e., attention). Many people may never have heard of it, but it is one of the most important milestones in natural language processing and has been cited 29,000 times (compared to the Transformer's 77,000 citations).
⭐️ Neither the Transformer paper nor the original attention paper talked about general-purpose sequence computers. Instead, both were conceived to solve one narrow, specific problem: machine translation. Surprisingly, the path to AGI (Artificial General Intelligence) can be traced back to the humble Google Translate.
⭐️ The Transformer was published at NeurIPS 2017, the top global AI conference. Yet it did not even receive an oral presentation, let alone an award. The three best papers of that year's NeurIPS have been cited a combined 529 times to date.
Source:
https://twitter.com/drjimfan/status/1668287791200108544?s=46&t=J5tuuFL7Z3qsWetu4lBIXg