https://arxiv.org/abs/1706.03762 Attention Is All You NeedThe dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a newarxiv.org Abstract 자연어처리의 혁신이라 불리고, 끝판왕이라 불리며 저에게 딥러닝에 관심을 갖게 해준 주인공을 드디어 정리하게 되었습니다...