Self-attention takes the information of the whole sequence into account [self-attention is the key building block of the Transformer].
It handles sequence-to-sequence problems by considering the surrounding context.
Example: in "I saw a saw", the first "saw" should be output as a verb, the second as a noun.
How is the relevance between two vectors computed? [the attention score]
Each of the two input vectors is multiplied by its own matrix: a^1 times W^q gives the query q^1, and a^i times W^k gives the key k^i; their dot product is the attention score α(1,i).
Q = query, K = key.
When processing a^1, its relevance to itself is computed as well [α(1,1)].
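A minimal NumPy sketch of this score computation, assuming illustrative dimensions and random weights (names such as W_q, W_k, d_k are hypothetical, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_in, d_k = 4, 4, 3               # e.g. the four tokens of "I saw a saw"
a = rng.standard_normal((seq_len, d_in))   # input vectors a^1..a^4
W_q = rng.standard_normal((d_in, d_k))     # learned query matrix (assumed shapes)
W_k = rng.standard_normal((d_in, d_k))     # learned key matrix

q1 = a[0] @ W_q              # query for a^1
K = a @ W_k                  # keys for every position, a^1 included
alpha_1 = K @ q1             # dot products: alpha(1,1) .. alpha(1,4)
alpha_1 = np.exp(alpha_1 - alpha_1.max())
alpha_1 /= alpha_1.sum()     # softmax-normalized attention weights
print(alpha_1)               # alpha(1,1) shows a^1 attending to itself too
```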
Only the parameters of three matrices (W^q, W^k, W^v) need to be learned.
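Putting it together, a sketch of a single-head self-attention layer whose only learnable parameters are these three matrices (the function name and shapes are assumptions; the 1/sqrt(d_k) scaling used in the Transformer paper is omitted here to match the notes):

```python
import numpy as np

def self_attention(a, W_q, W_k, W_v):
    """Single-head self-attention over a (seq_len, d_in) input.

    W_q, W_k, W_v are the layer's only learned parameters.
    """
    Q = a @ W_q                       # queries, one per position
    K = a @ W_k                       # keys
    V = a @ W_v                       # values
    scores = Q @ K.T                  # alpha(i,j) for every pair of positions
    scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights = scores / scores.sum(axis=1, keepdims=True)   # row-wise softmax
    return weights @ V                # outputs b^1..b^n: weighted sums of values
```

Because every output row is a weighted sum over all positions, each b^i sees the entire sequence at once, which is exactly the "whole-sequence context" property described above.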