[1] A Weighted Moving Average (WMA) is a weighted average taken over a sliding window on a scalar field; mathematically, it is equivalent to a convolution.[1]
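As a quick illustration, here is a minimal sketch (assuming NumPy; the signal and window weights are made up for the example) showing that a WMA is just a convolution whose kernel is the window of weights:

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])  # scalar signal
w = np.array([0.25, 0.5, 0.25])            # window weights, sum to 1

# np.convolve flips the kernel, but with a symmetric w this is
# exactly the weighted moving average of x under window w.
wma = np.convolve(x, w, mode="valid")
print(wma)  # [2.25  4.5   9.  ]
```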
[2] A Kernel Smoother is a special case of WMA: the weights are determined by a kernel function, so points that are closer to each other receive higher weights.[2]
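Below is a minimal sketch of a kernel smoother in the Nadaraya-Watson style, assuming NumPy; the Gaussian kernel and the `bandwidth` parameter are illustrative choices, not from the cited sources. Each output is a weighted average of all observations, with weights that decay with distance:

```python
import numpy as np

def kernel_smooth(x, y, bandwidth=1.0):
    """Smooth values y observed at positions x (Nadaraya-Watson form)."""
    d = x[:, None] - x[None, :]              # pairwise distances
    k = np.exp(-0.5 * (d / bandwidth) ** 2)  # Gaussian kernel: near -> large
    w = k / k.sum(axis=1, keepdims=True)     # normalize rows to sum to 1
    return w @ y                             # weighted average per position

x = np.linspace(0.0, 10.0, 50)
y = np.sin(x) + 0.3 * np.random.randn(50)   # noisy observations
y_smooth = kernel_smooth(x, y, bandwidth=0.8)
```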
[3] The self-attention mechanism in the Transformer can be viewed as a kind of Kernel Smoother.[3] Its flexibility comes from the flexibility with which the kernel function encodes distance, allowing it to recombine the original features from a new perspective.
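To make the analogy concrete, here is a minimal sketch of single-head self-attention written in kernel-smoother form, assuming NumPy; the weight matrices, shapes, and variable names are illustrative. The softmax of scaled query-key dot products acts as a learned, row-normalized kernel, and the output is the resulting weighted average of the values:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # stabilize before exp
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d = Q.shape[-1]
    # The learned "kernel": query-key similarity, normalized per row
    # so each output token is a convex combination of the values.
    kernel = softmax(Q @ K.T / np.sqrt(d))
    return kernel @ V                       # kernel-smoothed values

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                 # 5 tokens, dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)         # shape (5, 8)
```

Compared with the kernel smoother above, the difference is that the "distance" here is not a fixed function of position: it is computed from learned projections of the content itself, which is the source of the flexibility noted above.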