Rotary Position Embedding in Large Language Models

Published: 2023-12-19 10:48:37. Author: 张博的博客
Paper: RoFormer: Enhanced Transformer with Rotary Position Embedding
 
Let's first look at the description given in the Hugging Face documentation:
  https://hf-mirror.com/docs/transformers/model_doc/roformer
 RoPE comes with valuable properties such as flexibility of being expand to any sequence lengths, decaying inter-token dependency with increasing relative distances, and capability of equipping the linear self-attention with relative position encoding.
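Properties: the position encoding can be extended to any sequence length; the dependency between tokens decays as their relative distance grows; and linear self-attention can be equipped with relative position encoding.
The rest of that page is usage documentation and not very informative, so let's go straight to the paper.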
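
Before turning to the paper, the relative-position property mentioned above can be checked numerically. The following is only a minimal sketch, assuming the interleaved-pair form of the rotation and the frequency schedule base^(-2i/d) with base 10000 as described in the RoFormer paper. The names `rope` and `dot` and the sample vectors are made up for illustration and are not taken from the Hugging Face code.

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate each pair (vec[2i], vec[2i+1]) by the angle pos * base**(-2i/d).

    Illustrative helper, not a library API.
    """
    d = len(vec)
    assert d % 2 == 0, "dimension must be even"
    out = []
    for i in range(d // 2):
        theta = pos * base ** (-2.0 * i / d)
        x, y = vec[2 * i], vec[2 * i + 1]
        # Standard 2D rotation of the pair (x, y) by theta.
        out.append(x * math.cos(theta) - y * math.sin(theta))
        out.append(x * math.sin(theta) + y * math.cos(theta))
    return out

def dot(a, b):
    """Plain inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

# Arbitrary example query and key vectors (hypothetical values).
q = [0.3, -1.2, 0.7, 0.5]
k = [1.1, 0.4, -0.6, 0.9]

# The same relative offset (3) at two different absolute positions.
score_a = dot(rope(q, 5), rope(k, 2))
score_b = dot(rope(q, 105), rope(k, 102))

print(score_a, score_b)
```

The two printed scores should agree up to floating-point error, because the inner product of two rotated vectors depends only on the difference of their positions. Since each step is a pure rotation, the vector norms are also unchanged.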
 
$E=m\times c^2$