Overview
Notes on the various positional encodings used in MEANTIME.
Notation
- \(U\): the set of users;
- \(V\): the set of items;
- \(V^u = [v_1^u, \ldots, v_k^u, \ldots, v_{|V^u|}^u]\): the interaction history of user \(u\);
- \(T^u = [t_1^u, \ldots, t_k^u, \ldots, t_{|T^u|}^u]\): the timestamps of those interactions;
- \(N\): the sequence length actually used by the model.
MEANTIME
Absolute
- Day-embedding: map each day to a row of \(M^D \in \mathbb{R}^{|D| \times h}\), where \(|D|\) is the number of distinct days appearing in the dataset.
- Pos-embedding: an ordinary learnable positional encoding.
- Con-embedding: like Pos-embedding, but all positions share a single vector; the paper claims this removes positional bias (I don't fully understand this).
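A minimal numpy sketch of the three absolute embedding tables, just to make the shapes concrete. The sizes (`h`, `N`, `num_days`) are made-up, and the tables are random here; in the real model they are learned end to end:

```python
import numpy as np

rng = np.random.default_rng(0)
h, N, num_days = 64, 50, 365  # hidden size, sequence length, |D|

# Day-embedding: one row per possible day, M^D in R^{|D| x h}
M_day = rng.normal(size=(num_days, h))
# Pos-embedding: one learnable vector per position
M_pos = rng.normal(size=(N, h))
# Con-embedding: a single vector shared by every position
M_con = rng.normal(size=(1, h))

days = rng.integers(0, num_days, size=N)      # day index of each interaction
day_emb = M_day[days]                          # (N, h), looked up by day
pos_emb = M_pos                                # (N, h), looked up by position
con_emb = np.broadcast_to(M_con, (N, h))       # (N, h), same vector repeated
```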
Relative
- Let
\[d_{ij} = (t_i - t_j) / \tau\]
denote the time interval between positions \(i\) and \(j\), where \(\tau\) is a hand-tuned hyperparameter.
- \(E^{sin} \in \mathbb{R}^{N \times N \times h}\), with entries
\[e_{i, j, 2k} = \sin(d_{ij} / f^{2k / h}), \quad e_{i, j, 2k + 1} = \cos(d_{ij} / f^{2k / h}),\]
where \(f\) is a tunable parameter.
- \(E^{Exp} \in \mathbb{R}^{N \times N \times h}\), with entries
\[e_{i, j, k} = \exp(-|d_{ij}| / f^{k / h}).\]
- \(E^{Log} \in \mathbb{R}^{N \times N \times h}\), with entries
\[e_{i, j, k} = \log(1 + |d_{ij}| / f^{k / h}).\]
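The three relative encodings above can be computed directly from the timestamp vector. A small numpy sketch (the defaults for `tau`, `f`, and `h` are illustrative, not the paper's settings; `h` is assumed even so the sin/cos slots interleave cleanly):

```python
import numpy as np

def relative_encodings(t, tau=1.0, f=100.0, h=8):
    """Return E^sin, E^Exp, E^Log of shape (N, N, h) from timestamps t."""
    t = np.asarray(t, dtype=float)
    d = (t[:, None] - t[None, :]) / tau          # d_ij, shape (N, N)
    N = len(t)

    # E^sin: sin on even channels, cos on odd, both scaled by f^{2k/h}
    E_sin = np.empty((N, N, h))
    two_k = np.arange(0, h, 2)                   # 2k = 0, 2, 4, ...
    E_sin[:, :, 0::2] = np.sin(d[:, :, None] / f ** (two_k / h))
    E_sin[:, :, 1::2] = np.cos(d[:, :, None] / f ** (two_k / h))

    # E^Exp: exponential decay in |d_ij|
    k = np.arange(h)
    E_exp = np.exp(-np.abs(d)[:, :, None] / f ** (k / h))

    # E^Log: logarithmic growth in |d_ij|
    E_log = np.log1p(np.abs(d)[:, :, None] / f ** (k / h))
    return E_sin, E_exp, E_log
```

On the diagonal \(d_{ii} = 0\), so \(E^{Exp}\) is 1 everywhere, \(E^{Log}\) is 0, and \(E^{sin}\) alternates 0/1 — a quick sanity check on any implementation.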
Injection scheme
- The absolute encodings are injected in manner (c), the relative encodings in manner (b) (the labels refer to a figure in the paper):
Code
[official]