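Disclaimer: the original paper is given in the title. In case of any infringement, please contact the author and this post will be taken down.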
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:5331-5340, 2019