Advantage-Weighted

off-policy RL | Advantage-Weighted Regression (AWR):组合先前策略得到新 base policy

Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning 论文题目:Advantage-Weighted Regression: Simple and Scalable Off-Polic ......
共1篇  :1/1页 首页上一页1下一页尾页