Replay

Experience Replay with Likelihood-free Importance Weights

![](https://img2023.cnblogs.com/blog/1428973/202308/1428973-20230813231501149-700899538.png) **Published:** 2020 **Key points:** This paper proposes the LFIW algorithm, which uses likelihood as experienc ......

Likelihood-free Experience Likelihood Importance Weights | Updated: 2023-08-13

Experience Replay Optimization

![](https://img2023.cnblogs.com/blog/1428973/202307/1428973-20230731085957589-2046683860.png) **Published:** 2019 (IJCAI 2019) **Key points:** This paper proposes experience rep ......

Optimization Experience Replay | Updated: 2023-07-31

The importance of experience replay database composition in deep reinforcement learning

![](https://img2023.cnblogs.com/blog/1428973/202307/1428973-20230727110633815-1407402877.png) **Published:** 2015 (Deep Reinforcement Learning Workshop, NIPS ......

reinforcement composition importance experience database | Updated: 2023-07-27

Selective Experience Replay for Lifelong Learning

![](https://img2023.cnblogs.com/blog/1428973/202307/1428973-20230725234343269-1373726308.png) **Published:** 2018 (AAAI 2018) **Key points:** This paper aims to address the forgetting that occurs when reinforcement learning learns multiple tasks ......

Experience Selective Lifelong Learning Replay | Updated: 2023-07-25

Reverb: A Framework For Experience Replay

![](https://img2023.cnblogs.com/blog/1428973/202307/1428973-20230717102339025-699657308.png) **Published:** 2021 **Key points:** This paper mainly designs a framework for doing experience replay ......

Experience Framework Reverb Replay For | Updated: 2023-07-17

TOPOLOGICAL EXPERIENCE REPLAY

![](https://img2023.cnblogs.com/blog/1428973/202307/1428973-20230713232535617-402383287.png) **Published:** 2022 (ICLR 2022) **Key points:** This paper points out that sampling according to TD error is inefficient ......

TOPOLOGICAL EXPERIENCE REPLAY | Updated: 2023-07-13

Regret Minimization Experience Replay in Off-Policy Reinforcement Learning

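**Published:** 2021 (NeurIPS 2021) **Key points:** Theory shows that samples with a higher hindsight TD error, that are more on-policy, and that have more accurate target Q values should receive higher sampling weights (The theory suggests that data with highe ......

Reinforcement Minimization Experience Off-Policy Learning | Updated: 2023-07-10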

MODEL-AUGMENTED PRIORITIZED EXPERIENCE REPLAY

![](https://img2023.cnblogs.com/blog/1428973/202307/1428973-20230703112126926-921811970.png) **Published:** 2022 (ICLR 2022) **Key points:** This paper argues that Q-networks usually suffer from under- or ......

MODEL-AUGMENTED PRIORITIZED EXPERIENCE AUGMENTED REPLAY | Updated: 2023-07-03

Remember and Forget for Experience Replay

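**Published:** 2019 (ICML 2019) **Key points:** This paper argues that replaying experiences that differ greatly from the current policy is harmful to updates. It then proposes the Remember and Forget Experience Replay (ReF-ER) algorithm, which (1) skips those that differ greatly from the current policy ......

Experience Remember Forget Replay and | Updated: 2023-07-02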
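
The skipping step (1) in the ReF-ER excerpt above can be illustrated with a rough sketch. The excerpt is truncated and does not say how the difference from the current policy is measured, so this sketch assumes one common choice: the ratio between the current policy's probability of the stored action and the probability under the policy that generated it. The function name `near_policy_mask` and the threshold `c_max` are hypothetical, chosen only for illustration.

```python
import numpy as np

def near_policy_mask(pi_prob, mu_prob, c_max=4.0):
    """Flag replayed samples that stay close to the current policy.

    pi_prob: probability of each stored action under the current policy.
    mu_prob: probability of the same action under the behavior policy
             that collected it.
    c_max:   hypothetical threshold; a sample counts as "near-policy"
             when its ratio lies inside [1 / c_max, c_max].
    """
    rho = pi_prob / mu_prob
    return (rho >= 1.0 / c_max) & (rho <= c_max)

# Four replayed samples with made-up probabilities.
pi_prob = np.array([0.50, 0.05, 0.30, 0.90])
mu_prob = np.array([0.40, 0.60, 0.25, 0.10])

mask = near_policy_mask(pi_prob, mu_prob)
# Samples where mask is False would be skipped in the update.
print(mask)
```

With these numbers the second and fourth samples fall outside the assumed ratio band, so only the first and third would contribute to the update.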

LEARNING TO SAMPLE WITH LOCAL AND GLOBAL CONTEXTS FROM EXPERIENCE REPLAY BUFFERS

![](https://img2023.cnblogs.com/blog/1428973/202306/1428973-20230625114456465-1558069206.png) **Published:** 2021 (ICLR 2021) **Key points:** This paper argues that previous experience r ......

EXPERIENCE LEARNING CONTEXTS BUFFERS GLOBAL | Updated: 2023-06-25

Prioritized Sequence Experience Replay

![](https://img2023.cnblogs.com/blog/1428973/202306/1428973-20230623122845476-1483728572.png) **Published:** 2020 **Key points:** This paper proposes Prioritized Sequence Exper ......

Prioritized Experience Sequence Replay | Updated: 2023-06-23

Revisiting Fundamentals of Experience Replay

![](https://img2023.cnblogs.com/blog/1428973/202306/1428973-20230609121441155-1445259850.png) **Published:** 2020 (ICML 2020) **Key points:** This paper studies experience repla ......

Fundamentals Revisiting Experience Replay of | Updated: 2023-06-09

Revisiting Prioritized Experience Replay: A Value Perspective

![](https://img2023.cnblogs.com/blog/1428973/202306/1428973-20230604130820622-309698896.png) **Published:** 2021 **Key points:** This paper argues that Prioritized experience repla ......

Prioritized Perspective Revisiting Experience Replay | Updated: 2023-06-04

Apr 2021-Lucid Dreaming for Experience Replay: Refreshing Past States with the Current Policy

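This paper proposes Lucid Dreaming for Experience Replay (LiDER), a conceptually new framework that allows replayed experiences to be refreshed by leveraging the agent's current policy. ......

Experience Refreshing Dreaming Current Replay | Updated: 2023-06-04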

Feb 2023-Replay Memory as An Empirical MDP: Combining Conservative Estimation with Experience Replay

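The replay memory is treated as an empirical Replay Memory MDP (RM-MDP), and a conservative estimate is obtained by solving this empirical MDP. The MDP is non-stationary and can be updated efficiently through sampling. Value and policy regularizers are designed based on the conservative estimate and combined with experience replay (CEER) to regularize the learning of DQN. ......

Replay Conservative Estimation Experience Empirical | Updated: 2023-05-23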

May 2022-Neighborhood Mixup Experience Replay: Local Convex Interpolation for Improved Sample Efficiency in Continuous Control Tasks

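The paper proposes Neighborhood Mixup Experience Replay (NMER), a geometrically grounded replay buffer that interpolates transitions with their nearest neighbors in state-action space. NMER preserves a locally linear approximation of the transition manifold by mixing transitions only with those that have nearby state-action features. ......

Interpolation Neighborhood Experience Continuous Efficiency | Updated: 2023-05-20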
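
The interpolation described in the NMER excerpt above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation. It assumes Euclidean distance on raw concatenated state-action features (no feature scaling), a single nearest neighbor, and a mixing coefficient drawn from a Beta distribution as in standard mixup. The function name `nmer_sample` and the parameter `alpha` are hypothetical.

```python
import numpy as np

def nmer_sample(states, actions, rewards, next_states, rng, alpha=1.0):
    """Draw one transition and mix it with its nearest state-action neighbor.

    Assumes the buffer holds at least two transitions.
    """
    n = states.shape[0]
    # State-action features used for the neighbor search.
    feats = np.concatenate([states, actions], axis=1)

    i = int(rng.integers(n))                 # index of the sampled transition
    dists = np.linalg.norm(feats - feats[i], axis=1)
    dists[i] = np.inf                        # exclude the sample itself
    j = int(np.argmin(dists))                # nearest neighbor in (s, a) space

    lam = float(rng.beta(alpha, alpha))      # mixup coefficient in [0, 1]

    def mix(x):
        # Convex combination of the sampled transition and its neighbor.
        return lam * x[i] + (1.0 - lam) * x[j]

    return mix(states), mix(actions), mix(rewards), mix(next_states), (i, j, lam)

# Toy buffer with random contents, for illustration only.
rng = np.random.default_rng(0)
states = rng.normal(size=(6, 3))
actions = rng.normal(size=(6, 1))
rewards = rng.normal(size=6)
next_states = rng.normal(size=(6, 3))

s, a, r, s2, (i, j, lam) = nmer_sample(states, actions, rewards, next_states, rng)
```

Because the new transition is a convex combination of two stored neighbors, it lies on the line segment between them, which is the sense in which the mixing keeps a locally linear approximation.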
