Replay

Experience Replay with Likelihood-free Importance Weights

![](https://img2023.cnblogs.com/blog/1428973/202308/1428973-20230813231501149-700899538.png) **Published:** 2020 **Key points:** This paper proposes the LFIW algorithm, which uses likelihood as experience ......
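Since the note is cut off, here is only a minimal sketch of the likelihood-free importance-weighting idea: a binary classifier is trained to tell recent (near-on-policy) samples from older buffer samples, and its output odds approximate the density ratio used to reweight the TD loss. The names (`estimate_weights`, the feature arrays) are hypothetical, not from the paper's code.

```python
import numpy as np

def estimate_weights(on_policy_feats, buffer_feats, lr=0.1, steps=200):
    """Fit a logistic classifier to separate recent samples (label 1)
    from old buffer samples (label 0); the odds d/(1-d) then estimate
    the density ratio used as likelihood-free importance weights."""
    X = np.vstack([on_policy_feats, buffer_feats])
    y = np.concatenate([np.ones(len(on_policy_feats)),
                        np.zeros(len(buffer_feats))])
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        g = p - y                          # gradient of the logistic loss
        w -= lr * X.T @ g / len(y)
        b -= lr * g.mean()
    p_buf = 1.0 / (1.0 + np.exp(-(buffer_feats @ w + b)))
    ratio = p_buf / (1.0 - p_buf + 1e-8)   # density-ratio estimate
    return ratio / ratio.mean()            # self-normalized weights

weights = estimate_weights(np.random.randn(128, 8), np.random.randn(512, 8))
# `weights` would multiply the per-sample TD loss of the Q update.
```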

Experience Replay Optimization

![](https://img2023.cnblogs.com/blog/1428973/202307/1428973-20230731085957589-2046683860.png) **Published:** 2019 (IJCAI 2019) **Key points:** This paper proposes experience replay ......

The importance of experience replay database composition in deep reinforcement learning

![](https://img2023.cnblogs.com/blog/1428973/202307/1428973-20230727110633815-1407402877.png) **Published:** 2015 (Deep Reinforcement Learning Workshop, NIPS ......

Selective Experience Replay for Lifelong Learning

![](https://img2023.cnblogs.com/blog/1428973/202307/1428973-20230725234343269-1373726308.png) **Published:** 2018 (AAAI 2018) **Key points:** This paper aims to address the forgetting that occurs when a reinforcement learning agent learns multiple tasks ......

Reverb: A Framework For Experience Replay

![](https://img2023.cnblogs.com/blog/1428973/202307/1428973-20230717102339025-699657308.png) **Published:** 2021 **Key points:** This paper mainly designs a framework for experience replay ......
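The excerpt stops before the details, so below is only a minimal usage sketch of the open-source `dm-reverb` package (a server with one prioritized table, plus client-side insert and sample). The argument names are from memory of the library and should be checked against its documentation.

```python
import numpy as np
import reverb

# One prioritized replay table with FIFO eviction.
server = reverb.Server(tables=[
    reverb.Table(
        name='replay',
        sampler=reverb.selectors.Prioritized(priority_exponent=0.6),
        remover=reverb.selectors.Fifo(),
        max_size=1_000_000,
        rate_limiter=reverb.rate_limiters.MinSize(100),
    ),
])

client = reverb.Client(f'localhost:{server.port}')
obs, action, reward = np.zeros(4), np.int64(1), np.float64(0.5)
client.insert([obs, action, reward], priorities={'replay': 1.0})
for sample in client.sample('replay', num_samples=1):
    print(sample)  # contains the stored data plus key/priority info
```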

TOPOLOGICAL EXPERIENCE REPLAY

![](https://img2023.cnblogs.com/blog/1428973/202307/1428973-20230713232535617-402383287.png) **Published:** 2022 (ICLR 2022) **Key points:** This paper points out that sampling by TD error is inefficient ......
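The summary is truncated before the method; as a rough paraphrase of the topological idea (my reconstruction, not the paper's exact algorithm), transitions can be organized as a graph and value backups performed in reverse breadth-first order from terminal states, so updates follow the MDP's structure instead of TD-error magnitude:

```python
from collections import defaultdict, deque

def reverse_sweep(transitions, gamma=0.99):
    """One tabular backup sweep in reverse BFS order from terminal
    states; `transitions` is a list of (s, a, r, s2, done) tuples."""
    preds = defaultdict(list)                  # s2 -> incoming transitions
    for s, a, r, s2, done in transitions:
        preds[s2].append((s, a, r, done))
    Q, V = {}, defaultdict(float)
    frontier = deque(s2 for _, _, _, s2, done in transitions if done)
    visited = set()
    while frontier:
        s2 = frontier.popleft()
        if s2 in visited:
            continue
        visited.add(s2)
        for s, a, r, done in preds[s2]:
            Q[(s, a)] = r if done else r + gamma * V[s2]
            V[s] = max(V[s], Q[(s, a)])
            frontier.append(s)                 # keep propagating backwards
    return Q
```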

Regret Minimization Experience Replay in Off-Policy Reinforcement Learning

**Published:** 2021 (NeurIPS 2021) **Key points:** The theory shows that samples with a higher hindsight TD error, that are more on-policy, and that have more accurate target Q values should get higher sampling weights (The theory suggests that data with highe ......
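A schematic of that weighting rule (the exact functional forms are in the paper; here the three factors are placeholder per-sample scores combined multiplicatively):

```python
import numpy as np

def replay_probs(hindsight_td_error, on_policyness, target_accuracy):
    """Samples with larger hindsight TD error, higher on-policyness and
    more accurate target Q values get proportionally larger weights."""
    w = np.abs(hindsight_td_error) * on_policyness * target_accuracy
    return w / w.sum()

probs = replay_probs(
    hindsight_td_error=np.array([0.5, 2.0, 0.1]),
    on_policyness=np.array([0.9, 0.3, 0.8]),    # e.g. clipped pi/mu ratio
    target_accuracy=np.array([0.7, 0.9, 0.5]),
)
batch_idx = np.random.choice(len(probs), size=2, p=probs)
```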

MODEL-AUGMENTED PRIORITIZED EXPERIENCE REPLAY

![](https://img2023.cnblogs.com/blog/1428973/202307/1428973-20230703112126926-921811970.png) **Published:** 2022 (ICLR 2022) **Key points:** This paper argues that Q networks usually suffer from under- or ......

Remember and Forget for Experience Replay

**Published:** 2019 (ICML 2019) **Key points:** This paper argues that replayed experiences that differ greatly from the current policy are harmful to the update. It then proposes the Remember and Forget Experience Replay (ReF-ER) algorithm, which (1) skips experiences that differ greatly from the current policy ......
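A sketch of rule (1), assuming the published form of ReF-ER: a transition counts as "far-policy" when the importance ratio rho = pi(a|s)/mu(a|s) leaves a trust interval [1/c, c], and its gradient contribution is skipped:

```python
import numpy as np

def far_policy_mask(pi_prob, mu_prob, c=4.0):
    """True where a replayed transition should be skipped because the
    current policy pi deviates too much from the behaviour policy mu
    that generated it."""
    rho = pi_prob / np.maximum(mu_prob, 1e-12)
    return (rho > c) | (rho < 1.0 / c)

pi_prob = np.array([0.30, 0.02, 0.25])    # current policy probabilities
mu_prob = np.array([0.28, 0.40, 0.24])    # stored behaviour probabilities
mask = far_policy_mask(pi_prob, mu_prob)  # -> [False, True, False]
# Per-sample losses are zeroed ("forgotten") where mask is True.
```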

LEARNING TO SAMPLE WITH LOCAL AND GLOBAL CONTEXTS FROM EXPERIENCE REPLAY BUFFERS

![](https://img2023.cnblogs.com/blog/1428973/202306/1428973-20230625114456465-1558069206.png) **Published:** 2021 (ICLR 2021) **Key points:** This paper argues that previous experience r ......

Prioritized Sequence Experience Replay

![](https://img2023.cnblogs.com/blog/1428973/202306/1428973-20230623122845476-1483728572.png) **Published:** 2020 **Key points:** This paper proposes Prioritized Sequence Experience Replay ......
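The note is cut off, but from the title and my recollection of the paper, PSER extends PER by propagating a transition's new priority backwards along its trajectory with geometric decay, never lowering an existing priority. The decay constant and window below are assumptions:

```python
import numpy as np

def propagate_priority(priorities, t, new_p, rho=0.65, window=5):
    """Give transition t priority new_p and boost the transitions that
    preceded it in the same episode with decayed values."""
    priorities[t] = new_p
    for k in range(1, min(window, t) + 1):
        priorities[t - k] = max(priorities[t - k], new_p * rho ** k)
    return priorities

p = propagate_priority(np.zeros(6), t=5, new_p=1.0)
# earlier steps now hold 0.65, 0.42, 0.27, ... instead of 0
```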

Revisiting Fundamentals of Experience Replay

![](https://img2023.cnblogs.com/blog/1428973/202306/1428973-20230609121441155-1445259850.png) **Published:** 2020 (ICML 2020) **Key points:** This paper studies experience replay ......

Revisiting Prioritized Experience Replay: A Value Perspective

![](https://img2023.cnblogs.com/blog/1428973/202306/1428973-20230604130820622-309698896.png) **Published:** 2021 **Key points:** This paper argues that prioritized experience replay ......

Apr 2021-Lucid Dreaming for Experience Replay: Refreshing Past States with the Current Policy

This paper proposes Lucid Dreaming for Experience Replay (LiDER), a conceptually new framework that allows replayed experiences to be refreshed by exploiting the agent's current policy. ......
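A conceptual sketch of the refreshing loop, assuming a simulator that can be reset to a stored state (`env.restore` and the buffer methods are hypothetical names):

```python
def refresh(buffer, env, policy, n_steps=50):
    """Re-run the *current* policy from a previously visited state; keep
    the fresh rollout for replay only if it beats the stored return."""
    state, old_return = buffer.sample_state_with_return()
    env.restore(state)                        # hypothetical simulator reset
    obs, traj, total = state, [], 0.0
    for _ in range(n_steps):
        action = policy(obs)
        next_obs, reward, done, _ = env.step(action)
        traj.append((obs, action, reward, next_obs, done))
        total += reward
        obs = next_obs
        if done:
            break
    if total > old_return:                    # only keep improved experience
        buffer.add_trajectory(traj)
```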

Feb 2023-Replay Memory as An Empirical MDP: Combining Conservative Estimation with Experience Replay

The replay memory is treated as an empirical MDP, the replay-memory MDP (RM-MDP), and a conservative estimate is obtained by solving this empirical MDP. The MDP is non-stationary and can be updated efficiently by sampling. Value and policy regularizers are designed on top of the conservative estimate and combined with experience replay (CEER) to regularize the learning of DQN. ......
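A tabular sketch of the empirical-MDP part (discrete states and actions; counts over replayed tuples give transition and reward estimates, and value iteration over them yields the estimate that the value and policy regularizers would then push the DQN towards). The regularizer itself is omitted:

```python
from collections import defaultdict

def empirical_mdp_q(transitions, gamma=0.99, iters=100):
    """Build RM-MDP statistics from (s, a, r, s2, done) tuples sampled
    from the replay memory and solve them by value iteration."""
    counts = defaultdict(lambda: defaultdict(int))  # (s,a) -> {(s2,done): n}
    rew_sum = defaultdict(float)
    for s, a, r, s2, done in transitions:
        counts[(s, a)][(s2, done)] += 1
        rew_sum[(s, a)] += r
    Q, V = {}, defaultdict(float)
    for _ in range(iters):
        newV = defaultdict(float)
        for (s, a), nxt in counts.items():
            n = sum(nxt.values())
            Q[(s, a)] = rew_sum[(s, a)] / n + gamma * sum(
                (c / n) * V[s2] for (s2, done), c in nxt.items() if not done)
            newV[s] = max(newV[s], Q[(s, a)])
        V = newV
    return Q  # conservative in that it only credits transitions actually seen
```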

May 2022-Neighborhood Mixup Experience Replay: Local Convex Interpolation for Improved Sample Efficiency in Continuous Control Tasks

This paper proposes Neighborhood Mixup Experience Replay (NMER), a geometry-based replay buffer that interpolates a transition with its nearest neighbors in state-action space. NMER maintains a locally linear approximation of the transition manifold by mixing only transitions with nearby state-action features. ......
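A minimal sketch of the neighborhood mixup (nearest neighbor by Euclidean distance on concatenated state-action features; the Beta-distributed mixing coefficient is carried over from standard mixup and is an assumption here):

```python
import numpy as np

def nmer_sample(S, A, R, S2, alpha=0.4):
    """Pick a random transition, find its nearest neighbour in
    state-action space, and return their convex interpolation."""
    i = np.random.randint(len(S))
    feats = np.hstack([S, A])
    d = np.linalg.norm(feats - feats[i], axis=1)
    d[i] = np.inf                        # exclude the anchor itself
    j = int(np.argmin(d))
    lam = np.random.beta(alpha, alpha)   # mixing coefficient in (0, 1)
    mix = lambda x, y: lam * x + (1.0 - lam) * y
    return mix(S[i], S[j]), mix(A[i], A[j]), mix(R[i], R[j]), mix(S2[i], S2[j])

S, A = np.random.randn(100, 3), np.random.randn(100, 1)
R, S2 = np.random.randn(100), np.random.randn(100, 3)
s, a, r, s2 = nmer_sample(S, A, R, S2)   # an interpolated virtual transition
```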

Apr 2022-Explanation-Aware Experience Replay in Rule-Dense Environments

#I. INTRODUCTION Explanation is a key mechanism of human intelligence, and it has the potential to improve the performance of RL agents in complex environments. A central design challenge toward this goal is integrating explanations into a computational representation. Approaches such as encoding the rule set (or part of it) into the agent's observation space can incur severe retraining overhead even under minimal rule-set changes, because the semantics of the rules are given explicitly as input ......

Event Tables for Efficient Experience Replay

#Abstract Stratified Sampling from Event Tables (SSET) partitions the ER buffer into event tables, each of which captures important subsequences of optimal behavior. We prove a theoretical advantage over the traditional monolithic buffer approach, and combine SSET with existing prioritized sampling strategies to further improve learning speed and stability. In the challenging MiniGrid domain, benchmark RL environments, and a high-fidelity racing simulator ......
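A sketch of the stratified-sampling mechanism, simplified to single transitions rather than the paper's subsequences; the event predicates and per-table batch fractions are placeholders:

```python
import random

class EventTableBuffer:
    """A default table plus one table per 'event'; each minibatch draws a
    fixed fraction from every event table (stratified sampling)."""
    def __init__(self, events, fractions):
        self.events = events                  # name -> predicate(transition)
        self.fractions = fractions            # name -> share of each batch
        self.tables = {name: [] for name in events}
        self.default = []

    def add(self, transition):
        hit = False
        for name, pred in self.events.items():
            if pred(transition):
                self.tables[name].append(transition)
                hit = True
        if not hit:
            self.default.append(transition)

    def sample(self, batch_size):
        batch = []
        for name, frac in self.fractions.items():
            table = self.tables[name]
            batch += random.sample(table, min(int(frac * batch_size), len(table)))
        rest = min(batch_size - len(batch), len(self.default))
        return batch + random.sample(self.default, rest)

# e.g. one event table for transitions that reached a positive reward
buf = EventTableBuffer(events={'goal': lambda t: t[2] > 0.0},
                       fractions={'goal': 0.25})
```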

MEMORY REPLAY WITH DATA COMPRESSION FOR CONTINUAL LEARNING -- Reading Notes

Abstract: In this work we propose Memory Replay with Data Compression (MRDC) to reduce the storage cost of old training samples and thereby increase the number that can be kept in the memory buffer. Observing that between the quality and quantity of the compressed data ......
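A minimal sketch of the compression idea using JPEG via Pillow; the library choice and the quality setting are mine, not the paper's:

```python
import io
import numpy as np
from PIL import Image

def compress(obs, quality=75):
    """Store an image observation as JPEG bytes instead of a raw array."""
    buf = io.BytesIO()
    Image.fromarray(obs).save(buf, format='JPEG', quality=quality)
    return buf.getvalue()

def decompress(blob):
    return np.asarray(Image.open(io.BytesIO(blob)))

obs = (np.random.rand(84, 84, 3) * 255).astype(np.uint8)
blob = compress(obs)            # a few KB instead of ~21 KB raw
restored = decompress(blob)     # lossy, but shape (84, 84, 3) is preserved
# Lower quality -> smaller blobs -> more old samples fit in the same buffer.
```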

GCR: Gradient Coreset based Replay Buffer Selection for Continual Learning

Abstract: This paper proposes an innovative replay buffer selection and update strategy, Gradient Coreset Replay (GCR), based on a designed optimization criterion. The method selects and maintains a "coreset" that closely ......
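A heavily simplified sketch of gradient-based coreset selection (greedy matching of the full-data mean gradient; the paper's actual criterion is a weighted optimization, so treat this purely as an illustration):

```python
import numpy as np

def greedy_gradient_coreset(grads, k):
    """Pick k samples whose running mean gradient best tracks the
    full-data mean gradient, adding one sample at a time."""
    target = grads.mean(axis=0)
    chosen, current = [], np.zeros_like(target)
    for _ in range(k):
        best, best_err = None, np.inf
        for i in range(len(grads)):
            if i in chosen:
                continue
            cand = (current * len(chosen) + grads[i]) / (len(chosen) + 1)
            err = np.linalg.norm(cand - target)
            if err < best_err:
                best, best_err = i, err
        chosen.append(best)
        current = (current * (len(chosen) - 1) + grads[best]) / len(chosen)
    return chosen

grads = np.random.randn(50, 10)          # per-sample gradient vectors
coreset = greedy_gradient_coreset(grads, k=5)
```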