Published: 2021
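Key points: This paper presents Reverb, a framework for experience replay. Its main contribution is extending experience replay to distributed, multi-machine settings ("Reverb is designed to work efficiently in distributed configurations with up to thousands of concurrent clients."). The rough idea is that both the data generators (actors) and the data consumers (learners) can run across multiple machines. Stored data is compressed, and retrieval and sampling each have implementations tuned for performance.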
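To make the actor/learner split concrete, here is a minimal single-process sketch in Python. It is not Reverb's API: `ToyReplayTable`, `actor`, and every parameter name are hypothetical, threads stand in for separate machines, and `zlib` plus `pickle` stand in for whatever compression scheme the paper actually uses. It only illustrates the shape of the idea: many writers insert compressed items into a bounded table, and a reader samples from it.

```python
import pickle
import random
import threading
import zlib
from collections import deque


class ToyReplayTable:
    """Hypothetical in-process stand-in for a replay table (not Reverb's API)."""

    def __init__(self, max_size, seed=0):
        # A bounded deque drops the oldest item once max_size is reached.
        self._items = deque(maxlen=max_size)
        # A lock lets several actor threads insert concurrently.
        self._lock = threading.Lock()
        self._rng = random.Random(seed)

    def insert(self, transition):
        # Serialize and compress outside the lock, then store the bytes.
        blob = zlib.compress(pickle.dumps(transition))
        with self._lock:
            self._items.append(blob)

    def sample(self, batch_size):
        # Uniform sampling with replacement; only sampled items are decompressed.
        with self._lock:
            blobs = self._rng.choices(list(self._items), k=batch_size)
        return [pickle.loads(zlib.decompress(b)) for b in blobs]

    def __len__(self):
        with self._lock:
            return len(self._items)


def actor(table, actor_id, num_steps):
    # A data generator: writes dummy transitions into the shared table.
    for step in range(num_steps):
        table.insert({"actor": actor_id, "step": step, "reward": 0.0})


# Four actors write concurrently; the "learner" then samples a batch.
table = ToyReplayTable(max_size=100)
threads = [threading.Thread(target=actor, args=(table, i, 50)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
batch = table.sample(8)
```

In a real distributed setup the table would live in a server process and actors and learners would talk to it over the network; the sketch keeps everything in one process so it stays runnable with the standard library alone.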
Summary: I had not planned to read framework papers, but since this one comes from DeepMind, it seemed worth a look.
Questions: The paper involves a lot of computer systems terminology that I could not fully follow.