Over the past few years, reinforcement learning (RL) research has seen a number of significant advances. This progress matters beyond RL itself, because the algorithms behind it also apply to other domains, such as robotics. Developing these kinds of advances often requires iterating quickly over a design, frequently with no clear direction, and disrupting the structure of established methods. However, most existing RL frameworks do not offer the combination of flexibility and stability that lets researchers iterate effectively on RL methods and explore new research directions whose benefits may not be immediately obvious. Furthermore, reproducing results with existing frameworks is often time-consuming, which can lead to scientific reproducibility issues down the line.

This is where a new TensorFlow-based framework comes in: Dopamine, A Research Framework For Fast Prototyping Of Reinforcement Learning Algorithms, which aims to provide flexibility, stability, and reproducibility for new and experienced RL researchers alike.

What are the principles that Dopamine is based on? Dopamine takes its name from one of the main components of reward-motivated behaviour in the brain, reflecting the strong historical connection between neuroscience and reinforcement learning research, and it aims to enable the kind of speculative research that can drive radical discoveries. To that end, the framework was designed with the following in mind:

1 - Ease of use: Clarity and simplicity are the two key considerations in the design of this framework. The code provided is compact and well documented. This is achieved by focusing on a mature, well-understood benchmark, the Arcade Learning Environment, and on four value-based agents:
- The DQN agent,
- The C51 agent,
- A carefully curated, simplified variant of the Rainbow agent,
- The Implicit Quantile Network (IQN) agent, which was presented at the International Conference on Machine Learning (ICML).
More broadly, Dopamine's design is guided by four principles:
- Easy experimentation: Make it easier for new users to run benchmark experiments.
- Flexible development: Make it easier for new users to try out research ideas.
- Compact and reliable: Provide implementations of a few battle-tested algorithms.
- Reproducible: Facilitate reproducibility of results.
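To make the "value-based agent" family concrete, here is a minimal, self-contained sketch of epsilon-greedy Q-learning on a toy chain environment. The environment and all names in it are invented for illustration; Dopamine's actual agents (DQN, C51, Rainbow, IQN) use deep networks on the Arcade Learning Environment rather than a tabular Q-function, but they share the same core loop of greedy value estimation plus exploration shown here:

```python
import random

# Hypothetical toy environment: a 5-state chain. The agent starts at
# state 0 and earns a reward of 1 only upon reaching the rightmost state.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right

def step(state, action):
    """Return (next_state, reward, done) for the chain environment."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    if nxt == N_STATES - 1:
        return nxt, 1.0, True  # terminal reward at the rightmost state
    return nxt, 0.0, False

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # tabular Q-values
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy action selection, the exploration rule
            # shared by DQN-style value-based agents.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            nxt, reward, done = step(state, action)
            # One-step temporal-difference update toward the bootstrapped target.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q

q = train()
# Read off the greedy policy; it should learn to always move right.
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Dopamine's deep agents replace the table `q` with a neural network and the one-step update with minibatch gradient descent over a replay buffer, but the epsilon-greedy/bootstrapped-target structure is the same.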
More Information: GitHub