Playing Atari with Deep Reinforcement Learning NIPS Deep Learning Workshop 2013. paper
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller
Human-level control through deep reinforcement learning Nature 2015. paper
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg & Demis Hassabis
Deep Reinforcement Learning with Double Q-learning AAAI 2016. paper
Hado van Hasselt, Arthur Guez, David Silver
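A minimal NumPy sketch of the Double DQN target described in the entry above: the online network selects the greedy next action and the target network evaluates it, which reduces the overestimation bias of standard Q-learning. Function and variable names here are illustrative, not from the paper.

```python
import numpy as np

def double_dqn_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """Double DQN target: y = r + gamma * Q_target(s', argmax_a Q_online(s', a))."""
    # Action selection by the online network
    best_action = np.argmax(next_q_online, axis=1)
    # Action evaluation by the target network
    next_value = next_q_target[np.arange(len(best_action)), best_action]
    # Terminal transitions (done == 1) bootstrap to zero
    return reward + gamma * (1.0 - done) * next_value

# Toy batch of 2 transitions, 2 actions
r = np.array([1.0, 0.0])
d = np.array([0.0, 1.0])
q_online = np.array([[0.2, 0.8], [0.5, 0.1]])
q_target = np.array([[0.3, 0.6], [0.4, 0.2]])
y = double_dqn_target(r, d, q_online, q_target, gamma=0.9)
# y[0] = 1.0 + 0.9 * 0.6 = 1.54; y[1] = 0.0 (terminal)
```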
Dueling Network Architectures for Deep Reinforcement Learning ICML 2016. paper
Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas
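The dueling architecture above splits the Q-network into a state-value stream V(s) and an advantage stream A(s, a), recombined as Q(s, a) = V(s) + A(s, a) - mean_a A(s, a). A small NumPy sketch of that aggregation step (names are illustrative):

```python
import numpy as np

def dueling_q(value, advantage):
    """Dueling aggregation (Wang et al., 2016).

    Subtracting the mean advantage makes V and A identifiable:
    the advantages are forced to have zero mean per state."""
    return value[:, None] + advantage - advantage.mean(axis=1, keepdims=True)

# One state, two actions
v = np.array([1.0])
a = np.array([[2.0, 0.0]])
q = dueling_q(v, a)  # mean A = 1 -> Q = [[1+2-1, 1+0-1]] = [[2, 0]]
```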
Deep Recurrent Q-Learning for Partially Observable MDPs AAAI 2015. paper
Matthew Hausknecht, Peter Stone
Prioritized Experience Replay ICLR 2016. paper
Tom Schaul, John Quan, Ioannis Antonoglou, David Silver
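Prioritized experience replay, as in the entry above, samples transitions with probability proportional to p_i^alpha (e.g. p_i = |TD error| + epsilon) and corrects the resulting bias with importance-sampling weights. A minimal proportional-sampling sketch, assuming a flat priority array rather than the paper's sum-tree:

```python
import numpy as np

def sample_prioritized(priorities, batch_size, alpha=0.6, beta=0.4, rng=None):
    """Proportional prioritized sampling (Schaul et al., 2016).

    P(i) is proportional to p_i^alpha; importance-sampling weights
    w_i = (N * P(i))^(-beta), normalized by the max, correct the bias."""
    if rng is None:
        rng = np.random.default_rng(0)
    probs = priorities ** alpha
    probs /= probs.sum()
    idx = rng.choice(len(priorities), size=batch_size, p=probs)
    weights = (len(priorities) * probs[idx]) ** (-beta)
    weights /= weights.max()  # normalize for stability
    return idx, weights

p = np.array([1.0, 2.0, 0.5, 4.0])  # e.g. absolute TD errors
idx, w = sample_prioritized(p, batch_size=2)
```

A production buffer would store priorities in a sum-tree for O(log N) sampling and updates; this flat version shows only the math.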
Asynchronous Methods for Deep Reinforcement Learning ICML 2016. paper
Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu
A Distributional Perspective on Reinforcement Learning ICML 2017. paper
Marc G. Bellemare, Will Dabney, Rémi Munos
Noisy Networks for Exploration ICLR 2018. paper
Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Remi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg
Rainbow: Combining Improvements in Deep Reinforcement Learning AAAI 2018. paper
Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver
Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion NIPS 2018. paper
Jacob Buckman, Danijar Hafner, George Tucker, Eugene Brevdo, Honglak Lee
Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning ICML 2018. paper
Vladimir Feinberg, Alvin Wan, Ion Stoica, Michael I. Jordan, Joseph E. Gonzalez, Sergey Levine
Value Prediction Network NIPS 2017. paper
Junhyuk Oh, Satinder Singh, Honglak Lee
Imagination-Augmented Agents for Deep Reinforcement Learning NIPS2017. paper
Théophane Weber, Sébastien Racanière, David P. Reichert, Lars Buesing, Arthur Guez, Danilo Jimenez Rezende, Adria Puigdomènech Badia, Oriol Vinyals, Nicolas Heess, Yujia Li, Razvan Pascanu, Peter Battaglia, Demis Hassabis, David Silver, Daan Wierstra
Continuous Deep Q-Learning with Model-based Acceleration ICML 2016. paper
Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine
Uncertainty-driven Imagination for Continuous Deep Reinforcement Learning CoRL 2017. paper
Gabriel Kalweit, Joschka Boedecker
Model-Ensemble Trust-Region Policy Optimization ICLR 2018. paper
Thanard Kurutach, Ignasi Clavera, Yan Duan, Aviv Tamar, Pieter Abbeel
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models NIPS 2018. paper
Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine
Dyna, an Integrated Architecture for Learning, Planning, and Reacting ACM 1991. paper
Richard S. Sutton
Learning Continuous Control Policies by Stochastic Value Gradients NIPS 2015. paper
Nicolas Heess, Greg Wayne, David Silver, Timothy Lillicrap, Yuval Tassa, Tom Erez
Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks ICLR 2017. paper
Stefan Depeweg, José Miguel Hernández-Lobato, Finale Doshi-Velez, Steffen Udluft
Asynchronous Methods for Deep Reinforcement Learning ICML 2016. paper
Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu
GA3C: GPU-based A3C for Deep Reinforcement Learning NIPS 2016. paper
Iuri Frosio, Stephen Tyree, Jason Clemons, Jan Kautz
Distributed Prioritized Experience Replay ICLR 2018. paper
Dan Horgan, John Quan, David Budden, Gabriel Barth-Maron, Matteo Hessel, Hado van Hasselt, David Silver
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures ICML 2018. paper
Lasse Espeholt, Hubert Soyer, Remi Munos, Karen Simonyan, Volodymir Mnih, Tom Ward, Yotam Doron, Vlad Firoiu, Tim Harley, Iain Dunning, Shane Legg, Koray Kavukcuoglu
Distributed Distributional Deterministic Policy Gradients ICLR 2018. paper
Gabriel Barth-Maron, Matthew W. Hoffman, David Budden, Will Dabney, Dan Horgan, Dhruva TB, Alistair Muldal, Nicolas Heess, Timothy Lillicrap
Emergence of Locomotion Behaviours in Rich Environments arXiv 2017. paper
Nicolas Heess, Dhruva TB, Srinivasan Sriram, Jay Lemmon, Josh Merel, Greg Wayne, Yuval Tassa, Tom Erez, Ziyu Wang, S. M. Ali Eslami, Martin Riedmiller, David Silver
Recurrent Experience Replay in Distributed Reinforcement Learning ICLR 2019. paper
Steven Kapturowski, Georg Ostrovski, John Quan, Remi Munos, Will Dabney
GPU-Accelerated Robotic Simulation for Distributed Reinforcement Learning CoRL 2018. paper
Jacky Liang, Viktor Makoviychuk, Ankur Handa, Nuttapong Chentanez, Miles Macklin, Dieter Fox
SURREAL: Open-Source Reinforcement Learning Framework and Robot Manipulation Benchmark CoRL 2018. paper
Linxi Fan, Yuke Zhu, Jiren Zhu, Zihua Liu, Orien Zeng, Anchit Gupta, Joan Creus-Costa, Silvio Savarese, Li Fei-Fei