Use `parl.compile` to train the model in parallel. For offline training, or when the dataset is too large to train on a single GPU, we can use parallel computing to accelerate training.
```python
# Set CUDA_VISIBLE_DEVICES to select which GPUs to use for training.
import parl
import paddle.fluid as fluid

learn_program = fluid.Program()
with fluid.program_guard(learn_program):
    # Define your learn program and training loss here.
    pass

# Pass the training loss to parl.compile, which distributes the model
# and data across the visible GPUs.
learn_program = parl.compile(learn_program, loss=training_loss)
```
We provide a demonstration of offline Q-learning with parallel execution, in which the procedures of collecting data and training the model are separated. First we collect data by interacting with the environment and save it to a replay memory file; then we fit and evaluate the Q network on the collected data. Repeating these two steps gradually improves performance.
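The replay memory is the bridge between the two phases: the collection phase appends transitions and writes them to disk, and the training phase loads and samples them. A minimal stand-in sketch using only the standard library (the class name, file format, and method names here are illustrative, not the actual API of this example's `replay_memory.py`):

```python
import pickle
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of (obs, action, reward, next_obs, done) transitions.

    Illustrative sketch only; the repo's replay_memory.py may differ.
    """

    def __init__(self, max_size=10000):
        # deque with maxlen drops the oldest transitions once full.
        self.buffer = deque(maxlen=max_size)

    def append(self, obs, action, reward, next_obs, done):
        self.buffer.append((obs, action, reward, next_obs, done))

    def sample(self, batch_size):
        # Uniform sampling breaks temporal correlation within a batch.
        return random.sample(self.buffer, batch_size)

    def save(self, path):
        # Collection phase: persist transitions for later offline training.
        with open(path, "wb") as f:
            pickle.dump(list(self.buffer), f)

    def load(self, path):
        # Training phase: reload transitions collected earlier.
        with open(path, "rb") as f:
            self.buffer.extend(pickle.load(f))

    def __len__(self):
        return len(self.buffer)
```

Saving to disk is what decouples the two phases, so the expensive training step can run on a multi-GPU machine independently of the data-collection run.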
```shell
# Collect training data
python parallel_run.py --rom rom_files/pong.bin

# Train the model offline with multiple GPUs
python parallel_run.py --rom rom_files/pong.bin --train
```
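The `--train` step repeatedly applies the Q-learning update to batches sampled from the saved transitions. A tabular sketch of that update in plain Python (illustrative only; the example itself fits a neural Q network with `dqn.py`, and the function and parameter names here are hypothetical):

```python
def q_update(q, transitions, n_actions, gamma=0.99, lr=0.1):
    """One offline Q-learning sweep over a batch of saved transitions.

    q: dict mapping (state, action) -> estimated value, a tabular
    stand-in for the Q network. Missing entries default to 0.0.
    """
    for s, a, r, s2, done in transitions:
        # Bellman target: reward plus discounted best next-state value
        # (no bootstrap term on terminal transitions).
        if done:
            target = r
        else:
            target = r + gamma * max(
                q.get((s2, b), 0.0) for b in range(n_actions))
        # Move the estimate a step of size lr toward the target.
        old = q.get((s, a), 0.0)
        q[(s, a)] = old + lr * (target - old)
    return q
```

Because the targets are computed from stored transitions rather than fresh rollouts, the same saved data can be swept many times, which is what makes the multi-GPU offline phase worthwhile.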
PARL is a high-performance, flexible reinforcement learning framework.