A PyTorch implementation of IMPALA: Scalable Distributed
Deep-RL with Importance Weighted Actor-Learner Architectures
by Espeholt, Soyer, Munos et al.
TorchBeast comes in two variants:
MonoBeast and
PolyBeast. While
PolyBeast is more powerful (e.g. allowing training across machines),
it's somewhat harder to install. MonoBeast requires only Python and
PyTorch (we suggest using PyTorch version 1.2 or newer).
For further details, see our paper.
@article{torchbeast2019,
title={{TorchBeast: A PyTorch Platform for Distributed RL}},
author={Heinrich K\"{u}ttler and Nantas Nardelli and Thibaut Lavril and Marco Selvatici and Viswanath Sivakumar and Tim Rockt\"{a}schel and Edward Grefenstette},
year={2019},
journal={arXiv preprint arXiv:1910.03552},
url={https://github.com/facebookresearch/torchbeast},
}
MonoBeast is a pure Python + PyTorch implementation of IMPALA.
To set it up, create a new conda environment and install MonoBeast's
requirements:
$ conda create -n torchbeast
$ conda activate torchbeast
$ conda install pytorch -c pytorch
$ pip install -r requirements.txt
Then run MonoBeast, e.g. on the Pong Atari
environment:
$ python -m torchbeast.monobeast --env PongNoFrameskip-v4
By default, MonoBeast uses only a few actors (each with its own instance
of the environment). Let's change the default settings (try this on a
beefy machine!):
$ python -m torchbeast.monobeast \
--env PongNoFrameskip-v4 \
--num_actors 45 \
--total_steps 30000000 \
--learning_rate 0.0004 \
--epsilon 0.01 \
--entropy_cost 0.01 \
--batch_size 4 \
--unroll_length 80 \
--num_buffers 60 \
--num_threads 4 \
--xpid example
Results are logged to ~/logs/torchbeast/latest and a checkpoint file is
written to ~/logs/torchbeast/latest/model.tar.
Once training has finished, we can test performance on a few episodes:
$ python -m torchbeast.monobeast \
--env PongNoFrameskip-v4 \
--mode test \
--xpid example
MonoBeast is a simple, single-machine version of IMPALA.
Each actor runs in a separate process with its dedicated instance of
the environment and runs the PyTorch model on the CPU to create
actions. The resulting rollout trajectories
(environment-agent interactions) are sent to the learner. In the main
process, the learner consumes these rollouts and uses them to update
the model's weights.
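The actor/learner split described above can be sketched with Python's standard multiprocessing module. This is a simplified illustration of the pattern, not TorchBeast's actual code: the real MonoBeast uses torch.multiprocessing and shared PyTorch tensors, and the names below (actor, learner, UNROLL_LENGTH) are illustrative.

```python
import multiprocessing as mp
import random

UNROLL_LENGTH = 5  # steps per rollout (illustrative)

def actor(actor_id, rollout_queue, num_rollouts):
    """Each actor owns its environment and produces rollouts."""
    for _ in range(num_rollouts):
        # Stand-in for stepping a real environment with the current model:
        # each step records (observation, action, reward).
        rollout = [(random.random(), random.randint(0, 3), 1.0)
                   for _ in range(UNROLL_LENGTH)]
        rollout_queue.put((actor_id, rollout))

def learner(rollout_queue, total_rollouts):
    """The learner consumes rollouts; a real one would update weights here."""
    consumed = 0
    while consumed < total_rollouts:
        actor_id, rollout = rollout_queue.get()
        assert len(rollout) == UNROLL_LENGTH
        consumed += 1
    return consumed

if __name__ == "__main__":
    queue = mp.Queue()
    num_actors, rollouts_per_actor = 3, 4
    procs = [mp.Process(target=actor, args=(i, queue, rollouts_per_actor))
             for i in range(num_actors)]
    for p in procs:
        p.start()
    print(learner(queue, num_actors * rollouts_per_actor))  # 12
    for p in procs:
        p.join()
```

The key design point this sketch preserves is that the learner sits in the main process while every actor runs in its own process, so slow environment stepping never blocks the weight update loop.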
PolyBeast provides a faster and more scalable implementation of
IMPALA.
The easiest way to build and install all of PolyBeast's dependencies
and run it is to use Docker:
$ docker build -t torchbeast .
$ docker run --name torchbeast torchbeast
To run PolyBeast directly on Linux or MacOS, follow the steps below.
On Linux, create a new Conda environment and install PolyBeast's requirements:
$ conda create -n torchbeast python=3.7
$ conda activate torchbeast
$ pip install -r requirements.txt
Install PyTorch either from
source or as per its
website (select Conda).
PolyBeast also requires gRPC and other third-party software, which can
be installed by running:
$ git submodule update --init --recursive
Finally, let's compile the C++ parts of PolyBeast:
$ pip install nest/
$ python setup.py install
On MacOS, create a new Conda environment and install PolyBeast's requirements:
$ conda create -n torchbeast
$ conda activate torchbeast
$ pip install -r requirements.txt
PyTorch can be installed as per its
website (select Conda).
PolyBeast also requires gRPC and other third-party software, which can
be installed by running:
$ git submodule update --init --recursive
Finally, let's compile the C++ parts of PolyBeast:
$ pip install nest/
$ python setup.py install
To start both the environment servers and the learner process, run:
$ python -m torchbeast.polybeast
The environment servers and the learner process can also be started separately:
$ python -m torchbeast.polybeast_env --num_servers 10
Start another terminal and run:
$ python3 -m torchbeast.polybeast_learner
|-----------------| |-----------------| |-----------------|
| ACTOR 1 | | ACTOR 2 | | ACTOR n |
|-------| | |-------| | |-------| |
| | .......| | | .......| . . . | | .......|
| Env |<-.Model.| | Env |<-.Model.| | Env |<-.Model.|
| |->.......| | |->.......| | |->.......|
|-----------------| |-----------------| |-----------------|
^ I ^ I ^ I
| I | I | I Actors
| I rollout | I rollout weights| I send
| I | I /--------/ I rollouts
| I weights| I | I (frames,
| I | I | I actions
| I | v | I etc)
| L=======>|--------------------------------------|<===========J
| |......... LEARNER |
\--------------|..Model.. Consumes rollouts, updates |
Learner |......... model weights |
sends |--------------------------------------|
weights
The system has two main components: actors and a learner.
Actors generate rollouts (tensors from a number of steps of
environment-agent interactions, including environment frames, agent
actions, policy logits, and other data).
The learner consumes that experience, computes a loss and updates the
weights. The new weights are then propagated to the actors.
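The weight propagation in the other direction can be sketched the same way: the learner publishes each updated copy of the weights, and every actor refreshes its local model before generating the next rollout. The class below is a toy stand-in for intuition only; ParameterServer, publish, and fetch are hypothetical names, and the real implementation shares PyTorch tensors between processes rather than copying lists.

```python
class ParameterServer:
    """Toy stand-in for shared model weights with a version counter."""

    def __init__(self, weights):
        self.weights = weights
        self.version = 0

    def publish(self, new_weights):
        # Called by the learner after each gradient update.
        self.weights = new_weights
        self.version += 1

    def fetch(self):
        # Called by each actor before it starts a new rollout.
        return self.weights, self.version

server = ParameterServer([0.0, 0.0])
server.publish([0.1, -0.2])           # learner finished an update
weights, version = server.fetch()     # actor syncs its local model
assert version == 1 and weights == [0.1, -0.2]
```

Because actors may fetch weights that are a few updates stale, the rollouts they produce are slightly off-policy; correcting for that staleness is exactly what IMPALA's importance-weighted (V-trace) learner is for.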
We ran TorchBeast on Atari, using the same hyperparameters and neural
network as in the IMPALA paper. For comparison, we also ran the open
source TensorFlow implementation of IMPALA, using the same environment
preprocessing. The results are equivalent; see our paper for details.
libtorchbeast: C++ library that enables efficient learner-actor
communication via queueing and batching mechanisms. Some functions are
exported to Python using pybind11. Used by PolyBeast only.
nest: C++ library for manipulating complex nested structures, with some
functions exported to Python using pybind11.
tests: Collection of Python tests.
third_party: Collection of third-party dependencies as Git submodules.
Includes gRPC.
torchbeast: Contains monobeast.py as well as polybeast.py,
polybeast_learner.py and polybeast_env.py.
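To illustrate what "nested structures" means in the context of nest, here is a miniature pure-Python analogue of mapping a function over every leaf of a nested container. This is a sketch for intuition only; the actual nest library is implemented in C++ with pybind11 bindings and has its own API.

```python
def nest_map(fn, value):
    """Apply fn to every leaf of an arbitrarily nested dict/list/tuple."""
    if isinstance(value, dict):
        return {k: nest_map(fn, v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return type(value)(nest_map(fn, v) for v in value)
    return fn(value)  # leaf value

# A rollout is naturally a nested structure of tensors; here plain
# numbers stand in for tensors.
rollout = {"frames": [1, 2], "agent": {"action": 3, "logits": (10, 90)}}
doubled = nest_map(lambda x: x * 2, rollout)
# doubled == {"frames": [2, 4], "agent": {"action": 6, "logits": (20, 180)}}
```

Operations like this let the learner batch or move whole rollouts between devices without hand-writing code for every field.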
Both MonoBeast and PolyBeast have flags and hyperparameters. To
describe a few of them:
num_actors: The number of actors (and environment instances).
batch_size: Determines the size of the learner inputs.
unroll_length: Length of a rollout, i.e., the number of environment
steps an actor takes before sending its data to the learner. Together
with batch_size, this determines the shape
[unroll_length, batch_size, ...] of the learner's input tensors.
We would love to have you contribute to TorchBeast or use it for your
research. See the CONTRIBUTING.md file for how to help out.
TorchBeast is released under the Apache 2.0 license.