Welcome to Tianshou!
Tianshou (天授) is a reinforcement learning platform based on pure PyTorch. Unlike existing reinforcement learning libraries, which are mostly based on TensorFlow, have many nested classes, offer unfriendly APIs, or run slowly, Tianshou provides a fast framework and a pythonic API for building deep reinforcement learning agents. The supported interface algorithms are listed below; a minimal construction example follows the list:
Deep Q-Network (DQNPolicy)
Double DQN (DQNPolicy)
Dueling DQN (DQNPolicy)
Branching DQN (BranchingDQNPolicy)
Categorical DQN (C51Policy)
Rainbow DQN (RainbowPolicy)
Quantile Regression DQN (QRDQNPolicy)
Implicit Quantile Network (IQNPolicy)
Fully-parameterized Quantile Function (FQFPolicy)
Policy Gradient (PGPolicy)
Natural Policy Gradient (NPGPolicy)
Advantage Actor-Critic (A2CPolicy)
Trust Region Policy Optimization (TRPOPolicy)
Proximal Policy Optimization (PPOPolicy)
Deep Deterministic Policy Gradient (DDPGPolicy)
Twin Delayed DDPG (TD3Policy)
Soft Actor-Critic (SACPolicy)
Randomized Ensembled Double Q-Learning (REDQPolicy)
Discrete Soft Actor-Critic (DiscreteSACPolicy)
Imitation Learning (ImitationPolicy)
Batch-Constrained deep Q-Learning (BCQPolicy)
Conservative Q-Learning (CQLPolicy)
Twin Delayed DDPG with Behavior Cloning (TD3BCPolicy)
Discrete Batch-Constrained deep Q-Learning (DiscreteBCQPolicy)
Discrete Conservative Q-Learning (DiscreteCQLPolicy)
Critic Regularized Regression (DiscreteCRRPolicy)
Generative Adversarial Imitation Learning (GAILPolicy)
Posterior Sampling Reinforcement Learning (PSRLPolicy)
Intrinsic Curiosity Module (ICMPolicy)
Prioritized Experience Replay (PrioritizedReplayBuffer)
Generalized Advantage Estimator (compute_episodic_return())
Hindsight Experience Replay (HERReplayBuffer)
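All of these algorithms share a common policy interface. As a minimal sketch (assuming CartPole-v1's 4-dimensional observations and 2 discrete actions; the hyperparameters here are illustrative, and keyword arguments may differ slightly across Tianshou versions), a DQN policy can be constructed like this:
import torch
from tianshou.policy import DQNPolicy
from tianshou.utils.net.common import Net

# Build a simple MLP Q-network for 4-dimensional observations
# and 2 discrete actions (e.g. CartPole-v1).
net = Net(state_shape=4, action_shape=2, hidden_sizes=[128, 128])
optim = torch.optim.Adam(net.parameters(), lr=1e-3)

# estimation_step enables n-step return estimation;
# target_update_freq sets the target-network update interval.
policy = DQNPolicy(
    model=net,
    optim=optim,
    discount_factor=0.99,
    estimation_step=3,
    target_update_freq=320,
)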
Other features of Tianshou include:
Elegant framework, using only ~3000 lines of code
State-of-the-art MuJoCo benchmark
Support vectorized environments (synchronous or asynchronous) for all algorithms (see Parallel Sampling and the sketch after this list)
Support the super-fast vectorized environment library EnvPool for all algorithms (see EnvPool Integration)
Support recurrent state representations in actor and critic networks (RNN-style training for POMDPs; see RNN-style Training)
Support any type of environment state/action (e.g. a dict, a self-defined class, …; see User-defined Environment and Different State Representation)
Support customized training processes (see Customize Training Process)
Support n-step return estimation (compute_nstep_return()) and prioritized experience replay (PrioritizedReplayBuffer) for all Q-learning based algorithms; GAE, n-step, and PER are very fast thanks to numba JIT compilation and vectorized numpy operations (see the sketch after this list)
Support both TensorBoard and W&B logging tools
Support multi-GPU training (see Multi-GPU Training)
Comprehensive unit tests, including functional checking, RL pipeline checking, documentation checking, PEP8 code-style checking, and type checking
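For instance, the parallel sampling and prioritized replay features mentioned above compose naturally. The following sketch (assuming Gymnasium, a recent Tianshou version, and the policy object from the previous snippet; buffer sizes and alpha/beta values are illustrative) runs eight CartPole environments in subprocesses and collects transitions into a prioritized replay buffer:
import gymnasium as gym
from tianshou.data import Collector, PrioritizedVectorReplayBuffer
from tianshou.env import SubprocVectorEnv

# Eight worker processes, each running its own environment copy.
train_envs = SubprocVectorEnv(
    [lambda: gym.make("CartPole-v1") for _ in range(8)]
)

# One prioritized buffer shared across the eight workers;
# alpha/beta control how strongly prioritization biases sampling.
buffer = PrioritizedVectorReplayBuffer(
    total_size=20000, buffer_num=8, alpha=0.6, beta=0.4
)

# The collector steps the vectorized envs with the policy and
# stores the resulting transitions in the buffer.
collector = Collector(policy, train_envs, buffer)
collector.collect(n_step=2000)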
The Chinese documentation is available at https://tianshou.readthedocs.io/zh/master/
Installation
Tianshou is currently hosted on PyPI and conda-forge. New releases (and the current state of the master branch) will require Python >= 3.11.
You can simply install Tianshou from PyPI with the following command:
$ pip install tianshou
If you use Anaconda or Miniconda, you can install Tianshou from conda-forge with the following command:
$ conda install tianshou -c conda-forge
You can also install the latest version from GitHub:
$ pip install git+https://github.com/thu-ml/tianshou.git@master --upgrade
After installation, open your Python console and type
import tianshou
print(tianshou.__version__)
If no error occurs, you have successfully installed Tianshou.
Tianshou is still under development; you can also check out the stable-version documentation at tianshou.readthedocs.io/en/stable/.