Welcome to Tianshou!
Tianshou (天授) is a reinforcement learning platform based on pure PyTorch. Unlike existing reinforcement learning libraries, which are mainly based on TensorFlow and often suffer from deeply nested classes, unfriendly APIs, or slow speed, Tianshou provides a fast framework and a Pythonic API for building deep reinforcement learning agents. The supported interface algorithms include (a short usage sketch follows the list):
Deep Q-Network: DQNPolicy
Double DQN: DQNPolicy
Dueling DQN: DQNPolicy
Branching DQN: BranchingDQNPolicy
Categorical DQN: C51Policy
Rainbow DQN: RainbowPolicy
Quantile Regression DQN: QRDQNPolicy
Implicit Quantile Network: IQNPolicy
Fully-parameterized Quantile Function: FQFPolicy
Policy Gradient: PGPolicy
Natural Policy Gradient: NPGPolicy
Advantage Actor-Critic: A2CPolicy
Trust Region Policy Optimization: TRPOPolicy
Proximal Policy Optimization: PPOPolicy
Deep Deterministic Policy Gradient: DDPGPolicy
Twin Delayed DDPG: TD3Policy
Soft Actor-Critic: SACPolicy
Randomized Ensembled Double Q-Learning: REDQPolicy
Discrete Soft Actor-Critic: DiscreteSACPolicy
Imitation Learning: ImitationPolicy
Batch-Constrained deep Q-Learning: BCQPolicy
Conservative Q-Learning: CQLPolicy
Twin Delayed DDPG with Behavior Cloning: TD3BCPolicy
Discrete Batch-Constrained deep Q-Learning: DiscreteBCQPolicy
Discrete Conservative Q-Learning: DiscreteCQLPolicy
Critic Regularized Regression: DiscreteCRRPolicy
Generative Adversarial Imitation Learning: GAILPolicy
Posterior Sampling Reinforcement Learning: PSRLPolicy
Intrinsic Curiosity Module: ICMPolicy
Prioritized Experience Replay: PrioritizedReplayBuffer
Generalized Advantage Estimator: compute_episodic_return()
Hindsight Experience Replay: HERReplayBuffer
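As a quick illustration of the Pythonic API, here is a minimal sketch of building a DQN agent on CartPole. The class names (Net, DQNPolicy) are real Tianshou APIs, but constructor keyword arguments have changed between releases, so treat the exact argument names below as assumptions and check the API reference of your installed version:

import torch
import gymnasium as gym

from tianshou.policy import DQNPolicy
from tianshou.utils.net.common import Net

env = gym.make("CartPole-v1")

# A simple MLP Q-network mapping observations to action values.
net = Net(
    state_shape=env.observation_space.shape,
    action_shape=env.action_space.n,
    hidden_sizes=[128, 128],
)
optim = torch.optim.Adam(net.parameters(), lr=1e-3)

# Argument names (discount_factor, estimation_step, target_update_freq)
# follow the documented DQNPolicy interface; newer releases may also
# require action_space=env.action_space.
policy = DQNPolicy(
    model=net,
    optim=optim,
    discount_factor=0.99,
    estimation_step=3,       # n-step return target
    target_update_freq=320,  # sync the target network every 320 updates
)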
Here are Tianshou's other features:
Elegant framework, using only ~3000 lines of code
State-of-the-art MuJoCo benchmark
Support vectorized environments (synchronous or asynchronous) for all algorithms: Parallel Sampling (see the sketch after this list)
Support super-fast vectorized environment EnvPool for all algorithms: EnvPool Integration
Support recurrent state representation in actor network and critic network (RNN-style training for POMDP): RNN-style Training
Support any type of environment state/action (e.g. a dict, a self-defined class, …): User-defined Environment and Different State Representation
Support customized training processes: Customize Training Process
Support n-step return estimation compute_nstep_return() and prioritized experience replay PrioritizedReplayBuffer for all Q-learning based algorithms; GAE, n-step, and PER are very fast thanks to numba's JIT compilation and vectorized numpy operations (see the sketch after this list)
Support both TensorBoard and W&B logging tools
Support multi-GPU training: Multi-GPU Training
Comprehensive unit tests, including functional checking, RL pipeline checking, documentation checking, PEP8 code-style checking, and type checking
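To make the parallel-sampling and prioritized-replay bullets above concrete, here is a hedged sketch combining a vectorized environment, a prioritized replay buffer, and a collector. The classes are real Tianshou APIs (from tianshou.env and tianshou.data), but argument names may vary slightly across versions:

import gymnasium as gym

from tianshou.data import Collector, PrioritizedVectorReplayBuffer
from tianshou.env import DummyVectorEnv

# Eight environment workers stepped in one process; SubprocVectorEnv
# would run them in separate processes instead.
train_envs = DummyVectorEnv([lambda: gym.make("CartPole-v1") for _ in range(8)])

# Prioritized experience replay; alpha/beta as in Schaul et al. (2015).
buffer = PrioritizedVectorReplayBuffer(
    total_size=20000,
    buffer_num=len(train_envs),
    alpha=0.6,
    beta=0.4,
)

# `policy` is the DQNPolicy from the earlier sketch.
collector = Collector(policy, train_envs, buffer, exploration_noise=True)
collector.collect(n_step=1000)  # gather 1000 transitions across all workers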
Installation
Tianshou is currently hosted on PyPI and conda-forge. New releases (and the current state of the master branch) will require Python >= 3.11.
You can simply install Tianshou from PyPI with the following command:
$ pip install tianshou
If you use Anaconda or Miniconda, you can install Tianshou from conda-forge through the following command:
$ conda install tianshou -c conda-forge
You can also install the newest version directly from GitHub:
$ pip install git+https://github.com/thu-ml/tianshou.git@master --upgrade
After installation, open your Python console and type
import tianshou
print(tianshou.__version__)
If no error occurs, you have successfully installed Tianshou.
Tianshou is still under development; you can also check out the documentation for the stable version at tianshou.readthedocs.io/en/stable/.