worker
dummy
subproc
worker_base
ray