alpha

Source code: tianshou/highlevel/params/alpha.py

class AutoAlphaFactory

Bases: ToStringMixin, ABC

abstract create_auto_alpha(envs: Environments, device: str | device) → Alpha

class AutoAlphaFactoryDefault(lr: float = 0.0003, target_entropy_coefficient: float = -1.0, log_alpha: float = 0.0, optim: OptimizerFactoryFactory | None = None)

Bases: AutoAlphaFactory

Parameters:

lr – the learning rate for the optimizer of the alpha parameter
target_entropy_coefficient – the coefficient by which the base target entropy is multiplied. The base value being scaled is dim(A) for continuous action spaces and log(|A|) for discrete action spaces, i.e. with the default coefficient -1, we obtain -dim(A) and -log(|A|) for continuous and discrete action spaces respectively, which gives a reasonable trade-off between exploration and exploitation. For decidedly stochastic exploration, you can use a positive value close to 1 (e.g. 0.98); 1.0 would give full-entropy exploration.
log_alpha – the (initial) value of the log of the entropy regularization coefficient alpha.
optim – the optimizer factory to use; if None, use default

create_auto_alpha(envs: Environments, device: str | device) → AutoAlpha
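
The scaling described for target_entropy_coefficient can be illustrated with a minimal sketch. The helper functions below are hypothetical and are not part of tianshou; they only reproduce the arithmetic stated in the parameter description, assuming the base value is dim(A) for continuous and log(|A|) for discrete action spaces. The last line shows that the default log_alpha of 0.0 corresponds to an initial alpha of exp(0.0) = 1.0.

```python
import math


def continuous_target_entropy(coefficient: float, action_dim: int) -> float:
    """Hypothetical helper: scale the base value dim(A) by the coefficient."""
    return coefficient * float(action_dim)


def discrete_target_entropy(coefficient: float, num_actions: int) -> float:
    """Hypothetical helper: scale the base value log(|A|) by the coefficient."""
    return coefficient * math.log(num_actions)


# Default coefficient -1.0 with a 6-dimensional continuous action space: -dim(A)
continuous_target = continuous_target_entropy(-1.0, action_dim=6)

# Default coefficient -1.0 with 4 discrete actions: -log(|A|)
discrete_target = discrete_target_entropy(-1.0, num_actions=4)

# A positive coefficient close to 1 for decidedly stochastic exploration
exploratory_target = discrete_target_entropy(0.98, num_actions=4)

# log_alpha is the log of alpha, so the default 0.0 means alpha starts at 1.0
initial_alpha = math.exp(0.0)

print(continuous_target, discrete_target, exploratory_target, initial_alpha)
```

The action-space sizes used here (6 dimensions, 4 actions) are arbitrary illustrative values.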
