actor
Source code: tianshou/highlevel/module/actor.py
- class ContinuousActorType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases: Enum
- GAUSSIAN = 'gaussian'
- DETERMINISTIC = 'deterministic'
- UNSUPPORTED = 'unsupported'
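As an illustration of how the members map to their string values, the following minimal sketch re-declares a local stand-in enum with the same three members (it does not import tianshou) and looks a member up by value. The helper `is_supported` is hypothetical and not part of the documented API.

```python
from enum import Enum


class ContinuousActorType(Enum):
    """Local stand-in mirroring the three documented members."""

    GAUSSIAN = "gaussian"
    DETERMINISTIC = "deterministic"
    UNSUPPORTED = "unsupported"


def is_supported(actor_type: ContinuousActorType) -> bool:
    """Hypothetical helper: True unless the type is UNSUPPORTED."""
    return actor_type is not ContinuousActorType.UNSUPPORTED


# Enum members can be looked up by their string value.
gaussian = ContinuousActorType("gaussian")
```

Looking up a value that is not one of the three strings raises `ValueError`, which is standard `Enum` behaviour.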
- class ActorFuture(actor: Actor | Module | None = None)
Bases: object
Container, which, in the future, will hold an actor instance.
- class ActorFutureProviderProtocol(*args, **kwargs)
Bases: Protocol
- get_actor_future() -> ActorFuture
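The relationship between the future container and the provider protocol can be sketched with standard-library stand-ins. This is a sketch only: the classes below are local re-declarations, `DemoProvider` and `read_actor` are hypothetical names, and a plain string takes the place of a real actor.

```python
from dataclasses import dataclass
from typing import Any, Optional, Protocol


@dataclass
class ActorFuture:
    """Stand-in container; `actor` stays None until one is created."""

    actor: Optional[Any] = None


class ActorFutureProviderProtocol(Protocol):
    def get_actor_future(self) -> ActorFuture: ...


class DemoProvider:
    """Hypothetical provider that structurally satisfies the protocol."""

    def __init__(self) -> None:
        self._future = ActorFuture()

    def get_actor_future(self) -> ActorFuture:
        return self._future


def read_actor(provider: ActorFutureProviderProtocol) -> Optional[Any]:
    """Return whatever the provider's future currently holds."""
    return provider.get_actor_future().actor


provider = DemoProvider()
before = read_actor(provider)  # nothing has been stored yet
provider.get_actor_future().actor = "my-actor"  # placeholder for a real actor
after = read_actor(provider)
```

Because the protocol is structural, `DemoProvider` does not need to inherit from it; having a matching `get_actor_future` method is enough for a type checker.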
- class ActorFactory
Bases: ModuleFactory, ToStringMixin, ABC
- abstract create_module(envs: Environments, device: str | device) -> Actor | Module
- abstract create_dist_fn(envs: Environments) -> Callable[[tuple[Tensor, Tensor]], Distribution] | Callable[[Tensor], Distribution] | None
- Parameters:
envs – the environments
- Returns:
the distribution function, which converts the actor’s output into a distribution, or None if the actor does not output distribution parameters
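The return contract of `create_dist_fn` (a callable that turns actor output into a distribution, or None) can be illustrated without torch. In this minimal sketch, `statistics.NormalDist` stands in for a torch `Distribution`, plain floats stand in for tensors, and the boolean argument `outputs_dist_params` is a hypothetical simplification of the real `envs` parameter.

```python
from statistics import NormalDist
from typing import Callable, Optional, Tuple

# Stand-in for Callable[[tuple[Tensor, Tensor]], Distribution].
DistFn = Callable[[Tuple[float, float]], NormalDist]


def create_dist_fn(outputs_dist_params: bool) -> Optional[DistFn]:
    """Return a distribution function, or None if the actor's output
    is not a set of distribution parameters."""
    if not outputs_dist_params:
        return None

    def dist_fn(params: Tuple[float, float]) -> NormalDist:
        mu, sigma = params  # the actor's output: (mean, standard deviation)
        return NormalDist(mu, sigma)

    return dist_fn


gaussian_fn = create_dist_fn(True)
dist = gaussian_fn((0.5, 2.0))
deterministic_fn = create_dist_fn(False)
```

Callers therefore need to handle the None case before applying the function to the actor's output.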
- class ActorFactoryDefault(continuous_actor_type: ContinuousActorType, hidden_sizes: Sequence[int] = (64, 64), hidden_activation: type[torch.nn.Module] = torch.nn.ReLU, continuous_unbounded: bool = False, continuous_conditioned_sigma: bool = False, discrete_softmax: bool = True)
Bases: ActorFactory
An actor factory which, depending on the type of environment, creates a suitable MLP-based policy.
- DEFAULT_HIDDEN_SIZES = (64, 64)
- create_module(envs: Environments, device: str | device) -> Actor | Module
- create_dist_fn(envs: Environments) -> Callable[[tuple[Tensor, Tensor]], Distribution] | Callable[[Tensor], Distribution] | None
- Parameters:
envs – the environments
- Returns:
the distribution function, which converts the actor’s output into a distribution, or None if the actor does not output distribution parameters
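One plausible way a default factory could pick a concrete factory from the environment type and the `continuous_actor_type` setting is sketched below. This is an assumption-laden illustration, not the library's implementation: the function `choose_factory_name` and its boolean `is_continuous` argument are hypothetical, the enum is a local stand-in, and the factory classes are represented only by their names.

```python
from enum import Enum


class ContinuousActorType(Enum):
    """Local stand-in mirroring the documented members."""

    GAUSSIAN = "gaussian"
    DETERMINISTIC = "deterministic"
    UNSUPPORTED = "unsupported"


def choose_factory_name(is_continuous: bool, actor_type: ContinuousActorType) -> str:
    """Hypothetical dispatch: map the environment type to a factory name."""
    if not is_continuous:
        return "ActorFactoryDiscreteNet"
    if actor_type is ContinuousActorType.GAUSSIAN:
        return "ActorFactoryContinuousGaussianNet"
    if actor_type is ContinuousActorType.DETERMINISTIC:
        return "ActorFactoryContinuousDeterministicNet"
    # UNSUPPORTED: no suitable actor exists for continuous action spaces.
    raise ValueError("continuous action spaces are not supported")


discrete_choice = choose_factory_name(False, ContinuousActorType.UNSUPPORTED)
gaussian_choice = choose_factory_name(True, ContinuousActorType.GAUSSIAN)
```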
- class ActorFactoryContinuous
Bases: ActorFactory, ABC
Serves as a type bound for actor factories that are suitable for continuous action spaces.
- class ActorFactoryContinuousDeterministicNet(hidden_sizes: Sequence[int], activation: type[torch.nn.Module] = torch.nn.ReLU)
Bases: ActorFactoryContinuous
- create_module(envs: Environments, device: str | device) -> Actor
- create_dist_fn(envs: Environments) -> Callable[[tuple[Tensor, Tensor]], Distribution] | Callable[[Tensor], Distribution] | None
- Parameters:
envs – the environments
- Returns:
the distribution function, which converts the actor’s output into a distribution, or None if the actor does not output distribution parameters
- class ActorFactoryContinuousGaussianNet(hidden_sizes: Sequence[int], unbounded: bool = True, conditioned_sigma: bool = False, activation: type[torch.nn.Module] = torch.nn.ReLU)
Bases: ActorFactoryContinuous
For actors with Gaussian policies.
- Parameters:
hidden_sizes – the sequence of hidden dimensions to use in the network structure
unbounded – if True, the final outputs are left unbounded; if False, a tanh activation is applied to the final logits
conditioned_sigma – if True, the standard deviation of continuous actions (sigma) is computed from the input; if False, sigma is an independent parameter
- create_module(envs: Environments, device: str | device) -> Actor
- create_dist_fn(envs: Environments) -> Callable[[tuple[Tensor, Tensor]], Distribution] | Callable[[Tensor], Distribution] | None
- Parameters:
envs – the environments
- Returns:
the distribution function, which converts the actor’s output into a distribution, or None if the actor does not output distribution parameters
- class ActorFactoryDiscreteNet(hidden_sizes: Sequence[int], softmax_output: bool = True, activation: type[torch.nn.Module] = torch.nn.ReLU)
Bases: ActorFactory
- create_module(envs: Environments, device: str | device) -> Actor
- create_dist_fn(envs: Environments) -> Callable[[tuple[Tensor, Tensor]], Distribution] | Callable[[Tensor], Distribution] | None
- Parameters:
envs – the environments
- Returns:
the distribution function, which converts the actor’s output into a distribution, or None if the actor does not output distribution parameters
- class ActorFactoryTransientStorageDecorator(actor_factory: ActorFactory, actor_future: ActorFuture)
Bases: ActorFactory
Wraps an actor factory, storing the most recently created actor instance such that it can be retrieved.
- create_module(envs: Environments, device: str | device) -> Actor | Module
- create_dist_fn(envs: Environments) -> Callable[[tuple[Tensor, Tensor]], Distribution] | Callable[[Tensor], Distribution] | None
- Parameters:
envs – the environments
- Returns:
the distribution function, which converts the actor’s output into a distribution, or None if the actor does not output distribution parameters
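The wrapping behaviour described for this decorator (delegate creation, then remember the latest result in a future) can be sketched with plain Python objects. All classes below are local stand-ins rather than the library's code: `DemoActorFactory` is hypothetical, a dict stands in for an actor module, and `envs` is an arbitrary placeholder value.

```python
from typing import Any, Optional


class ActorFuture:
    """Stand-in container for an actor that may not exist yet."""

    def __init__(self, actor: Optional[Any] = None) -> None:
        self.actor = actor


class DemoActorFactory:
    """Hypothetical factory; returns a new placeholder object per call."""

    def create_module(self, envs: Any, device: str) -> Any:
        return {"envs": envs, "device": device}


class TransientStorageDecorator:
    """Delegates to the wrapped factory and stores the latest result."""

    def __init__(self, actor_factory: DemoActorFactory, actor_future: ActorFuture) -> None:
        self._actor_factory = actor_factory
        self._actor_future = actor_future

    def create_module(self, envs: Any, device: str) -> Any:
        module = self._actor_factory.create_module(envs, device)
        self._actor_future.actor = module  # overwrite any earlier instance
        return module


future = ActorFuture()
factory = TransientStorageDecorator(DemoActorFactory(), future)
first = factory.create_module("demo-envs", "cpu")
second = factory.create_module("demo-envs", "cpu")
```

After the two calls, `future.actor` refers to `second` only; the earlier instance is no longer reachable through the future, which matches the "most recently created" wording.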
- class IntermediateModuleFactoryFromActorFactory(actor_factory: ActorFactory)
Bases: IntermediateModuleFactory
- create_intermediate_module(envs: Environments, device: str | device) -> IntermediateModule