critic


class CriticFactory

Bases: ToStringMixin, ABC

Represents a factory for the generation of a critic module.

abstract create_module(envs: Environments, device: str | device, use_action: bool, discrete_last_size_use_action_shape: bool = False) → Module

Creates the critic module.

Parameters:
  • envs – the environments

  • device – the torch device

  • use_action – whether to expect the action as an additional input (in addition to the observations)

  • discrete_last_size_use_action_shape – whether, for the discrete case, the output dimension shall use the action shape

Returns:

the module
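As a rough illustration of how the two boolean flags might shape the resulting network, the sketch below computes input and output sizes in plain Python. The helper `critic_io_sizes` and its arguments are hypothetical and not part of the library. It assumes that `use_action` appends the flattened action to the observation input, and that `discrete_last_size_use_action_shape` switches the output from a single value to one value per action.

```python
from math import prod


def critic_io_sizes(
    obs_shape: tuple[int, ...],
    action_shape: tuple[int, ...],
    use_action: bool,
    discrete_last_size_use_action_shape: bool = False,
) -> tuple[int, int]:
    """Hypothetical helper: returns (input_size, output_size) of a critic."""
    obs_dim = prod(obs_shape)
    action_dim = prod(action_shape)
    # Assumption: with use_action, the action is concatenated to the observation.
    input_size = obs_dim + (action_dim if use_action else 0)
    # Assumption: without the flag, the critic outputs a single scalar value.
    output_size = action_dim if discrete_last_size_use_action_shape else 1
    return input_size, output_size


# State-value critic V(s): observation only, scalar output.
print(critic_io_sizes((4,), (2,), use_action=False))
# Action-value critic Q(s, a): observation plus action, scalar output.
print(critic_io_sizes((4,), (2,), use_action=True))
# Discrete critic with one output per action.
print(critic_io_sizes((4,), (2,), use_action=False, discrete_last_size_use_action_shape=True))
```

Under these assumptions, the three calls correspond to a state-value critic, an action-value critic, and a discrete critic that emits one value per action.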

class CriticFactoryDefault(hidden_sizes: Sequence[int] = (64, 64), hidden_activation: type[Module] = torch.nn.ReLU)

Bases: CriticFactory

A critic factory which, depending on the type of environment, creates a suitable MLP-based critic.
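The dispatch on environment type could look roughly like the sketch below. `EnvKind` and `select_critic_factory_name` are hypothetical stand-ins introduced for illustration only. The sketch assumes that the default factory delegates to the continuous and discrete factories documented on this page, which the text above does not state explicitly.

```python
from enum import Enum


class EnvKind(Enum):
    """Hypothetical stand-in for the environment type."""

    CONTINUOUS = "continuous"
    DISCRETE = "discrete"


def select_critic_factory_name(env_kind: EnvKind) -> str:
    """Returns the name of the factory a default factory might delegate to."""
    if env_kind is EnvKind.CONTINUOUS:
        return "CriticFactoryContinuousNet"
    if env_kind is EnvKind.DISCRETE:
        return "CriticFactoryDiscreteNet"
    raise ValueError(f"unsupported environment kind: {env_kind}")


print(select_critic_factory_name(EnvKind.DISCRETE))
```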

DEFAULT_HIDDEN_SIZES = (64, 64)

create_module(envs: Environments, device: str | device, use_action: bool, discrete_last_size_use_action_shape: bool = False) → Module

Creates the critic module.

Parameters:
  • envs – the environments

  • device – the torch device

  • use_action – whether to expect the action as an additional input (in addition to the observations)

  • discrete_last_size_use_action_shape – whether, for the discrete case, the output dimension shall use the action shape

Returns:

the module

class CriticFactoryContinuousNet(hidden_sizes: Sequence[int], activation: type[Module] = torch.nn.ReLU)

Bases: CriticFactory

create_module(envs: Environments, device: str | device, use_action: bool, discrete_last_size_use_action_shape: bool = False) → Module

Creates the critic module.

Parameters:
  • envs – the environments

  • device – the torch device

  • use_action – whether to expect the action as an additional input (in addition to the observations)

  • discrete_last_size_use_action_shape – whether, for the discrete case, the output dimension shall use the action shape

Returns:

the module

class CriticFactoryDiscreteNet(hidden_sizes: Sequence[int], activation: type[Module] = torch.nn.ReLU)

Bases: CriticFactory

create_module(envs: Environments, device: str | device, use_action: bool, discrete_last_size_use_action_shape: bool = False) → Module

Creates the critic module.

Parameters:
  • envs – the environments

  • device – the torch device

  • use_action – whether to expect the action as an additional input (in addition to the observations)

  • discrete_last_size_use_action_shape – whether, for the discrete case, the output dimension shall use the action shape

Returns:

the module

class CriticFactoryReuseActor(actor_future: ActorFuture)

Bases: CriticFactory

A critic factory which reuses the actor’s preprocessing component.

This class is for internal use in experiment builders only.

Reuse of the actor network is supported through the concept of an actor future (ActorFuture). When the user declares that the actor shall be reused for the critic, this factory is used to support that, even though the actor does not exist yet at that point. The factory therefore receives the future, which is filled once the actor factory is called. When the creation method of this factory is eventually called, it can use the then-filled actor to create the critic.

Parameters:

actor_future – the object that will hold the actor instance by the time the critic is created
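The future mechanism described above can be sketched in a few lines of plain Python. All names here (`Future`, `ActorFactorySketch`, `ReuseActorCriticFactorySketch`) are hypothetical stand-ins and not the library's classes, and strings take the place of real network modules. The sketch only shows the ordering: both factories share one future, the actor factory fills it, and the critic factory reads it afterwards.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Future:
    """Hypothetical stand-in for ActorFuture: a holder that starts out empty."""

    actor: Optional[str] = None


class ActorFactorySketch:
    def __init__(self, future: Future) -> None:
        self.future = future

    def create(self) -> str:
        actor = "actor-network"  # placeholder for a real module
        self.future.actor = actor  # fill the shared future as a side effect
        return actor


class ReuseActorCriticFactorySketch:
    def __init__(self, future: Future) -> None:
        self.future = future

    def create(self) -> str:
        # The future must have been filled by the actor factory beforehand.
        if self.future.actor is None:
            raise RuntimeError("actor must be created before the critic")
        return f"critic-on-top-of({self.future.actor})"


future = Future()
actor_factory = ActorFactorySketch(future)
critic_factory = ReuseActorCriticFactorySketch(future)  # actor does not exist yet
actor_factory.create()
print(critic_factory.create())
```

Calling the critic factory before the actor factory raises an error in this sketch, which mirrors the requirement that the future be filled first.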

create_module(envs: Environments, device: str | device, use_action: bool, discrete_last_size_use_action_shape: bool = False) → Module

Creates the critic module.

Parameters:
  • envs – the environments

  • device – the torch device

  • use_action – whether to expect the action as an additional input (in addition to the observations)

  • discrete_last_size_use_action_shape – whether, for the discrete case, the output dimension shall use the action shape

Returns:

the module

class CriticEnsembleFactory

Bases: object

abstract create_module(envs: Environments, device: str | device, ensemble_size: int, use_action: bool) → Module

class CriticEnsembleFactoryDefault(hidden_sizes: Sequence[int] = (64, 64))

Bases: CriticEnsembleFactory

A critic ensemble factory which, depending on the type of environment, creates a suitable MLP-based critic ensemble.

DEFAULT_HIDDEN_SIZES = (64, 64)

create_module(envs: Environments, device: str | device, ensemble_size: int, use_action: bool) → Module

class CriticEnsembleFactoryContinuousNet(hidden_sizes: Sequence[int])

Bases: CriticEnsembleFactory

create_module(envs: Environments, device: str | device, ensemble_size: int, use_action: bool) → Module