icm

Source code: tianshou/policy/modelbased/icm.py

class ICMTrainingStats(wrapped_stats: TrainingStats, *, icm_loss: float, icm_forward_loss: float, icm_inverse_loss: float)

Bases: TrainingStatsWrapper

Subclasses should call super().__init__() last in their own __init__.

class ICMPolicy(*, policy: BasePolicy[TTrainingStats], model: IntrinsicCuriosityModule, optim: Optimizer, lr_scale: float, reward_scale: float, forward_loss_weight: float, action_space: Space, observation_space: Space | None = None, action_scaling: bool = False, action_bound_method: Literal['clip', 'tanh'] | None = 'clip', lr_scheduler: LRScheduler | MultipleLRSchedulers | None = None)

Bases: BasePolicy[ICMTrainingStats]

Implementation of Intrinsic Curiosity Module. arXiv:1705.05363.
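As a rough illustration of how reward_scale, forward_loss_weight and lr_scale might enter the computation, the sketch below follows the formulation in the cited paper: the forward-model prediction error is added to the environment reward as a curiosity bonus, and the inverse and forward losses are mixed with a single weight. The helper names augmented_reward and combined_icm_loss are hypothetical and not part of Tianshou, and applying lr_scale as a multiplier on the combined loss is an assumption; consult the source file for the actual behaviour.

```python
def augmented_reward(
    extrinsic_reward: float, forward_error: float, reward_scale: float
) -> float:
    """Add the scaled forward-model prediction error to the environment reward.

    Hypothetical helper: the names and the scalar interface are illustrative.
    """
    return extrinsic_reward + reward_scale * forward_error


def combined_icm_loss(
    inverse_loss: float,
    forward_loss: float,
    forward_loss_weight: float,
    lr_scale: float,
) -> float:
    """Mix the inverse and forward losses as (1 - w) * L_inv + w * L_fwd.

    Assumption: lr_scale multiplies the mixed loss before backpropagation.
    """
    mixed = (1.0 - forward_loss_weight) * inverse_loss + forward_loss_weight * forward_loss
    return lr_scale * mixed


if __name__ == "__main__":
    # A forward error of 0.5 with reward_scale 0.1 adds a small bonus to the reward.
    print(augmented_reward(1.0, 0.5, 0.1))
    # With forward_loss_weight 0.25, the inverse loss carries weight 0.75.
    print(combined_icm_loss(2.0, 4.0, 0.25, 0.5))
```

Under this reading, a forward_loss_weight near 1 would emphasise predicting the next state's features, while a value near 0 would emphasise recovering the action taken.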

Parameters:

policy – the base policy to which ICM is added.
model – the ICM model.
optim – a torch.optim optimizer for optimizing the ICM model.
lr_scale – the scaling factor for ICM learning.
reward_scale – the scaling factor for the intrinsic curiosity reward.
forward_loss_weight – the weight for the forward model loss.
action_space – Env’s action space.
observation_space – Env’s observation space.
action_scaling – if True, scale the action from [-1, 1] to the range of action_space. Only used if the action_space is continuous.
action_bound_method – method used to bound the action to the range [-1, 1]: 'clip', 'tanh', or None. Only used if the action_space is continuous.
lr_scheduler – if not None, will be called in policy.update().
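The interaction between action_bound_method and action_scaling can be sketched as follows. This is a minimal, library-free illustration, assuming that bounding is applied first (clipping to [-1, 1] or applying tanh) and that scaling then maps [-1, 1] linearly onto [low, high]. The function bound_and_scale_action and its list-based interface are hypothetical and are not Tianshou's API.

```python
import math
from typing import Optional, Sequence


def bound_and_scale_action(
    action: Sequence[float],
    low: Sequence[float],
    high: Sequence[float],
    action_scaling: bool = True,
    action_bound_method: Optional[str] = "clip",
) -> list:
    """Bound a raw continuous action to [-1, 1], then optionally rescale it.

    Hypothetical helper for illustration; low and high stand in for the
    bounds of a continuous action space.
    """
    if action_bound_method == "clip":
        bounded = [min(1.0, max(-1.0, a)) for a in action]
    elif action_bound_method == "tanh":
        bounded = [math.tanh(a) for a in action]
    else:
        # No bounding requested: pass the raw action through unchanged.
        bounded = list(action)

    if not action_scaling:
        return bounded

    # Linear map from [-1, 1] to [lo, hi] for each action dimension.
    return [lo + (hi - lo) * (a + 1.0) / 2.0 for a, lo, hi in zip(bounded, low, high)]


if __name__ == "__main__":
    # 2.0 is clipped to 1.0 and mapped to the upper bound of its dimension.
    print(bound_and_scale_action([2.0, -1.0, 0.0], [0.0, 0.0, -2.0], [10.0, 10.0, 2.0]))
```

Clipping leaves in-range actions untouched but saturates abruptly at the bounds, whereas tanh squashes every input smoothly; which is preferable depends on the base policy's output distribution.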
