persistence


class PersistEvent(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

Enumeration of persistence events that Persistence objects can react to.

PERSIST_POLICY = 'persist_policy'

Policy neural network is persisted (new best found)

class RestoreEvent(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

Enumeration of restoration events that Persistence objects can react to.

RESTORE_POLICY = 'restore_policy'

Policy neural network parameters are restored

class Persistence

Bases: ABC

abstract persist(event: PersistEvent, world: World) → None

abstract restore(event: RestoreEvent, world: World) → None
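
As a rough illustration of the interface above, the following self-contained sketch defines a handler that implements the two abstract methods. The enums and the base class are re-declared locally as stand-ins so that the snippet runs without the library. `World` is an empty placeholder and `EventLogPersistence` is a hypothetical name; neither reflects the library's actual implementation.

```python
from abc import ABC, abstractmethod
from enum import Enum


# Local stand-ins mirroring the documented names (not imported from the library).
class PersistEvent(Enum):
    PERSIST_POLICY = "persist_policy"


class RestoreEvent(Enum):
    RESTORE_POLICY = "restore_policy"


class World:
    """Placeholder; the real World class is not described in this section."""


class Persistence(ABC):
    @abstractmethod
    def persist(self, event: PersistEvent, world: World) -> None: ...

    @abstractmethod
    def restore(self, event: RestoreEvent, world: World) -> None: ...


class EventLogPersistence(Persistence):
    """Hypothetical handler that only records the events it receives."""

    def __init__(self) -> None:
        self.seen = []

    def persist(self, event: PersistEvent, world: World) -> None:
        self.seen.append(f"persist:{event.value}")

    def restore(self, event: RestoreEvent, world: World) -> None:
        self.seen.append(f"restore:{event.value}")


handler = EventLogPersistence()
handler.persist(PersistEvent.PERSIST_POLICY, World())
handler.restore(RestoreEvent.RESTORE_POLICY, World())
print(handler.seen)
```

A real handler would presumably write to or read from storage associated with the `world` argument; how that is done depends on the `World` API, which this section does not cover.
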
class PersistenceGroup(*p: Persistence, enabled: bool = True)

Bases: Persistence

Groups persistence handlers so that they can be applied collectively.

persist(event: PersistEvent, world: World) → None

restore(event: RestoreEvent, world: World) → None
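
The grouping idea can be sketched as a simple fan-out: each call is forwarded to every member in order. The snippet below is a minimal stand-alone sketch, not the library's code. `GroupSketch` and `Recorder` are hypothetical names, and the assumption that `enabled` gates only `persist` (while `restore` always runs) is borrowed from the `PolicyPersistence` parameter description rather than confirmed for `PersistenceGroup`.

```python
from enum import Enum


# Local stand-ins for the documented enums.
class PersistEvent(Enum):
    PERSIST_POLICY = "persist_policy"


class RestoreEvent(Enum):
    RESTORE_POLICY = "restore_policy"


class Recorder:
    """Hypothetical handler that appends each received event to a shared log."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def persist(self, event, world):
        self.log.append(f"{self.name}:{event.value}")

    def restore(self, event, world):
        self.log.append(f"{self.name}:{event.value}")


class GroupSketch:
    """Forwards persist/restore calls to all grouped handlers in order."""

    def __init__(self, *p, enabled=True):
        self.items = p
        self.enabled = enabled

    def persist(self, event, world):
        if not self.enabled:  # assumption: disabling affects persisting only
            return
        for item in self.items:
            item.persist(event, world)

    def restore(self, event, world):
        for item in self.items:
            item.restore(event, world)


log = []
group = GroupSketch(Recorder("a", log), Recorder("b", log))
group.persist(PersistEvent.PERSIST_POLICY, world=None)
group.restore(RestoreEvent.RESTORE_POLICY, world=None)
print(log)
```
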
class PolicyPersistence(additional_persistence: Persistence | None = None, enabled: bool = True, mode: Mode = Mode.POLICY)

Bases: object

Handles persistence of the policy.

Parameters:
  • additional_persistence – a persistence instance which is to be invoked whenever this object is used to persist/restore data

  • enabled – whether persistence is enabled (restoration is always enabled)

  • mode – the persistence mode

class Mode(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

Mode of persistence.

POLICY_STATE_DICT = 'policy_state_dict'

Persist only the policy’s state dictionary. Note that for a policy to be restored from such a dictionary, it is necessary to first create a structurally equivalent object which can accept the respective state.
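
The requirement of a structurally equivalent object can be illustrated with a small stand-alone sketch. `TinyPolicy` is a hypothetical plain-Python stand-in for a policy module (it is not a `torch.nn.Module`), and its `state_dict`/`load_state_dict` methods only imitate the usual convention. The point is that only the state is serialized, so a matching object has to be constructed before the state can be loaded.

```python
import pickle


class TinyPolicy:
    """Hypothetical stand-in for a policy; holds a fixed-size list of weights."""

    def __init__(self, n_weights):
        self.weights = [0.0] * n_weights

    def state_dict(self):
        # Return only the state, not the object itself.
        return {"weights": list(self.weights)}

    def load_state_dict(self, state):
        # The receiving object must have the same structure as the saved one.
        if len(state["weights"]) != len(self.weights):
            raise ValueError("state does not match the policy structure")
        self.weights = list(state["weights"])


trained = TinyPolicy(3)
trained.weights = [0.1, 0.2, 0.3]

# Persist only the state dictionary.
blob = pickle.dumps(trained.state_dict())

# Restoring requires first creating a structurally equivalent object.
fresh = TinyPolicy(3)
fresh.load_state_dict(pickle.loads(blob))
print(fresh.weights)
```

Loading the same state into `TinyPolicy(2)` raises `ValueError` in this sketch, which mirrors why the state-dictionary mode needs the original construction logic to be available at restore time.
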

POLICY = 'policy'

Persist the entire policy. This is larger but has the advantage of the policy being loadable without requiring an environment to be instantiated. It has the potential disadvantage that upon breaking code changes in the policy implementation (e.g. renamed/moved class), it will no longer be loadable. Note that a precondition is that the policy be picklable in its entirety.

get_filename() → str

persist(policy: Module, world: World) → None

restore(policy: Module, world: World, device: TDevice) → None

get_save_best_fn(world: World) → Callable[[Module], None]

get_save_checkpoint_fn(world: World) → Callable[[int, int, int], str] | None