env#


class EnvType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]#

Bases: Enum

Enumeration of environment types.

CONTINUOUS = 'continuous'#
DISCRETE = 'discrete'#
is_discrete() bool[source]#
is_continuous() bool[source]#
assert_continuous(requiring_entity: Any) None[source]#
assert_discrete(requiring_entity: Any) None[source]#
static from_env(env: Env) EnvType[source]#
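The predicate and assertion methods above can be illustrated with a standalone sketch. `SketchEnvType` is a hypothetical stand-in, not the library's class. The use of `AssertionError` and the message wording are assumptions made for illustration only.

```python
from enum import Enum
from typing import Any


class SketchEnvType(Enum):
    """Hypothetical stand-in mirroring the documented members of EnvType."""

    CONTINUOUS = "continuous"
    DISCRETE = "discrete"

    def is_discrete(self) -> bool:
        return self is SketchEnvType.DISCRETE

    def is_continuous(self) -> bool:
        return self is SketchEnvType.CONTINUOUS

    def assert_continuous(self, requiring_entity: Any) -> None:
        # Assumption: a failed check raises AssertionError naming the requester.
        if not self.is_continuous():
            raise AssertionError(f"{requiring_entity} requires continuous environments")

    def assert_discrete(self, requiring_entity: Any) -> None:
        # Assumption: same error type as above, for the opposite case.
        if not self.is_discrete():
            raise AssertionError(f"{requiring_entity} requires discrete environments")


# Enum members can be looked up by their string value.
env_type = SketchEnvType("discrete")
# Returns None without raising, because the type matches.
env_type.assert_discrete("MyDiscreteAlgorithm")
```

Passing the requiring entity lets the error message name the component that imposed the constraint.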
class EnvMode(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]#

Bases: Enum

Indicates the purpose for which an environment is created.

TRAINING = 'training'#
TEST = 'test'#
WATCH = 'watch'#
class VectorEnvType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]#

Bases: Enum

DUMMY = 'dummy'#

Vectorized environment without parallelization; environments are processed sequentially

SUBPROC = 'subproc'#

Parallelization based on subprocess

SUBPROC_SHARED_MEM_DEFAULT_CONTEXT = 'shmem'#

Parallelization based on subprocess with shared memory

SUBPROC_SHARED_MEM_FORK_CONTEXT = 'shmem_fork'#

Parallelization based on subprocess with shared memory and fork context (relevant for macOS, which uses the spawn start method by default; see https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods)

RAY = 'ray'#

Parallelization based on the ray library

SUBPROC_SHARED_MEM_AUTO = 'subproc_shared_mem_auto'#

Parallelization based on subprocess with shared memory, using the default context on Windows and the fork context otherwise

create_venv(factories: Sequence[Callable[[], Env]]) BaseVectorEnv[source]#
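The platform-dependent choice described for SUBPROC_SHARED_MEM_AUTO can be sketched with the standard library alone. `pick_start_method` is a hypothetical helper written for this illustration. How the library actually selects the context is not shown on this page.

```python
import multiprocessing
import platform
from typing import Optional


def pick_start_method(system: Optional[str] = None) -> Optional[str]:
    """Return None (platform default context) on Windows, else "fork".

    Hypothetical helper; `system` defaults to the current platform name.
    """
    if system is None:
        system = platform.system()
    return None if system == "Windows" else "fork"


# multiprocessing.get_context(None) returns the platform's default context;
# get_context("fork") returns a fork-based context where fork is available.
method = pick_start_method()
ctx = multiprocessing.get_context(method)
```

Requesting the fork context explicitly matters on macOS, where the default start method is spawn.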
class Environments(env: Env, training_envs: BaseVectorEnv, test_envs: BaseVectorEnv, watch_env: BaseVectorEnv | None = None)[source]#

Bases: ToStringMixin, ABC

Represents (vectorized) environments for a learning process.

static from_factory_and_type(factory_fn: Callable[[EnvMode], Env], env_type: EnvType, venv_type: VectorEnvType, num_training_envs: int, num_test_envs: int, create_watch_env: bool = False) Environments[source]#

Creates an instance of the suitable subtype (ContinuousEnvironments or DiscreteEnvironments), given a factory function that creates a single environment instance and the type of environment (continuous/discrete).

Parameters:
  • factory_fn – the factory for a single environment instance

  • env_type – the type of environments created by factory_fn

  • venv_type – the vector environment type to use for parallelization

  • num_training_envs – the number of training environments to create

  • num_test_envs – the number of test environments to create

  • create_watch_env – whether to create an environment for watching the agent

Returns:

the instance
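One way to picture the role of factory_fn: a single mode-aware factory is expanded into lists of zero-argument factories, one list per mode. The following is a minimal sketch under stated assumptions. `Mode`, `ToyEnv`, and `expand_factory` are hypothetical stand-ins for illustration, not library code.

```python
from enum import Enum
from typing import Callable, List


class Mode(Enum):
    """Hypothetical stand-in for EnvMode."""

    TRAINING = "training"
    TEST = "test"
    WATCH = "watch"


class ToyEnv:
    """Hypothetical stand-in for a gymnasium Env; it only records its mode."""

    def __init__(self, mode: Mode) -> None:
        self.mode = mode


def expand_factory(
    factory_fn: Callable[[Mode], ToyEnv], mode: Mode, num_envs: int
) -> List[Callable[[], ToyEnv]]:
    # Bind the mode so that each entry is a zero-argument factory, which is
    # the shape that VectorEnvType.create_venv accepts.
    return [lambda: factory_fn(mode) for _ in range(num_envs)]


training_factories = expand_factory(ToyEnv, Mode.TRAINING, num_envs=4)
test_factories = expand_factory(ToyEnv, Mode.TEST, num_envs=2)

# Calling the factories one after another mimics sequential (DUMMY-style) creation.
training_envs = [make() for make in training_factories]
```

Because each factory is called separately, every vectorized slot receives its own environment instance.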

info() dict[str, Any][source]#
set_persistence(*p: Persistence) None[source]#

Associates the given persistence handlers, which may persist and restore environment-specific information.

Parameters:

p – persistence handlers

abstract get_action_shape() Sequence[int] | int | int64[source]#
abstract get_observation_shape() int | Sequence[int][source]#
get_action_space() Space[source]#
get_observation_space() Space[source]#
abstract get_type() EnvType[source]#
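Since the shape getters may return either a single integer-like value or a sequence of integers, callers may want to normalize the result. The helper below is a hypothetical sketch and not part of the library. It relies on `operator.index`, which accepts any object implementing `__index__` (NumPy integer scalars do).

```python
import operator
from typing import Any, Tuple


def as_shape_tuple(shape: Any) -> Tuple[int, ...]:
    """Normalize an int-like value or a sequence of int-likes to a tuple of ints."""
    try:
        # Integer-like scalars (int, NumPy integer types) become a 1-tuple.
        return (operator.index(shape),)
    except TypeError:
        # Anything else is treated as an iterable of integer-like dimensions.
        return tuple(operator.index(dim) for dim in shape)


flat = as_shape_tuple(4)
image = as_shape_tuple([3, 84, 84])
```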
class ContinuousEnvironments(env: Env, training_envs: BaseVectorEnv, test_envs: BaseVectorEnv, watch_env: BaseVectorEnv | None = None)[source]#

Bases: Environments

Represents (vectorized) continuous environments.

static from_factory(factory_fn: Callable[[EnvMode], Env], venv_type: VectorEnvType, num_training_envs: int, num_test_envs: int, create_watch_env: bool = False) ContinuousEnvironments[source]#

Creates an instance from a factory function that creates a single environment instance.

Parameters:
  • factory_fn – the factory for a single environment instance

  • venv_type – the vector environment type to use for parallelization

  • num_training_envs – the number of training environments to create

  • num_test_envs – the number of test environments to create

  • create_watch_env – whether to create an environment for watching the agent

Returns:

the instance

info() dict[str, Any][source]#
get_action_shape() Sequence[int] | int | int64[source]#
get_observation_shape() int | Sequence[int][source]#
get_type() EnvType[source]#
class DiscreteEnvironments(env: Env, training_envs: BaseVectorEnv, test_envs: BaseVectorEnv, watch_env: BaseVectorEnv | None = None)[source]#

Bases: Environments

Represents (vectorized) discrete environments.

static from_factory(factory_fn: Callable[[EnvMode], Env], venv_type: VectorEnvType, num_training_envs: int, num_test_envs: int, create_watch_env: bool = False) DiscreteEnvironments[source]#

Creates an instance from a factory function that creates a single environment instance.

Parameters:
  • factory_fn – the factory for a single environment instance

  • venv_type – the vector environment type to use for parallelization

  • num_training_envs – the number of training environments to create

  • num_test_envs – the number of test environments to create

  • create_watch_env – whether to create an environment for watching the agent

Returns:

the instance

get_action_shape() Sequence[int] | int | int64[source]#
get_observation_shape() int | Sequence[int][source]#
get_type() EnvType[source]#
class EnvPoolFactory[source]#

Bases: object

A factory for the creation of envpool-based vectorized environments, which can be used in conjunction with EnvFactoryRegistered.

create_venv(task: str, num_envs: int, mode: EnvMode, seed: int, kwargs: dict) BaseVectorEnv[source]#
class EnvFactory(venv_type: VectorEnvType)[source]#

Bases: ToStringMixin, ABC

Main interface for the creation of environments (in various forms).

Parameters:

venv_type – the type of vectorized environment to use for training and test environments. WATCH environments are always created as DUMMY vector environments.

create_env(mode: EnvMode, seed: int | None = None) Env[source]#

Creates a single environment for the given mode.

Parameters:
  • mode – the mode

  • seed – the random seed to use for the environment; if None, no seed is specified and gymnasium chooses a random seed.

Returns:

the environment

create_venv(num_envs: int, mode: EnvMode, seed: int | None = None) BaseVectorEnv[source]#

Create vectorized environments.

Parameters:
  • num_envs – the number of environments

  • mode – the mode for which to create. In WATCH mode the resulting venv will always be of type DUMMY with a single env.

  • seed – the random seed to use for environment creation

Returns:

the vectorized environments
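The WATCH rule stated above (always DUMMY, always a single environment) can be sketched as a small resolution step. `Mode`, `VenvKind`, and `resolve_venv` are hypothetical names used only for this illustration.

```python
from enum import Enum
from typing import Tuple


class Mode(Enum):
    """Hypothetical stand-in for EnvMode."""

    TRAINING = "training"
    TEST = "test"
    WATCH = "watch"


class VenvKind(Enum):
    """Hypothetical stand-in for two of the VectorEnvType members."""

    DUMMY = "dummy"
    SUBPROC = "subproc"


def resolve_venv(configured: VenvKind, num_envs: int, mode: Mode) -> Tuple[VenvKind, int]:
    # WATCH overrides both the configured vectorization type and the count.
    if mode is Mode.WATCH:
        return VenvKind.DUMMY, 1
    return configured, num_envs


watch_setup = resolve_venv(VenvKind.SUBPROC, 8, Mode.WATCH)
training_setup = resolve_venv(VenvKind.SUBPROC, 8, Mode.TRAINING)
```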

create_envs(num_training_envs: int, num_test_envs: int, create_watch_env: bool = False, seed: int | None = None) Environments[source]#

Create environments for learning.

Parameters:
  • num_training_envs – the number of training environments

  • num_test_envs – the number of test environments

  • create_watch_env – whether to create an environment for watching the agent

  • seed – the random seed to use for environment creation

Returns:

the environments

class EnvFactoryRegistered(*, task: str, venv_type: VectorEnvType, envpool_factory: EnvPoolFactory | None = None, render_mode_training: str | None = None, render_mode_test: str | None = None, render_mode_watch: str = 'human', **make_kwargs: Any)[source]#

Bases: EnvFactory

Factory for environments that are registered with gymnasium and thus can be created via gymnasium.make (or via envpool.make_gymnasium).

Parameters:
  • task – the gymnasium task/environment identifier

  • venv_type – the type of vectorized environment to use (if envpool_factory is not specified)

  • envpool_factory – the factory to use for vectorized environment creation based on envpool; envpool must be installed.

  • render_mode_training – the render mode to use for training environments

  • render_mode_test – the render mode to use for test environments

  • render_mode_watch – the render mode to use for environments that are used to watch agent performance

  • make_kwargs – additional keyword arguments to pass on to gymnasium.make. If envpool is used, the gymnasium parameters will be appropriately translated for use with envpool.make_gymnasium.
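The three render_mode_* parameters suggest a per-mode lookup. The sketch below illustrates that idea with the default values taken from the constructor signature. `Mode` and `select_render_mode` are hypothetical stand-ins, and the actual selection logic inside the library is an assumption here.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    """Hypothetical stand-in for EnvMode."""

    TRAINING = "training"
    TEST = "test"
    WATCH = "watch"


def select_render_mode(
    mode: Mode,
    render_mode_training: Optional[str] = None,
    render_mode_test: Optional[str] = None,
    render_mode_watch: str = "human",
) -> Optional[str]:
    # Pick the render mode that corresponds to the purpose of the environment.
    if mode is Mode.TRAINING:
        return render_mode_training
    if mode is Mode.TEST:
        return render_mode_test
    return render_mode_watch


watch_render = select_render_mode(Mode.WATCH)
test_render = select_render_mode(Mode.TEST, render_mode_test="rgb_array")
```

With the defaults, only WATCH environments render to a window, which keeps training and test runs headless.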

create_venv(num_envs: int, mode: EnvMode, seed: int | None = None) BaseVectorEnv[source]#

Create vectorized environments.

Parameters:
  • num_envs – the number of environments

  • mode – the mode for which to create. In WATCH mode the resulting venv will always be of type DUMMY with a single env.

  • seed – the random seed to use for environment creation

Returns:

the vectorized environments