atari_wrapper
Source code: tianshou/env/atari/atari_wrapper.py
- class NoopResetEnv(env: Env, noop_max: int = 30)
Bases: Wrapper
Sample initial states by taking a random number of no-ops on reset.
No-op is assumed to be action 0.
- Parameters:
env (gym.Env) – the environment to wrap.
noop_max (int) – the maximum number of no-ops to run.
Wraps an environment to allow a modular transformation of the step() and reset() methods.
- Args:
env: The environment to wrap
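The no-op reset described above can be sketched without any dependencies. This is a minimal illustration, not the Tianshou implementation: `CountingEnv` and `NoopResetSketch` are hypothetical names, the lower bound of one no-op is an assumption, and the toy environment only mimics the five-value `step()` return used by Gymnasium.

```python
import random


class CountingEnv:
    """Hypothetical toy environment: the observation is the number of steps taken."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t, {}

    def step(self, action):
        self.t += 1
        # (observation, reward, terminated, truncated, info)
        return self.t, 0.0, False, False, {}


class NoopResetSketch:
    """Run a random number of no-op steps (assumed: 1..noop_max) after each reset."""

    def __init__(self, env, noop_max=30, noop_action=0, rng=None):
        self.env = env
        self.noop_max = noop_max
        self.noop_action = noop_action  # the no-op is assumed to be action 0
        self.rng = rng if rng is not None else random.Random()

    def reset(self):
        obs, info = self.env.reset()
        n_noops = self.rng.randint(1, self.noop_max)  # both bounds inclusive
        for _ in range(n_noops):
            obs, _, terminated, truncated, info = self.env.step(self.noop_action)
            if terminated or truncated:
                # The episode ended during the no-ops: start over.
                obs, info = self.env.reset()
        return obs, info

    def step(self, action):
        return self.env.step(action)


env = NoopResetSketch(CountingEnv(), noop_max=30, rng=random.Random(0))
start_obs, _ = env.reset()
```

Because the toy observation counts steps, `start_obs` equals the number of no-ops taken, so different seeds give different starting states.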
- class MaxAndSkipEnv(env: Env, skip: int = 4)
Bases: Wrapper
Return only every skip-th frame (frameskipping) using most recent raw observations (for max pooling across time steps).
- Parameters:
env (gym.Env) – the environment to wrap.
skip (int) – the frame-skip interval; only every skip-th frame is returned.
Wraps an environment to allow a modular transformation of the step() and reset() methods.
- Args:
env: The environment to wrap
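Frame skipping with max pooling can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: `FlickerEnv` and `MaxAndSkipSketch` are hypothetical, frames are plain lists to keep the example dependency-free, and pooling over exactly the two most recent raw frames is an assumption borrowed from common Atari preprocessing.

```python
class FlickerEnv:
    """Hypothetical toy environment with 2-pixel frames and reward 1 per step."""

    def __init__(self, horizon=10):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return [0, 0], {}

    def step(self, action):
        self.t += 1
        # Pixel 0 is lit on odd steps only, pixel 1 on even steps only.
        frame = [self.t % 2, (self.t + 1) % 2]
        terminated = self.t >= self.horizon
        return frame, 1.0, terminated, False, {}


class MaxAndSkipSketch:
    """Repeat the action `skip` times, sum rewards, max-pool the last two frames."""

    def __init__(self, env, skip=4):
        self.env = env
        self.skip = skip

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        recent = []  # holds at most the two most recent raw frames
        terminated = truncated = False
        info = {}
        for _ in range(self.skip):
            frame, reward, terminated, truncated, info = self.env.step(action)
            recent = (recent + [frame])[-2:]
            total_reward += reward
            if terminated or truncated:
                break
        # Element-wise maximum over the kept frames.
        pooled = [max(pixels) for pixels in zip(*recent)]
        return pooled, total_reward, terminated, truncated, info


env = MaxAndSkipSketch(FlickerEnv(), skip=4)
env.reset()
obs, reward, terminated, truncated, _ = env.step(0)
```

In this toy setup each raw frame shows only one of the two pixels, while the pooled observation shows both, which is the flicker problem that max pooling is meant to address.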
- class EpisodicLifeEnv(env: Env)
Bases: Wrapper
Make end-of-life == end-of-episode, but only reset on true game over.
It helps the value estimation.
- Parameters:
env (gym.Env) – the environment to wrap.
Wraps an environment to allow a modular transformation of the step() and reset() methods.
- Args:
env: The environment to wrap
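The end-of-life logic can be illustrated with a toy game. The sketch below is hypothetical throughout: `LivesEnv`, `EpisodicLifeSketch`, and the `info["lives"]` key are assumptions made for the example (a real Atari environment exposes its life counter differently), and a soft reset is modeled as a single no-op step.

```python
class LivesEnv:
    """Hypothetical toy game: action 1 costs a life, action 0 is a no-op."""

    def __init__(self, lives=3):
        self.start_lives = lives
        self.lives = lives

    def reset(self):
        self.lives = self.start_lives
        return self.lives, {"lives": self.lives}

    def step(self, action):
        if action == 1:
            self.lives -= 1
        terminated = self.lives == 0  # true game over
        return self.lives, 0.0, terminated, False, {"lives": self.lives}


class EpisodicLifeSketch:
    """Report a lost life as episode end, but reset the game only on game over."""

    def __init__(self, env):
        self.env = env
        self.lives = 0
        self.was_real_done = True

    def reset(self):
        if self.was_real_done:
            obs, info = self.env.reset()
        else:
            # Soft reset: keep playing from the current state via a no-op step.
            obs, _, _, _, info = self.env.step(0)
        self.lives = info["lives"]
        return obs, info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self.was_real_done = terminated or truncated
        lives = info["lives"]
        if 0 < lives < self.lives:
            terminated = True  # a life was lost: end the episode for the learner
        self.lives = lives
        return obs, reward, terminated, truncated, info


env = EpisodicLifeSketch(LivesEnv(lives=2))
env.reset()
_, _, done_after_first_loss, _, _ = env.step(1)
_, info_after_soft_reset = env.reset()
env.step(1)  # last life lost: true game over
_, info_after_hard_reset = env.reset()
```

The learner sees an episode boundary after each lost life, while the underlying game is restarted only once all lives are gone.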
- class FireResetEnv(env: Env)
Bases: Wrapper
Take action on reset for environments that are fixed until firing.
Related discussion: openai/baselines#240.
- Parameters:
env (gym.Env) – the environment to wrap.
Wraps an environment to allow a modular transformation of the step() and reset() methods.
- Args:
env: The environment to wrap
- class WarpFrame(env: Env)
Bases: ObservationWrapper
Warp frames to 84x84 as done in the Nature paper and later work.
- Parameters:
env (gym.Env) – the environment to wrap.
Constructor for the observation wrapper.
- class ScaledFloatFrame(env: Env)
Bases: ObservationWrapper
Normalize observations to the range [0, 1].
- Parameters:
env (gym.Env) – the environment to wrap.
Constructor for the observation wrapper.
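As a rough illustration of the normalization, the sketch below rescales pixel values to floats in [0, 1]. The function `scale_frame` is hypothetical, and the default bounds of 0 and 255 assume 8-bit pixels; the actual wrapper may derive its bounds differently.

```python
def scale_frame(frame, low=0, high=255):
    """Map pixel values from [low, high] to floats in [0, 1]."""
    span = high - low
    return [[(pixel - low) / span for pixel in row] for row in frame]


# A hypothetical 2x2 grayscale frame with 8-bit pixel values.
scaled = scale_frame([[0, 51], [204, 255]])
```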
- class ClipRewardEnv(env: Env)
Bases: RewardWrapper
Clips the reward to {+1, 0, -1} by its sign.
- Parameters:
env (gym.Env) – the environment to wrap.
Constructor for the Reward wrapper.
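Clipping by sign amounts to a one-line transformation. The helper `clip_reward` below is a hypothetical stand-in for the wrapper's reward hook, shown only to make the mapping concrete.

```python
def clip_reward(reward):
    """Return the sign of the reward: +1, 0, or -1."""
    return (reward > 0) - (reward < 0)


clipped = [clip_reward(r) for r in (-7.5, 0.0, 0.3, 400)]
```

Rewards of very different magnitude (here 0.3 and 400) both map to +1, so only the direction of the reward reaches the learner.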
- class FrameStack(env: Env, n_frames: int)
Bases: Wrapper
Stack the last n_frames frames.
- Parameters:
env (gym.Env) – the environment to wrap.
n_frames (int) – the number of frames to stack.
Wraps an environment to allow a modular transformation of the step() and reset() methods.
- Args:
env: The environment to wrap
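Frame stacking is essentially a fixed-length buffer of recent frames. The sketch below is a hypothetical, dependency-free illustration: `FrameStackSketch` is not the library class, frames are represented by placeholder strings, and filling the buffer with copies of the first frame on reset is an assumption.

```python
from collections import deque


class FrameStackSketch:
    """Keep the n_frames most recent frames, oldest first."""

    def __init__(self, n_frames):
        self.frames = deque(maxlen=n_frames)

    def reset(self, first_frame):
        # Assumption: the buffer starts out filled with the first frame.
        for _ in range(self.frames.maxlen):
            self.frames.append(first_frame)
        return list(self.frames)

    def push(self, frame):
        # Appending to a full deque silently drops the oldest frame.
        self.frames.append(frame)
        return list(self.frames)


stack = FrameStackSketch(n_frames=4)
after_reset = stack.reset("f0")
stack.push("f1")
after_two_steps = stack.push("f2")
```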
- wrap_deepmind(env: Env, episode_life: bool = True, clip_rewards: bool = True, frame_stack: int = 4, scale: bool = False, warp_frame: bool = True) → MaxAndSkipEnv | EpisodicLifeEnv | FireResetEnv | WarpFrame | ScaledFloatFrame | ClipRewardEnv | FrameStack
Configure environment for DeepMind-style Atari.
The observation is channel-first: (c, h, w) instead of (h, w, c).
- Parameters:
env – the Atari environment to wrap.
episode_life (bool) – whether to apply the episodic-life wrapper.
clip_rewards (bool) – whether to apply the reward clipping wrapper.
frame_stack (int) – the number of frames to stack with the frame stacking wrapper.
scale (bool) – whether to apply the observation scaling wrapper.
warp_frame (bool) – whether to apply the grayscale + resize observation wrapper.
- Returns:
the wrapped Atari environment.
- make_atari_env(task: str, seed: int, num_training_envs: int, num_test_envs: int, scale: int | bool = False, frame_stack: int = 4) → tuple[Env, BaseVectorEnv, BaseVectorEnv]
Wrapper function for Atari env.
If EnvPool is installed, it will automatically switch to EnvPool’s Atari env.
- Returns:
a tuple of (single env, training envs, test envs).
- class AtariEnvFactory(task: str, frame_stack: int, scale: bool = False, use_envpool_if_available: bool = True, venv_type: VectorEnvType = VectorEnvType.SUBPROC_SHARED_MEM_AUTO)
Bases: EnvFactoryRegistered
- Parameters:
task – the gymnasium task/environment identifier
seed – the random seed
venv_type – the type of vectorized environment to use (if envpool_factory is not specified)
envpool_factory – the factory to use for vectorized environment creation based on envpool; envpool must be installed.
render_mode_training – the render mode to use for training environments
render_mode_test – the render mode to use for test environments
render_mode_watch – the render mode to use for environments that are used to watch agent performance
make_kwargs – additional keyword arguments to pass on to gymnasium.make. If envpool is used, the gymnasium parameters will be appropriately translated for use with envpool.make_gymnasium.
- class EnvPoolFactoryAtari(parent: AtariEnvFactory)
Bases: EnvPoolFactory
Atari-specific envpool creation. Since envpool internally handles the functions that are implemented through the wrappers in wrap_deepmind, it sets the creation keyword arguments accordingly.
- class AtariEpochStopCallback(task: str)
Bases: EpochStopCallback
- should_stop(mean_rewards: float, context: TrainingContext) → bool
Determines whether training should stop.
- Parameters:
mean_rewards – the average undiscounted returns of the testing result
context – the training context
- Returns:
True if the goal has been reached and training should stop, False otherwise