Preprocessors

`keras_gym.wrappers.BoxActionsToReals`	This wrapper decompactifies a `Box` action space to the reals.
`keras_gym.wrappers.ImagePreprocessor`	Preprocessor for images.
`keras_gym.wrappers.FrameStacker`	Stack multiple frames into one state observation.

class keras_gym.wrappers.BoxActionsToReals(env)

This wrapper decompactifies a Box action space to the reals. This is required in order to be able to use a GaussianPolicy.

In practice, the wrapped environment expects the input action \(a_\text{real}\in\mathbb{R}^n\) and then it compactifies it back to a Box of the right size:

\[a_\text{box}\ =\ \text{low} + (\text{high}-\text{low}) \times\text{sigmoid}(a_\text{real})\]

Technically, the transformed space is still a Box, but that’s only because we assume that the values lie between large but finite bounds, \(a_\text{real}\in[-10^{15}, 10^{15}]^n\).
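The compactification formula above can be sketched in a few lines of plain Python. This is only an illustration of the mapping, not the wrapper's actual implementation; the helper names `sigmoid` and `reals_to_box` and the example bounds are assumptions made for this sketch.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def reals_to_box(a_real, low, high):
    """Map an unbounded action onto the box [low, high], element-wise.

    Hypothetical helper that applies
    a_box = low + (high - low) * sigmoid(a_real).
    """
    return [lo + (hi - lo) * sigmoid(a) for a, lo, hi in zip(a_real, low, high)]


# Example bounds of a two-dimensional Box action space (made up for this sketch)
low, high = [-2.0, 0.0], [2.0, 1.0]

centre = reals_to_box([0.0, 0.0], low, high)     # a_real = 0 maps to the midpoint
corner = reals_to_box([-50.0, 50.0], low, high)  # large |a_real| approaches the bounds
print(centre, corner)
```

Because the sigmoid saturates, very large positive or negative inputs land (numerically) on the upper or lower bound of the Box.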

close(self)

Override close in your subclass to perform any necessary cleanup.

Environments will automatically close() themselves when garbage collected or when the program exits.

render(self, mode='human', **kwargs)

Renders the environment.

The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:

human: render to the current display or terminal and return nothing. Usually for human consumption.
rgb_array: Return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).

Note:

Make sure that the ‘render.modes’ key of your class’s metadata lists the supported modes. It’s recommended to call super() in implementations to use the functionality of this method.

Args:

mode (str): the mode to render with

Example:

    class MyEnv(Env):
        metadata = {'render.modes': ['human', 'rgb_array']}

        def render(self, mode='human'):
            if mode == 'rgb_array':
                return np.array(...)  # return RGB frame suitable for video
            elif mode == 'human':
                ...  # pop up a window and render
            else:
                super(MyEnv, self).render(mode=mode)  # just raise an exception

reset(self, **kwargs)

Resets the state of the environment and returns an initial observation.

Returns:

observation (object): the initial observation.

seed(self, seed=None)

Sets the seed for this env’s random number generator(s).

Note:

Some environments use multiple pseudorandom number generators. We want to capture all such seeds used in order to ensure that there aren’t accidental correlations between multiple generators.

Returns:

list<bigint>: the list of seeds used in this env’s random number generators. The first value in the list should be the “main” seed, or the value which a reproducer should pass to ‘seed’. Often, the main seed equals the provided ‘seed’, but this won’t be true if seed=None, for example.
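The return convention described above can be illustrated with a small stand-alone class. This is a sketch under stated assumptions: `TwoGeneratorEnv` and its attributes are hypothetical, it uses Python's standard `random` module rather than whatever seeding utilities a real environment relies on, and deriving the second seed from the first generator is just one possible choice.

```python
import random


class TwoGeneratorEnv:
    """Hypothetical environment that owns two pseudorandom generators."""

    def seed(self, seed=None):
        if seed is None:
            # No seed provided: draw a main seed from the OS entropy source,
            # so the returned main seed differs from the (missing) argument.
            seed = random.SystemRandom().randrange(2**31)
        self.dynamics_rng = random.Random(seed)
        # Derive the second seed from the first generator (an assumption of
        # this sketch) and report it, so that a run can be reproduced.
        noise_seed = self.dynamics_rng.randrange(2**31)
        self.noise_rng = random.Random(noise_seed)
        return [seed, noise_seed]  # the main seed comes first


env = TwoGeneratorEnv()
seeds = env.seed(42)
print(seeds[0])  # the main seed equals the value that was passed in
```

Passing `seeds[0]` back into `seed()` reproduces the same list, which is what makes the first entry the "main" seed.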

step(self, a)

Run one timestep of the environment’s dynamics. When the end of an episode is reached, you are responsible for calling reset() to reset this environment’s state.

Accepts an action and returns a tuple (observation, reward, done, info).

Args:

a (object): an action provided by the agent

Returns:

observation (object): agent’s observation of the current environment
reward (float): amount of reward returned after previous action
done (bool): whether the episode has ended, in which case further step() calls will return undefined results
info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
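The reset/step contract is easiest to see in a complete episode loop. The sketch below uses a hypothetical toy class, `CoinFlipEnv`, that only mimics the interface described here; it is not a gym or keras-gym environment, and its reward rule is made up for illustration.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment mimicking the reset()/step() interface."""

    def __init__(self, max_steps=5, seed=0):
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        self.t = 0
        return 0  # initial observation

    def step(self, a):
        self.t += 1
        coin = self.rng.randint(0, 1)         # new observation
        reward = 1.0 if a == coin else 0.0    # reward for guessing the coin
        done = self.t >= self.max_steps       # episode ends after max_steps
        info = {"t": self.t}                  # auxiliary diagnostics
        return coin, reward, done, info


env = CoinFlipEnv()
obs = env.reset()
done, total_reward, num_steps = False, 0.0, 0
while not done:
    obs, reward, done, info = env.step(1)     # always guess "1"
    total_reward += reward
    num_steps += 1

obs = env.reset()  # the caller is responsible for resetting after done=True
```

Note that the loop stops calling `step()` as soon as `done` is true, since further calls are documented to return undefined results.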

unwrapped

Completely unwrap this env.

Returns:

gym.Env: The base non-wrapped gym.Env instance

class keras_gym.wrappers.ImagePreprocessor(env, height, width, grayscale=True, assert_input_shape=None)

Preprocessor for images.

This preprocessing is adapted from this blog post:

https://becominghuman.ai/lets-build-an-atari-ai-part-1-dqn-df57e8ff3b26

Parameters:

env : gym environment
    A gym environment.
height : positive int
    Output height (number of pixels).
width : positive int
    Output width (number of pixels).
grayscale : bool, optional
    Whether to convert RGB image to grayscale.
assert_input_shape : shape tuple, optional
    If provided, the preprocessor will assert the given input shape.
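To make the role of these parameters concrete, here is a rough pure-Python sketch of such a preprocessing step. It is not the library's implementation, which presumably relies on an image library: the functions `to_grayscale`, `resize_nearest` and `preprocess` are hypothetical, grayscale conversion is assumed to be a plain channel average, and resizing is assumed to be nearest-neighbour.

```python
def to_grayscale(rgb):
    """Average the three colour channels of an H x W x 3 nested list."""
    return [[sum(pixel) / 3.0 for pixel in row] for row in rgb]


def resize_nearest(img, height, width):
    """Nearest-neighbour resize of an H x W nested list to height x width."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[i * in_h // height][j * in_w // width] for j in range(width)]
        for i in range(height)
    ]


def preprocess(rgb, height, width, grayscale=True, assert_input_shape=None):
    """Hypothetical stand-in mirroring the constructor arguments above."""
    shape = (len(rgb), len(rgb[0]), len(rgb[0][0]))
    if assert_input_shape is not None:
        # Fail early if the raw frame does not have the expected shape
        assert shape == tuple(assert_input_shape), "unexpected input shape"
    img = to_grayscale(rgb) if grayscale else rgb
    return resize_nearest(img, height, width)


# A 4 x 6 RGB frame whose brightness grows with the row index
frame = [[[r * 10, r * 10, r * 10] for _ in range(6)] for r in range(4)]
small = preprocess(frame, height=2, width=3, assert_input_shape=(4, 6, 3))
print(small)
```

The output has `height` rows and `width` columns, and a single channel when `grayscale` is left at its default.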

close(self)

Override close in your subclass to perform any necessary cleanup.

Environments will automatically close() themselves when garbage collected or when the program exits.

render(self, mode='human', **kwargs)

Renders the environment.

The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:

human: render to the current display or terminal and return nothing. Usually for human consumption.
rgb_array: Return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).

Note:

Make sure that the ‘render.modes’ key of your class’s metadata lists the supported modes. It’s recommended to call super() in implementations to use the functionality of this method.

Args:

mode (str): the mode to render with

Example:

    class MyEnv(Env):
        metadata = {'render.modes': ['human', 'rgb_array']}

        def render(self, mode='human'):
            if mode == 'rgb_array':
                return np.array(...)  # return RGB frame suitable for video
            elif mode == 'human':
                ...  # pop up a window and render
            else:
                super(MyEnv, self).render(mode=mode)  # just raise an exception

reset(self)

Resets the state of the environment and returns an initial observation.

Returns:

observation (object): the initial observation.

seed(self, seed=None)

Sets the seed for this env’s random number generator(s).

Note:

Some environments use multiple pseudorandom number generators. We want to capture all such seeds used in order to ensure that there aren’t accidental correlations between multiple generators.

Returns:

list<bigint>: the list of seeds used in this env’s random number generators. The first value in the list should be the “main” seed, or the value which a reproducer should pass to ‘seed’. Often, the main seed equals the provided ‘seed’, but this won’t be true if seed=None, for example.

step(self, a)

Run one timestep of the environment’s dynamics. When the end of an episode is reached, you are responsible for calling reset() to reset this environment’s state.

Accepts an action and returns a tuple (observation, reward, done, info).

Args:

a (object): an action provided by the agent

Returns:

observation (object): agent’s observation of the current environment
reward (float): amount of reward returned after previous action
done (bool): whether the episode has ended, in which case further step() calls will return undefined results
info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)

unwrapped

Completely unwrap this env.

Returns:

gym.Env: The base non-wrapped gym.Env instance

class keras_gym.wrappers.FrameStacker(env, num_frames=4)

Stack multiple frames into one state observation.

Parameters:

env : gym environment
    A gym environment.
num_frames : positive int, optional
    Number of frames to stack in order to build a state feature vector.

close(self)

Override close in your subclass to perform any necessary cleanup.

Environments will automatically close() themselves when garbage collected or when the program exits.

render(self, mode='human', **kwargs)

Renders the environment.

The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:

human: render to the current display or terminal and return nothing. Usually for human consumption.
rgb_array: Return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).

Note:

Make sure that the ‘render.modes’ key of your class’s metadata lists the supported modes. It’s recommended to call super() in implementations to use the functionality of this method.

Args:

mode (str): the mode to render with

Example:

    class MyEnv(Env):
        metadata = {'render.modes': ['human', 'rgb_array']}

        def render(self, mode='human'):
            if mode == 'rgb_array':
                return np.array(...)  # return RGB frame suitable for video
            elif mode == 'human':
                ...  # pop up a window and render
            else:
                super(MyEnv, self).render(mode=mode)  # just raise an exception

reset(self)

Resets the state of the environment and returns an initial observation.

Returns:

observation (object): the initial observation.

seed(self, seed=None)

Sets the seed for this env’s random number generator(s).

Note:

Some environments use multiple pseudorandom number generators. We want to capture all such seeds used in order to ensure that there aren’t accidental correlations between multiple generators.

Returns:

list<bigint>: the list of seeds used in this env’s random number generators. The first value in the list should be the “main” seed, or the value which a reproducer should pass to ‘seed’. Often, the main seed equals the provided ‘seed’, but this won’t be true if seed=None, for example.

step(self, a)

Run one timestep of the environment’s dynamics. When the end of an episode is reached, you are responsible for calling reset() to reset this environment’s state.

Accepts an action and returns a tuple (observation, reward, done, info).

Args:

a (object): an action provided by the agent

Returns:

observation (object): agent’s observation of the current environment
reward (float): amount of reward returned after previous action
done (bool): whether the episode has ended, in which case further step() calls will return undefined results
info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)

unwrapped

Completely unwrap this env.

Returns:

gym.Env: The base non-wrapped gym.Env instance