Preprocessors
keras_gym.wrappers.BoxActionsToReals | This wrapper decompactifies a Box action space to the reals.
keras_gym.wrappers.ImagePreprocessor | Preprocessor for images.
keras_gym.wrappers.FrameStacker | Stack multiple frames into one state observation.
class keras_gym.wrappers.BoxActionsToReals(env)
This wrapper decompactifies a Box action space to the reals. This is required in order to be able to use a GaussianPolicy.
In practice, the wrapped environment expects the input action \(a_\text{real}\in\mathbb{R}^n\) and then compactifies it back to a Box of the right size:
\[a_\text{box}\ =\ \text{low} + (\text{high}-\text{low}) \times\text{sigmoid}(a_\text{real})\]
Technically, the transformed space is still a Box, but that’s only because we assume that the values lie between large but finite bounds, \(a_\text{real}\in[-10^{15}, 10^{15}]^n\).
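The compactification formula above can be sketched in a few lines of plain Python. This is an illustration of the formula only, not the wrapper's actual implementation; the helper names `sigmoid` and `compactify` and the example bounds are assumptions made for this sketch.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function for a single float."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # x < 0, so exp(x) cannot overflow
    return z / (1.0 + z)


def compactify(a_real, low, high):
    """Apply a_box = low + (high - low) * sigmoid(a_real) per component."""
    return [lo + (hi - lo) * sigmoid(a) for a, lo, hi in zip(a_real, low, high)]


# Hypothetical 2-dimensional Box with bounds [-2, 2] x [0, 10].
low, high = [-2.0, 0.0], [2.0, 10.0]
a_box = compactify([0.0, 50.0], low, high)
```

An unbounded action of 0 lands at the midpoint of its interval, while large positive or negative values approach the upper or lower bound, so every real-valued action maps to a valid Box action.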
close(self)
Override close in your subclass to perform any necessary cleanup.
Environments will automatically close() themselves when garbage collected or when the program exits.
render(self, mode='human', **kwargs)
Renders the environment.
The set of supported modes varies per environment, and some environments do not support rendering at all. By convention, if mode is:
- human: Render to the current display or terminal and return nothing. Usually for human consumption.
- rgb_array: Return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
- ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).
- Note:
  - Make sure that your class’s metadata ‘render.modes’ key includes the list of supported modes. It’s recommended to call super() in implementations to use the functionality of this method.
- Args:
  - mode (str): the mode to render with
Example:
    class MyEnv(Env):
        metadata = {'render.modes': ['human', 'rgb_array']}

        def render(self, mode='human'):
            if mode == 'rgb_array':
                return np.array(...)  # return RGB frame suitable for video
            elif mode == 'human':
                ...  # pop up a window and render
            else:
                super(MyEnv, self).render(mode=mode)  # just raise an exception
reset(self, **kwargs)
Resets the state of the environment and returns an initial observation.
- Returns:
  - observation (object): the initial observation.
seed(self, seed=None)
Sets the seed for this env’s random number generator(s).
- Note:
  - Some environments use multiple pseudorandom number generators. We want to capture all such seeds used in order to ensure that there aren’t accidental correlations between multiple generators.
- Returns:
  - list<bigint>: Returns the list of seeds used in this env’s random number generators. The first value in the list should be the “main” seed, or the value which a reproducer should pass to ‘seed’. Often, the main seed equals the provided ‘seed’, but this won’t be true if seed=None, for example.
step(self, a)
Run one timestep of the environment’s dynamics. When the end of an episode is reached, you are responsible for calling reset() to reset this environment’s state.
Accepts an action and returns a tuple (observation, reward, done, info).
- Args:
  - a (object): an action provided by the agent
- Returns:
  - observation (object): agent’s observation of the current environment
  - reward (float): amount of reward returned after previous action
  - done (bool): whether the episode has ended, in which case further step() calls will return undefined results
  - info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
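The step/reset contract described above can be illustrated with a toy environment that does not depend on gym. `CountdownEnv` and `run_episodes` are hypothetical names invented for this sketch; only the (observation, reward, done, info) tuple and the rule of calling reset() after done come from the description above.

```python
class CountdownEnv:
    """Hypothetical toy environment: the episode ends after `horizon` steps."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t  # initial observation

    def step(self, a):
        self.t += 1
        reward = 1.0 if a == 1 else 0.0
        done = self.t >= self.horizon
        info = {"t": self.t}  # auxiliary diagnostics
        return self.t, reward, done, info


def run_episodes(env, num_episodes):
    """Roll out episodes, calling reset() before each new episode."""
    returns = []
    for _ in range(num_episodes):
        obs = env.reset()
        done, total = False, 0.0
        while not done:
            obs, reward, done, info = env.step(1)  # fixed action, for illustration
            total += reward
        returns.append(total)
    return returns


episode_returns = run_episodes(CountdownEnv(horizon=3), num_episodes=2)
```

The inner loop stops as soon as done is True, so step() is never called on a finished episode, which is the case the description marks as undefined.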
unwrapped
Completely unwrap this env.
- Returns:
  - gym.Env: The base non-wrapped gym.Env instance
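When several wrappers are nested, unwrapped is expected to return the innermost environment. A minimal sketch of that delegation pattern follows, using hypothetical stand-in classes `BaseEnv` and `Wrapper` rather than the real gym classes.

```python
class BaseEnv:
    """Hypothetical innermost environment: it is its own unwrapped env."""

    @property
    def unwrapped(self):
        return self


class Wrapper:
    """Hypothetical wrapper that delegates `unwrapped` to the wrapped env."""

    def __init__(self, env):
        self.env = env

    @property
    def unwrapped(self):
        return self.env.unwrapped


base = BaseEnv()
stacked = Wrapper(Wrapper(base))  # two wrapper layers around the base env
```

Here `stacked.unwrapped` resolves to `base` regardless of how many wrapper layers sit in between.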
class keras_gym.wrappers.ImagePreprocessor(env, height, width, grayscale=True, assert_input_shape=None)
Preprocessor for images.
This preprocessing is adapted from this blog post:
Parameters:
- env : gym environment
  A gym environment.
- height : positive int
  Output height (number of pixels).
- width : positive int
  Output width (number of pixels).
- grayscale : bool, optional
  Whether to convert the RGB image to grayscale.
- assert_input_shape : shape tuple, optional
  If provided, the preprocessor asserts that each input frame has the given shape.
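To make the roles of these parameters concrete, here is a rough NumPy sketch of this kind of preprocessing. It is not the library's implementation: the function name `preprocess`, the channel-mean grayscale conversion, and the nearest-neighbour resize are all assumptions chosen to keep the example short.

```python
import numpy as np


def preprocess(frame, height, width, grayscale=True, assert_input_shape=None):
    """Illustrative only: grayscale by channel mean, nearest-neighbour resize."""
    if assert_input_shape is not None:
        assert frame.shape == tuple(assert_input_shape), "unexpected input shape"
    if grayscale:
        frame = frame.mean(axis=-1)  # (H, W, 3) -> (H, W)
    # Pick evenly spaced source rows and columns for the output grid.
    rows = np.arange(height) * frame.shape[0] // height
    cols = np.arange(width) * frame.shape[1] // width
    return frame[rows][:, cols].astype(np.uint8)


# Hypothetical Atari-sized RGB frame, downsampled by a factor of two.
raw = np.zeros((210, 160, 3), dtype=np.uint8)
small = preprocess(raw, height=105, width=80, assert_input_shape=(210, 160, 3))
```

With grayscale=True the output drops the channel axis and has shape (height, width); with grayscale=False the three colour channels are kept.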
close(self)
Override close in your subclass to perform any necessary cleanup.
Environments will automatically close() themselves when garbage collected or when the program exits.
render(self, mode='human', **kwargs)
Renders the environment.
The set of supported modes varies per environment, and some environments do not support rendering at all. By convention, if mode is:
- human: Render to the current display or terminal and return nothing. Usually for human consumption.
- rgb_array: Return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
- ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).
- Note:
  - Make sure that your class’s metadata ‘render.modes’ key includes the list of supported modes. It’s recommended to call super() in implementations to use the functionality of this method.
- Args:
  - mode (str): the mode to render with
Example:
    class MyEnv(Env):
        metadata = {'render.modes': ['human', 'rgb_array']}

        def render(self, mode='human'):
            if mode == 'rgb_array':
                return np.array(...)  # return RGB frame suitable for video
            elif mode == 'human':
                ...  # pop up a window and render
            else:
                super(MyEnv, self).render(mode=mode)  # just raise an exception
reset(self)
Resets the state of the environment and returns an initial observation.
- Returns:
  - observation (object): the initial observation.
seed(self, seed=None)
Sets the seed for this env’s random number generator(s).
- Note:
  - Some environments use multiple pseudorandom number generators. We want to capture all such seeds used in order to ensure that there aren’t accidental correlations between multiple generators.
- Returns:
  - list<bigint>: Returns the list of seeds used in this env’s random number generators. The first value in the list should be the “main” seed, or the value which a reproducer should pass to ‘seed’. Often, the main seed equals the provided ‘seed’, but this won’t be true if seed=None, for example.
step(self, a)
Run one timestep of the environment’s dynamics. When the end of an episode is reached, you are responsible for calling reset() to reset this environment’s state.
Accepts an action and returns a tuple (observation, reward, done, info).
- Args:
  - a (object): an action provided by the agent
- Returns:
  - observation (object): agent’s observation of the current environment
  - reward (float): amount of reward returned after previous action
  - done (bool): whether the episode has ended, in which case further step() calls will return undefined results
  - info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
unwrapped
Completely unwrap this env.
- Returns:
  - gym.Env: The base non-wrapped gym.Env instance
class keras_gym.wrappers.FrameStacker(env, num_frames=4)
Stack multiple frames into one state observation.
Parameters:
- env : gym environment
  A gym environment.
- num_frames : positive int, optional
  Number of frames to stack in order to build a state feature vector.
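The idea of stacking the most recent frames can be sketched with a fixed-length deque. `FrameBuffer` is a hypothetical helper, not part of keras_gym, and filling the buffer with copies of the first frame on reset is an assumption of this sketch rather than documented behaviour.

```python
from collections import deque


class FrameBuffer:
    """Hypothetical helper that keeps the last `num_frames` observations."""

    def __init__(self, num_frames=4):
        self.frames = deque(maxlen=num_frames)

    def reset(self, first_frame):
        # Assumption: start each episode with copies of the first frame.
        self.frames.clear()
        self.frames.extend([first_frame] * self.frames.maxlen)
        return tuple(self.frames)

    def push(self, frame):
        self.frames.append(frame)  # the oldest frame drops out automatically
        return tuple(self.frames)


buf = FrameBuffer(num_frames=4)
state0 = buf.reset("f0")  # strings stand in for image frames
state1 = buf.push("f1")
```

Each stacked state always contains exactly num_frames entries, ordered from oldest to newest, which lets a policy infer motion from otherwise static frames.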
close(self)
Override close in your subclass to perform any necessary cleanup.
Environments will automatically close() themselves when garbage collected or when the program exits.
render(self, mode='human', **kwargs)
Renders the environment.
The set of supported modes varies per environment, and some environments do not support rendering at all. By convention, if mode is:
- human: Render to the current display or terminal and return nothing. Usually for human consumption.
- rgb_array: Return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
- ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).
- Note:
  - Make sure that your class’s metadata ‘render.modes’ key includes the list of supported modes. It’s recommended to call super() in implementations to use the functionality of this method.
- Args:
  - mode (str): the mode to render with
Example:
    class MyEnv(Env):
        metadata = {'render.modes': ['human', 'rgb_array']}

        def render(self, mode='human'):
            if mode == 'rgb_array':
                return np.array(...)  # return RGB frame suitable for video
            elif mode == 'human':
                ...  # pop up a window and render
            else:
                super(MyEnv, self).render(mode=mode)  # just raise an exception
reset(self)
Resets the state of the environment and returns an initial observation.
- Returns:
  - observation (object): the initial observation.
seed(self, seed=None)
Sets the seed for this env’s random number generator(s).
- Note:
  - Some environments use multiple pseudorandom number generators. We want to capture all such seeds used in order to ensure that there aren’t accidental correlations between multiple generators.
- Returns:
  - list<bigint>: Returns the list of seeds used in this env’s random number generators. The first value in the list should be the “main” seed, or the value which a reproducer should pass to ‘seed’. Often, the main seed equals the provided ‘seed’, but this won’t be true if seed=None, for example.
step(self, a)
Run one timestep of the environment’s dynamics. When the end of an episode is reached, you are responsible for calling reset() to reset this environment’s state.
Accepts an action and returns a tuple (observation, reward, done, info).
- Args:
  - a (object): an action provided by the agent
- Returns:
  - observation (object): agent’s observation of the current environment
  - reward (float): amount of reward returned after previous action
  - done (bool): whether the episode has ended, in which case further step() calls will return undefined results
  - info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
unwrapped
Completely unwrap this env.
- Returns:
  - gym.Env: The base non-wrapped gym.Env instance