NS-Gym Base Module¶
- class ns_gym.base.Reward(reward, env_change, delta_change, relative_time)[source]¶
Bases: object
Reward dataclass type. This is the output of the step function in the environment.
- reward: Union[int, float]¶
The reward received from the environment.
- env_change: dict[str, bool]¶
A dictionary of boolean flags indicating which parameter of the environment has changed.
- delta_change: Optional[float]¶
The change in the reward function of the environment.
- relative_time: Union[int, float]¶
The relative time of the observation since the start of the environment episode.
- class ns_gym.base.Scheduler(start=0, end=inf)[source]¶
Bases: ABC
Base class for scheduler functions. This class is used to determine when to update a parameter in the environment. Start and end times are inclusive.
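A scheduler simply answers whether a parameter update should occur at a given time step. A minimal sketch of the idea (the class and method names here are illustrative, not ns_gym's actual API), assuming inclusive start and end times:

```python
class IntervalScheduler:
    """Illustrative scheduler: signal an update at every time step
    in [start, end], with both endpoints inclusive."""

    def __init__(self, start=0, end=float("inf")):
        self.start = start
        self.end = end

    def __call__(self, t):
        # True means "update the parameter at time t".
        return self.start <= t <= self.end
```

For example, `IntervalScheduler(2, 5)` would signal an update at t = 2, 3, 4, and 5, and nowhere else.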
- class ns_gym.base.UpdateFn(scheduler)[source]¶
Bases: ABC
Base class for update functions that update a single scalar parameter.
- Overview:
Instances of this class (and all subclasses) are callable and should be used to apply an update to a parameter. When an instance is called it executes the update logic defined in the subclass’s _update method. The __call__ method checks with the provided Scheduler to determine if an update should occur at the current time step. If an update is warranted, it invokes the _update method to modify the parameter and calculates the change in value.
- Parameters:
scheduler (Scheduler) – Scheduler object that determines when to update the parameter.
- prev_param¶
The previous parameter value
- prev_time¶
The previous time the parameter was updated
- __call__(param, t)[source]¶
Update the parameter if the scheduler returns True
- Parameters:
param (Any) – The parameter to be updated
t (Union[int,float]) – The current time step
- Returns:
Any: The updated parameter
int: Binary flag indicating whether the parameter was updated, 1 means updated, 0 means not updated
float: The amount of change in the parameter
- Return type:
tuple[Any, int, float]
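The call contract above — return the (possibly updated) parameter, a binary update flag, and the amount of change — can be sketched as follows. This is an illustrative stand-in, not ns_gym's actual implementation; the fixed-increment `_update` is a hypothetical example of the logic a subclass would supply:

```python
class IncrementUpdateFn:
    """Illustrative update function mirroring the UpdateFn call contract:
    calling it returns (updated_param, update_flag, delta)."""

    def __init__(self, scheduler, step_size=0.5):
        self.scheduler = scheduler    # callable: scheduler(t) -> bool
        self.step_size = step_size
        self.prev_param = None
        self.prev_time = None

    def _update(self, param, t):
        # Update logic a subclass would define; here, a fixed increment.
        return param + self.step_size

    def __call__(self, param, t):
        if not self.scheduler(t):
            return param, 0, 0.0      # scheduler says no update at time t
        new_param = self._update(param, t)
        self.prev_param, self.prev_time = param, t
        return new_param, 1, new_param - param
```

With `fn = IncrementUpdateFn(lambda t: t >= 3)`, calling `fn(1.0, 0)` returns `(1.0, 0, 0.0)` (no update yet), while `fn(1.0, 3)` returns `(1.5, 1, 0.5)`.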
- class ns_gym.base.UpdateDistributionFn(scheduler)[source]¶
Bases: UpdateFn
Base class for all update functions that update a distribution represented as a list.
- class ns_gym.base.NSWrapper(env, tunable_params, change_notification=False, delta_change_notification=False, in_sim_change=False, **kwargs)[source]¶
Bases: Wrapper
Base class for non-stationary wrappers.
- Parameters:
env (Env) – Gym environment
tunable_params (dict[str, Union[Type[UpdateFn], Type[UpdateDistributionFn]]]) – Dictionary of parameter names and their associated update functions.
change_notification (bool) – Sets a basic notification level. Returns a boolean flag to indicate whether to notify the agent of changes in the environment. Defaults to False.
delta_change_notification (bool) – Sets the detailed notification level. Flag to indicate whether to notify the agent of changes in the transition function. Defaults to False.
in_sim_change (bool) – Flag to indicate whether to allow changes in the environment during simulation (e.g MCTS rollouts). Defaults to False.
- frozen¶
Flag to indicate whether the environment is frozen or not.
- Type:
bool
- is_sim_env¶
Flag to indicate whether the environment is a simulation environment or not.
- Type:
bool
- step(action, env_change, delta_change)[source]¶
Step function for the environment. Augments observations and rewards with additional information about changes in the environment and transition function.
Subclasses of this class will handle the actual environment dynamics and updating of parameters. This base class handles the notification mechanism that emulates the run-time monitor and model updater components of the decision-making infrastructure. The subclass must call this function via super().step(action, env_change, delta_change).
- Parameters:
action (int) – Action taken by the agent
env_change (dict[str,bool]) – Environment change flags. Keys are parameter names and values are boolean flags indicating whether the parameter has changed.
delta_change (dict[str, float]) – The amount of change each parameter has undergone. Keys are parameter names and values are the amount of change.
- Returns:
observation, reward, termination flag, truncation flag, and additional information.
- Return type:
tuple[observation, Type[Reward], bool, bool, dict[str, Any]]
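The required `super().step(action, env_change, delta_change)` pattern can be sketched with self-contained stand-in classes (hypothetical, not ns_gym's real wrappers): the subclass advances the dynamics and computes the change flags, then delegates the notification bookkeeping to the base class.

```python
class BaseNSWrapper:
    """Stand-in for the base class: augments step outputs with change info."""

    def step(self, action, env_change, delta_change):
        obs, reward = action * 2, 1.0               # placeholder dynamics
        # The real base class would package reward, env_change, and
        # delta_change into a Reward object; a plain tuple stands in here.
        return obs, (reward, env_change, delta_change), False, False, {}


class MyNSWrapper(BaseNSWrapper):
    def step(self, action):
        # Subclass applies its parameter updates, then must delegate:
        env_change = {"gravity": True}               # hypothetical parameter
        delta_change = {"gravity": 0.5}
        return super().step(action, env_change, delta_change)
```

The split keeps the notification mechanism (the run-time monitor and model updater described above) in one place while each environment-specific subclass supplies only its own dynamics.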
- reset(*, seed=None, options=None)[source]¶
Reset function for the environment. Resets the environment to its initial state and resets the time step counter.
- Parameters:
seed (int | None) – Seed for the environment. Defaults to None.
options (dict[str, Any] | None) – Additional options for the environment. Defaults to None.
- Returns:
observation and additional information.
- Return type:
tuple[Any, dict[str, Any]]
- freeze(mode=True)[source]¶
“Freezes” the current MDP so that the environment dynamics do not change.
- Parameters:
mode (bool) – Boolean flag indicating whether to freeze the environment or not. Defaults to True.
- unfreeze()[source]¶
Unfreeze the environment dynamics for simulation.
This function “unfreezes” the current MDP so that the environment dynamics can change.
- __deepcopy__(memo)[source]¶
Keeps track of deepcopying for the environment.
If a deep copy of this environment is made, a flag is set to indicate that the copy is the simulation environment.
This is the intended behavior for the deepcopy function.

env = gym.make("FrozenLake-v1")
env = NSFrozenLakeWrapper(env, updatefn, is_slippery=False)
sim_env = deepcopy(env)

Then sim_env.is_sim_env will be set to True. Subclasses must implement this method.
- get_planning_env()[source]¶
Get the planning environment.
Returns a copy of the current environment in its current state but the “transition function” is set to the initial transition function. Subclasses must implement this method.
- class ns_gym.base.StableBaselineWrapper(model)[source]¶
Bases: object
Interface between Stable-Baselines3 models and NS-Gym environments. Makes it possible to call Stable-Baselines functions as you would other NS-Gym agents.
- class ns_gym.base.Evaluator(*args, **kwargs)[source]¶
Bases: ABC
Evaluator base class. This class is used to evaluate the difficulty of a transition between two environments.
- ns_gym.base.SUPPORTED_GRID_WORLD_ENV_IDS = ['CliffWalking-v1', 'FrozenLake-v1']¶
List of supported grid-world environment IDs.