AbstractEnvironment

Overview

What follows is the AbstractEnvironment interface in its entirety. For users implementing new environments, only a subset of the methods discussed below is required; the remaining methods are built on top of that subset and should not be implemented directly. Some of the required methods have default implementations. A minimal implementation sketch follows the list of required methods below.

Required Methods

  • State

    • statespace(env)
    • getstate!(state, env)
    • setstate!(env, state)
  • Observation

    • obsspace(env)
    • getobs!(obs, env)
  • Action

    • actionspace(env)
    • getaction!(action, env)
    • setaction!(env, action)
  • Reward

    • rewardspace(env)
    • getreward(env)
  • Evaluation

    • evalspace(env)
    • geteval(env)
  • Simulation

    • reset!(env)
    • randreset!(env)
    • step!(env)
    • isdone(env)
    • timestep(env)
    • Base.time(env)
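
The sketch below shows what implementing this subset might look like for a hypothetical PointMass environment: a point mass on a line driven by a scalar force. It is a minimal sketch, not a definitive implementation; the Shapes constructors (VectorShape, ScalarShape) and all environment fields here are assumptions made for illustration, so consult Shapes.jl for the concrete shape types.

using LyceumBase, Shapes, Random

# Hypothetical environment: a point mass on a line, driven by a scalar force.
mutable struct PointMass <: AbstractEnvironment
    pos::Float64
    vel::Float64
    ctrl::Float64
    t::Float64
end
PointMass() = PointMass(1.0, 0.0, 0.0, 0.0)

# State: (pos, vel, t). VectorShape(Float64, 3) is an assumed constructor.
LyceumBase.statespace(::PointMass) = VectorShape(Float64, 3)
LyceumBase.getstate!(state, env::PointMass) = (state .= (env.pos, env.vel, env.t); state)
LyceumBase.setstate!(env::PointMass, state) = ((env.pos, env.vel, env.t) = state; env)

# Observation: (pos, vel).
LyceumBase.obsspace(::PointMass) = VectorShape(Float64, 2)
LyceumBase.getobs!(obs, env::PointMass) = (obs .= (env.pos, env.vel); obs)

# Action: a single scalar force.
LyceumBase.actionspace(::PointMass) = VectorShape(Float64, 1)
LyceumBase.getaction!(action, env::PointMass) = (action[1] = env.ctrl; action)
LyceumBase.setaction!(env::PointMass, action) = (env.ctrl = action[1]; env)

# rewardspace/evalspace default to ScalarShape{Float64}(), so only the
# four-argument getreward needs a definition (geteval falls back to it).
LyceumBase.getreward(state, action, obs, ::PointMass) = -abs(obs[1])

LyceumBase.reset!(env::PointMass) = (env.pos = 1.0; env.vel = env.ctrl = env.t = 0.0; env)
function LyceumBase.randreset!(rng::Random.AbstractRNG, env::PointMass)
    reset!(env)
    env.pos = randn(rng)
    env
end

function LyceumBase.step!(env::PointMass)
    env.vel += env.ctrl * timestep(env)
    env.pos += env.vel * timestep(env)
    env.t += timestep(env)
    env
end
LyceumBase.timestep(::PointMass) = 0.01
Base.time(env::PointMass) = env.t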

API

State

LyceumBase.statespace - Function
statespace(env::AbstractEnvironment) --> Shapes.AbstractShape

Returns a subtype of Shapes.AbstractShape describing the state space of env.

See also: getstate!, setstate!, getstate.

statespace(sim::MJSim) -> Any

Return a description of sim's state space.

LyceumBase.getstate! - Function
getstate!(state, env::AbstractEnvironment)

Store the current state of env in state, where state conforms to the state space returned by statespace(env).

See also: statespace, setstate!, getstate.

getstate!(state, sim)

Copy the following state fields from sim.d into state:

(time, qpos, qvel, act, mocap_pos, mocap_quat, userdata, qacc_warmstart)

LyceumBase.setstate! - Function
setstate!(env::AbstractEnvironment, state)

Set the state of env to state, where state conforms to the state space returned by statespace(env).

See also: statespace, getstate!, getstate.

Note

Implementers of custom AbstractEnvironment subtypes must guarantee that calls to other "getter" functions (e.g. getreward) after a call to setstate! reflect the new, passed-in state.

setstate!(sim, state)

Copy the components of state to their respective fields in sim.d, namely:

(time, qpos, qvel, act, mocap_pos, mocap_quat, userdata, qacc_warmstart)

LyceumBase.getstate - Function
getstate(env::AbstractEnvironment)

Get the current state of env. The returned value will be an object conforming to the state space returned by statespace(env).

See also: statespace, getstate!, setstate!.

Note

Implementers of custom AbstractEnvironment subtypes should implement statespace and getstate!, which are used internally by getstate.

getstate(sim)

Return a flattened vector of the following state fields from sim.d:

(time, qpos, qvel, act, mocap_pos, mocap_quat, userdata, qacc_warmstart)
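
As a usage sketch (for any env::AbstractEnvironment), getstate and setstate! can be paired to snapshot and restore a simulation:

s = getstate(env)    # allocate a state conforming to statespace(env) and fill it
step!(env)
setstate!(env, s)    # subsequent getters now reflect the restored state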

Observation

LyceumBase.obsspace - Function
obsspace(env::AbstractEnvironment) --> Shapes.AbstractShape

Returns a subtype of Shapes.AbstractShape describing the observation space of env.

See also: getobs!, getobs.

obsspace(sim::MJSim) -> Any

Return a description of sim's observation space.

LyceumBase.getobs! - Function
getobs!(obs, env::AbstractEnvironment)

Store the current observation of env in obs, where obs conforms to the observation space returned by obsspace(env).

See also: obsspace, getobs.

getobs!(obs, sim)

Copy sim.d.sensordata into obs.

LyceumBase.getobs - Function
getobs(env::AbstractEnvironment)

Get the current observation of env. The returned value will be an object conforming to the observation space returned by obsspace(env).

See also: obsspace, getobs!.

Note

Implementers of custom AbstractEnvironment subtypes should implement obsspace and getobs!, which are used internally by getobs.

getobs(sim::MJSim) -> Any

Return a copy of sim.d.sensordata.
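
In tight loops, getobs! avoids a fresh allocation on every step. A sketch, assuming allocate(obsspace(env)) yields a buffer conforming to the observation space (any such buffer works):

obs = allocate(obsspace(env))   # assumed helper; any buffer conforming to obsspace(env) works
for t in 1:100
    step!(env)
    getobs!(obs, env)           # overwrite the same buffer each step
end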

Action

LyceumBase.actionspace - Function
actionspace(env::AbstractEnvironment) --> Shapes.AbstractShape

Returns a subtype of Shapes.AbstractShape describing the action space of env.

See also: getaction!, setaction!, getaction.

actionspace(sim::MJSim) -> Any

Return a description of sim's action space.

LyceumBase.getaction! - Function
getaction!(action, env::AbstractEnvironment)

Store the current action of env in action, where action conforms to the action space returned by actionspace(env).

See also: actionspace, setaction!, getaction.

getaction!(action, policy, feature)

Treating policy as deterministic, compute its mean action conditioned on feature and store the result in action.

getaction!(action, state, m; nthreads)

Starting from state, perform one step of the MPPI controller m and store the resulting action in action. The trajectory-sampling portion of MPPI is performed in parallel using nthreads threads.

getaction!(action::AbstractVector{<:Real}, sim::MJSim) -> Any

Copy sim.d.ctrl into action.

LyceumBase.setaction! - Function
setaction!(env::AbstractEnvironment, action)

Set the action of env to action, where action conforms to the action space returned by actionspace(env).

See also: actionspace, getaction!, getaction.

Note

Implementers of custom AbstractEnvironment subtypes must guarantee that calls to other "getter" functions (e.g. getreward) after a call to setaction! reflect the new, passed-in action.

setaction!(sim::MJSim, action::AbstractVector{<:Real}) -> MJSim

Copy action into sim.d.ctrl and compute the new forward dynamics.

LyceumBase.getaction - Function
getaction(env::AbstractEnvironment)

Get the current action of env. The returned value will be an object conforming to the action space returned by actionspace(env).

See also: actionspace, getaction!, setaction!.

Note

Implementers of custom AbstractEnvironment subtypes should implement actionspace and getaction!, which are used internally by getaction.

getaction(sim::MJSim) -> Any

Return a copy of sim.d.ctrl.
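
Putting the action methods together, a sketch of a basic control loop (policy is a hypothetical function mapping observations to actions, and allocate is assumed as above):

obs = allocate(obsspace(env))
a = allocate(actionspace(env))
for t in 1:100
    getobs!(obs, env)
    a .= policy(obs)       # hypothetical policy
    setaction!(env, a)     # getters called after this reflect the new action
    step!(env)
    r = getreward(env)
end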

Reward

LyceumBase.rewardspace - Function
rewardspace(env::AbstractEnvironment) --> Shapes.AbstractShape

Returns a subtype of Shapes.AbstractShape describing the reward space of env. Defaults to Shapes.ScalarShape{Float64}().

See also: getreward.

Note

Currently, only scalar spaces are supported (e.g. Shapes.ScalarShape).

LyceumBase.getreward - Function
getreward(state, action, observation, env::AbstractEnvironment)

Get the current reward of env as a function of state, action, and observation. The returned value will be an object conforming to the reward space returned by rewardspace(env).

See also: rewardspace.

Note

Currently, only scalar rewards are supported, so there is no in-place getreward!.

Note

Implementers of custom AbstractEnvironment subtypes should be careful to ensure that the result of getreward is purely a function of state/action/observation and not any internal, dynamic state contained in env.

getreward(env::AbstractEnvironment)

Get the current reward of env.

Internally calls getreward(getstate(env), getaction(env), getobs(env), env).

See also: rewardspace.

Note

Implementers of custom AbstractEnvironment subtypes should implement getreward(state, action, observation, env).
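
For example, a sketch for the hypothetical FooEnv used later on this page, where the reward is purely a function of the passed-in arguments (the meaning of obs[1] is an assumption made for illustration):

# Penalize the distance of obs[1] from the origin plus a small control cost;
# no internal fields of env are read.
LyceumBase.getreward(state, action, obs, env::FooEnv) = -abs(obs[1]) - 0.1 * sum(abs2, action)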

Evaluation

LyceumBase.evalspace - Function
evalspace(env::AbstractEnvironment) --> Shapes.AbstractShape

Returns a subtype of Shapes.AbstractShape describing the evaluation space of env. Defaults to Shapes.ScalarShape{Float64}().

See also: geteval.

Note

Currently, only scalar evaluation spaces are supported (e.g. Shapes.ScalarShape).

LyceumBase.geteval - Function
geteval(state, action, observation, env::AbstractEnvironment)

Get the current evaluation metric of env as a function of state, action, and observation. The returned value will be an object conforming to the evaluation space returned by evalspace(env).

Oftentimes, reward functions are heavily "shaped" and hard to interpret. For example, the reward function for bipedal walking may include root pose, ZMP terms, control costs, etc., while success can be evaluated simply by the distance the root has traveled along an axis. The evaluation metric serves to fill this gap.

The default behavior is to return getreward(state, action, observation, env).

See also: evalspace.

Note

Currently, only scalar evaluation metrics are supported, so there is no in-place geteval!.

Note

Implementers of custom AbstractEnvironment subtypes should be careful to ensure that the result of geteval is purely a function of state/action/observation and not any internal, dynamic state contained in env.

geteval(env::AbstractEnvironment)

Get the current evaluation metric of env.

Internally calls geteval(getstate(env), getaction(env), getobs(env), env).

See also: evalspace.

Note

Implementers of custom AbstractEnvironment subtypes should implement geteval(state, action, obs, env).
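
Continuing the hypothetical FooEnv sketch: the reward defined in the Reward section is shaped, while the evaluation metric below simply reports progress along one axis (again assuming obs[1] holds that quantity):

# Success metric: distance of the root along the x-axis, independent of reward shaping.
LyceumBase.geteval(state, action, obs, env::FooEnv) = obs[1]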

Simulation

LyceumBase.reset! - Function
reset!(env::AbstractEnvironment)

Reset env to a fixed, initial state with zero/passive controls.

reset!(m::LyceumAI.MPPI) -> LyceumAI.MPPI

Resets the canonical control vector to zeros.

LyceumBase.randreset! - Function
randreset!([rng::Random.AbstractRNG, ], env::AbstractEnvironment)

Reset env to a random state with zero/passive controls.

Note

Implementers of custom AbstractEnvironment subtypes should implement randreset!(rng, env).
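
A sketch for the hypothetical FooEnv, perturbing the canonical initial state with the provided RNG (the qpos field is an assumption made for illustration):

function LyceumBase.randreset!(rng::Random.AbstractRNG, env::FooEnv)
    reset!(env)                                       # start from the canonical initial state
    env.qpos .+= 0.01 .* randn(rng, length(env.qpos)) # hypothetical field holding joint positions
    env
end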

LyceumBase.step! - Function
step!(env::AbstractEnvironment)

Advance env forward by one timestep.

See also: timestep.

step!(sim::MJSim) -> MJSim
step!(sim::MJSim, skip::Integer) -> MJSim

Step the simulation by skip steps, where skip defaults to sim.skip.

State-dependent controls (e.g. the ctrl, xfrc_applied, qfrc_applied fields of sim.d) should be set before calling step!.

LyceumBase.isdone - Function
isdone(state, action, observation, env::AbstractEnvironment) --> Bool

Returns true if state, action, and observation meet an early termination condition for env. Defaults to false.

Note

Implementers of custom AbstractEnvironment subtypes should be careful to ensure that the result of isdone is purely a function of state/action/observation and not any internal, dynamic state contained in env.

isdone(env::AbstractEnvironment)

Returns true if env has met an early termination condition.

Internally calls isdone(getstate(env), getaction(env), getobs(env), env).

Note

Implementers of custom AbstractEnvironment subtypes should implement isdone(state, action, obs, env).
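
For example, a sketch of an early-termination test for the hypothetical FooEnv (obs[1] standing in for torso height):

# Terminate the episode early if the torso height drops below 0.2.
LyceumBase.isdone(state, action, obs, env::FooEnv) = obs[1] < 0.2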

LyceumBase.timestep - Function
timestep(env::AbstractEnvironment)

Return the internal simulation timestep, in seconds, of env.

See also: Base.time.

Examples

env = FooEnv()
reset!(env)
t1 = time(env)
step!(env)
t2 = time(env)
@assert timestep(env) == (t2 - t1)

timestep(sim::MJSim) -> Float64

Return the effective timestep of sim. Equivalent to sim.skip * sim.m.opt.timestep.

Base.Libc.time - Function
Base.time(env::AbstractEnvironment)

Returns the current simulation time, in seconds, of env. By convention, time(env) should return zero after a call to reset!(env) or randreset!(env).

See also: timestep.

time(sim::MJSim) -> Float64

Return the current simulation time, in seconds, of sim. Equivalent to sim.d.time.