Models

Model workers wrap a Python prediction function and run it against an RLMesh environment endpoint. The framework backend controls how observations are decoded before predict_fn runs and how returned actions are encoded.

Base Model

class rlmesh._models.base.ModelBase

Bases: Generic[ObsT, ActT]

A model: a predict callable driven against an env.

run(env, seeds=...) drives the model against an env and returns a typed RunResult. It resolves the adapter from the env’s published tags and this model’s spec, so predict works in the model’s own input/output format with no per-env glue. serve() hosts the model as an endpoint for the runtime to dial.

Parameters:

source – A predict callable.
spec – Optional rlmesh.adapters.ModelSpec; makes this an adapted model.
on_reset / on_episode_end / on_close – Optional lifecycle callbacks.
artifacts / load_kwargs – Accepted for forward compatibility; unused for a callable source.
trust_entrypoints – Allow module:callable custom-input entrypoints in a spec to be imported during adapter resolution.

Examples

>>> from rlmesh.numpy import Model
>>> result = Model(lambda observation: 0).run("127.0.0.1:5555", seeds=[0])
>>> result.mean_reward
0.0
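
The lifecycle callbacks listed under Parameters are passed as keyword arguments. The sketch below counts resets and finished episodes. The arguments RLMesh passes to a LifecycleCallback are not documented on this page, so the callbacks accept *args and **kwargs as an assumption; EpisodeCounter and build_model are hypothetical names used only for illustration.

```python
class EpisodeCounter:
    """Hypothetical helper that tallies lifecycle events."""

    def __init__(self):
        self.resets = 0
        self.episodes_ended = 0

    def on_reset(self, *args, **kwargs):
        # Assumption: the callback signature is unknown, so accept anything.
        self.resets += 1

    def on_episode_end(self, *args, **kwargs):
        self.episodes_ended += 1


def build_model(counter):
    """Wrap a constant policy and attach the counter's callbacks."""
    # Imported lazily so the counter can be exercised without RLMesh installed.
    from rlmesh.numpy import Model

    return Model(
        lambda observation: 0,
        on_reset=counter.on_reset,
        on_episode_end=counter.on_episode_end,
    )
```

After a run, counter.resets and counter.episodes_ended can be compared against the number of seeds as a quick sanity check.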

__init__(source, *, spec=None, on_reset=None, on_episode_end=None, on_close=None, artifacts=(), load_kwargs=None, trust_entrypoints=False)

Parameters:

source (Callable[..., object] | object)
spec (object | None)
on_reset (LifecycleCallback | None)
on_episode_end (LifecycleCallback | None)
on_close (LifecycleCallback | None)
artifacts (Sequence[ArtifactInput])
load_kwargs (Mapping[str, object] | None)
trust_entrypoints (bool)

Return type:

None

property spec: object | None

The model’s content: a ModelSpec, DELEGATED, or None.

run(env_or_address, *, seeds=None, max_episodes=None, instruction=None, close_env=False, token='')

Drive this model against an env and return a RunResult.

Resolves the adapter from the env’s tags and this model’s spec, then runs a per-episode loop. seeds gives a per-episode seed and sets the episode count unless max_episodes is given; instruction is written into the model’s text inputs each episode. env_or_address is an env object exposing reset/step (e.g. a RemoteEnv), an object with an address, or a bare address string the loop dials.

Parameters:

env_or_address (object)
seeds (Sequence[int] | None)
max_episodes (int | None)
instruction (str | None)
close_env (bool)
token (str)

Return type:

RunResult
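
A minimal sketch of calling run() with one seed per episode and an optional instruction. The evaluate helper is a hypothetical name; it relies only on the keyword arguments in the signature above and on the mean_reward attribute shown in the class example, and it takes any object with a compatible run() method.

```python
def evaluate(model, env_or_address, *, seeds=(0, 1, 2), instruction=None, token=""):
    """Run one episode per seed and return the mean reward.

    `model` is any object with a ModelBase-compatible run() method.
    """
    result = model.run(
        env_or_address,
        seeds=list(seeds),        # one seed per episode; also sets the episode count
        instruction=instruction,  # written into the model's text inputs, if any
        token=token,
    )
    return result.mean_reward
```

With a real model this could be called as evaluate(Model(predict), "127.0.0.1:5555"), reusing the address from the class example.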

serve(address, *, token='', options=None)

Host this model as an endpoint (blocking).

A spec’d model resolves its adapter per route from the env contract the configure_route handshake delivers, then applies it around predict; a spec-less / DELEGATED model serves its own predict directly.

Parameters:

address (str)
token (str)
options (ServeOptions | None)

Return type:

None
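
Because serve() blocks, a caller that needs to keep working may want to run it on a separate thread. The sketch below does this with the standard library. serve_in_background is a hypothetical helper, and whether serve() may be called off the main thread is an assumption this page does not confirm.

```python
import threading


def serve_in_background(model, address, *, token=""):
    """Start model.serve(address, token=token) on a daemon thread.

    Returns the started thread; it ends when serve() returns or the
    process exits.
    """
    thread = threading.Thread(
        target=model.serve,
        args=(address,),
        kwargs={"token": token},
        daemon=True,  # do not keep the interpreter alive just for the server
    )
    thread.start()
    return thread
```

A daemon thread is used so that an unattended server does not block interpreter shutdown; a long-lived deployment would more likely call serve() directly from the main thread.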

run_local(env_address, *, token='')

Native worker loop against a remote env, until interrupted (no metrics).

Parameters:

env_address (str)
token (str)

Return type:

None

run_local_for_episodes(env_address, *, token='', max_episodes)

Native worker loop against a remote env for a fixed episode count (no metrics).

Parameters:

env_address (str)
token (str)
max_episodes (int)

Return type:

None
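
The two native loops differ only in when they stop, so a thin wrapper can pick between them. This is a sketch under that reading of the signatures above; run_worker is a hypothetical name, and neither loop returns metrics.

```python
def run_worker(model, env_address, *, token="", max_episodes=None):
    """Run the native worker loop, bounded only if max_episodes is given."""
    if max_episodes is None:
        # Unbounded: runs until the process is interrupted.
        model.run_local(env_address, token=token)
    else:
        # Bounded: stops after a fixed number of episodes.
        model.run_local_for_episodes(
            env_address, token=token, max_episodes=max_episodes
        )
```

When a RunResult is needed, run() is the method to call instead, since both native loops return None.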

Concrete Models

Concrete backend model classes inherit from ModelBase and differ only in how values are converted:

| Class | Import | Observation type | Action encoding |
| --- | --- | --- | --- |
| Native model | `rlmesh.Model` | RLMesh-native values and primitives | RLMesh-native values |
| NumPy model | `rlmesh.numpy.Model` | NumPy arrays, primitives, and containers | NumPy arrays and primitives |
| Torch model | `rlmesh.torch.Model` | Torch tensors, primitives, and containers | Torch tensors and primitives |
| JAX model | `rlmesh.jax.Model` | JAX arrays, primitives, and containers | JAX arrays and primitives |

See NumPy, Torch, and JAX for backend helpers.
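
Since the concrete classes share the ModelBase interface, the backend can be chosen at runtime from the import paths in the table. The sketch below does so with importlib. The short backend keys, BACKEND_MODULES, and load_model_class are hypothetical names; only the module paths come from the table.

```python
import importlib

# Module paths taken from the table above; the short keys are illustrative.
BACKEND_MODULES = {
    "native": "rlmesh",
    "numpy": "rlmesh.numpy",
    "torch": "rlmesh.torch",
    "jax": "rlmesh.jax",
}


def load_model_class(backend):
    """Import and return the Model class for the named backend."""
    try:
        module_name = BACKEND_MODULES[backend]
    except KeyError:
        raise ValueError(
            f"unknown backend {backend!r}; expected one of {sorted(BACKEND_MODULES)}"
        ) from None
    # The import happens here, so only the chosen framework must be installed.
    return importlib.import_module(module_name).Model
```

Deferring the import keeps a Torch-only environment from needing JAX installed, and the reverse.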