Models

Model workers wrap a Python prediction function and run it against an RLMesh environment endpoint. The framework backend controls how observations are decoded before predict_fn runs and how returned actions are encoded.

Base Model

class rlmesh._models.base.ModelBase[source]

Bases: Generic[ObsT, ActT]

A model: a predict callable driven against an env.

run(env, seeds=...) drives the model against an env and returns a typed RunResult. It resolves the adapter from the env’s published tags and this model’s spec, so predict works in the model’s own input/output format with no per-env glue code. serve() hosts the model as an endpoint for the runtime to dial.

Parameters:
  • source – A predict callable.

  • spec – Optional rlmesh.adapters.ModelSpec; makes this an adapted model.

  • on_reset / on_episode_end / on_close – Optional lifecycle callbacks.

  • artifacts / load_kwargs – Accepted for forward compatibility; unused for a callable source.

  • trust_entrypoints – Allow module:callable custom-input entrypoints in a spec to be imported during adapter resolution.

Examples

>>> from rlmesh.numpy import Model
>>> result = Model(lambda observation: 0).run("127.0.0.1:5555", seeds=[0])
>>> result.mean_reward
0.0

__init__(source, *, spec=None, on_reset=None, on_episode_end=None, on_close=None, artifacts=(), load_kwargs=None, trust_entrypoints=False)[source]
Parameters:
  • source (Callable[..., object] | object)

  • spec (object | None)

  • on_reset (LifecycleCallback | None)

  • on_episode_end (LifecycleCallback | None)

  • on_close (LifecycleCallback | None)

  • artifacts (Sequence[ArtifactInput])

  • load_kwargs (Mapping[str, object] | None)

  • trust_entrypoints (bool)

Return type:

None

property spec: object | None[source]

The model’s content: a ModelSpec, DELEGATED, or None.

run(env_or_address, *, seeds=None, max_episodes=None, instruction=None, close_env=False, token='')[source]

Drive this model against an env and return a RunResult.

Resolves the adapter from the env’s tags and this model’s spec, then runs a per-episode loop. seeds supplies one seed per episode and sets the episode count unless max_episodes is given; instruction is written into the model’s text inputs each episode. env_or_address may be an env object exposing reset/step (e.g. a RemoteEnv), an object with an address, or a bare address string that the loop dials.

Parameters:
  • env_or_address (object)

  • seeds (Sequence[int] | None)

  • max_episodes (int | None)

  • instruction (str | None)

  • close_env (bool)

  • token (str)

Return type:

RunResult
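
The call described above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the helper names greedy_predict and evaluate are hypothetical, the address is the one used in the class example, and it assumes an env endpoint is already listening there. The rlmesh import sits inside the function so the predict callable can be exercised on its own.

```python
def greedy_predict(observation):
    """Toy policy: return action 0 regardless of the observation."""
    return 0


def evaluate(address="127.0.0.1:5555", seeds=(0, 1, 2)):
    """Run one episode per seed and return the RunResult.

    Hypothetical helper; assumes an env endpoint is reachable at `address`.
    """
    # Imported here so that defining the policy does not require rlmesh.
    from rlmesh.numpy import Model

    model = Model(greedy_predict)
    # max_episodes is left at its default (None), so per the description
    # above the seeds determine the episode count.
    return model.run(address, seeds=list(seeds))


# Example use, once an env endpoint is up:
#     result = evaluate()
#     print(result.mean_reward)
```

Keeping the predict callable as a plain module-level function makes it easy to unit-test without an env.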

serve(address, *, token='', options=None)[source]

Host this model as an endpoint (blocking).

A spec’d model resolves its adapter per route from the env contract the configure_route handshake delivers, then applies it around predict; a spec-less / DELEGATED model serves its own predict directly.

Parameters:
  • address

  • token

  • options

Return type:

None
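
A minimal sketch of hosting a spec-less model, which per the description above serves its own predict directly. The helper name host and the bind address "0.0.0.0:6000" are assumptions made for illustration; the accepted address format is not documented in this section.

```python
def constant_predict(observation):
    """Toy policy: return action 0 for every observation."""
    return 0


def host(address="0.0.0.0:6000", token=""):
    """Serve constant_predict as an endpoint.

    Hypothetical helper; the "host:port" address string is an assumption.
    """
    # Imported here so that defining the policy does not require rlmesh.
    from rlmesh.numpy import Model

    # serve() is documented as blocking, so this call does not return
    # while the endpoint is being hosted.
    Model(constant_predict).serve(address, token=token)
```

Because serve() blocks, a script built on this sketch would typically make host() its last call.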

run_local(env_address, *, token='')[source]

Native worker loop against a remote env, until interrupted (no metrics).

Parameters:
  • env_address (str)

  • token (str)

Return type:

None

run_local_for_episodes(env_address, *, token='', max_episodes)[source]

Native worker loop against a remote env for a fixed episode count (no metrics).

Parameters:
  • env_address (str)

  • token (str)

  • max_episodes (int)

Return type:

None
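
One way to choose between the two native worker loops is a small command-line wrapper, sketched below. The parser, its flag names, and the always_zero policy are hypothetical; only run_local and run_local_for_episodes with the arguments shown come from the signatures above.

```python
import argparse


def build_parser():
    """Build a hypothetical CLI for launching a model worker."""
    parser = argparse.ArgumentParser(description="Run a model worker against a remote env.")
    parser.add_argument("env_address", help="address of the remote env")
    parser.add_argument("--token", default="", help="token passed to the worker loop")
    parser.add_argument("--episodes", type=int, default=None,
                        help="stop after this many episodes; omit to run until interrupted")
    return parser


def always_zero(observation):
    """Toy policy: return action 0 for every observation."""
    return 0


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Imported here so that argument parsing does not require rlmesh.
    from rlmesh.numpy import Model

    model = Model(always_zero)
    if args.episodes is None:
        # No episode limit given: loop until interrupted.
        model.run_local(args.env_address, token=args.token)
    else:
        # Fixed episode count.
        model.run_local_for_episodes(
            args.env_address, token=args.token, max_episodes=args.episodes
        )
```

Neither loop returns metrics, so a wrapper like this suits long-running workers; run() is the better fit when a RunResult is needed.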

Concrete Models

Concrete backend model classes inherit from ModelBase and change only value conversion:

| Class | Import | Observation type | Action encoding |
| --- | --- | --- | --- |
| Native model | rlmesh.Model | RLMesh-native values and primitives | RLMesh-native values |
| NumPy model | rlmesh.numpy.Model | NumPy arrays, primitives, and containers | NumPy arrays and primitives |
| Torch model | rlmesh.torch.Model | Torch tensors, primitives, and containers | Torch tensors and primitives |
| JAX model | rlmesh.jax.Model | JAX arrays, primitives, and containers | JAX arrays and primitives |

See NumPy, Torch, and JAX for backend helpers.
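
As an illustration of what a backend's value conversion means for the predict callable, here is a sketch of a policy written for the NumPy model. It assumes the observation arrives as a two-element array and that a plain int is an acceptable action; the function name and the weights are hypothetical.

```python
import numpy as np

# Hypothetical fixed weights: one row of scores per action, one column per
# observation feature.
WEIGHTS = np.array([[1.0, -1.0],
                    [0.5, 0.5]], dtype=np.float32)


def linear_policy(observation):
    """Score each action linearly and return the index of the best one."""
    obs = np.asarray(observation, dtype=np.float32)
    scores = WEIGHTS @ obs
    # Return a Python int, i.e. a primitive rather than a NumPy scalar.
    return int(np.argmax(scores))


# With rlmesh installed, this callable could be wrapped as
#     from rlmesh.numpy import Model
#     model = Model(linear_policy)
```

The same policy written for the Torch or JAX model would take and return that backend's tensor or array types instead.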