Models
Model workers wrap a Python prediction function and run it against an RLMesh environment endpoint.
The framework backend controls how observations are decoded before predict_fn runs and how
returned actions are encoded.
Base Model
- class rlmesh._models.base.ModelBase
Bases: Generic[ObsT, ActT]
A model: a predict callable driven against an env.
run(env, seeds=...) drives the model against an env and returns a typed RunResult. It resolves the adapter from the env’s published tags and this model’s spec, so predict works in the model’s own input/output format with no per-env glue. serve() hosts the model as an endpoint for the runtime to dial.
- Parameters:
source – A predict callable.
spec – Optional rlmesh.adapters.ModelSpec; makes this an adapted model.
on_reset / on_episode_end / on_close – Optional lifecycle callbacks.
artifacts / load_kwargs – Accepted for forward compatibility; unused for a callable source.
trust_entrypoints – Allow module:callable custom-input entrypoints in a spec to be imported during adapter resolution.
Examples
>>> from rlmesh.numpy import Model
>>> result = Model(lambda observation: 0).run("127.0.0.1:5555", seeds=[0])
>>> result.mean_reward
0.0
- __init__(source, *, spec=None, on_reset=None, on_episode_end=None, on_close=None, artifacts=(), load_kwargs=None, trust_entrypoints=False)
- Parameters:
source (Callable[..., object] | object)
spec (object | None)
on_reset (LifecycleCallback | None)
on_episode_end (LifecycleCallback | None)
on_close (LifecycleCallback | None)
artifacts (Sequence[ArtifactInput])
load_kwargs (Mapping[str, object] | None)
trust_entrypoints (bool)
- Return type:
None
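The trust_entrypoints flag gates the import of module:callable strings named in a spec. As a rough illustration of what resolving such a string involves, the sketch below uses only the standard library. load_entrypoint is a hypothetical helper written for this example, not part of rlmesh, and the library's actual resolution logic may differ.

```python
import importlib
from typing import Callable


def load_entrypoint(entrypoint: str, *, trust_entrypoints: bool = False) -> Callable[..., object]:
    """Hypothetical helper: import the callable named by a 'module:callable' string."""
    if not trust_entrypoints:
        # Importing a module executes its top-level code, so refuse by default.
        raise PermissionError(f"refusing to import untrusted entrypoint {entrypoint!r}")

    module_name, sep, attr_path = entrypoint.partition(":")
    if not (sep and module_name and attr_path):
        raise ValueError(f"expected 'module:callable', got {entrypoint!r}")

    target: object = importlib.import_module(module_name)
    # Walk dotted attributes so 'pkg.mod:Class.method' also resolves.
    for part in attr_path.split("."):
        target = getattr(target, part)

    if not callable(target):
        raise TypeError(f"{entrypoint!r} does not name a callable")
    return target


# Standard-library target, used only so the sketch runs anywhere.
sqrt = load_entrypoint("math:sqrt", trust_entrypoints=True)
print(sqrt(9.0))  # 3.0
```

Defaulting the flag to False mirrors the signature above: a spec obtained from elsewhere cannot trigger an import unless the caller opts in.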
- run(env_or_address, *, seeds=None, max_episodes=None, instruction=None, close_env=False, token='')
Drive this model against an env and return a RunResult.
Resolves the adapter from the env’s tags and this model’s spec, then runs a per-episode loop. seeds gives a per-episode seed and sets the episode count unless max_episodes is given; instruction is written into the model’s text inputs each episode. env_or_address is an env object exposing reset/step (e.g. a RemoteEnv), an object with an address, or a bare address string the loop dials.
- Parameters:
env_or_address (object)
seeds (Sequence[int] | None)
max_episodes (int | None)
instruction (str | None)
close_env (bool)
token (str)
- Return type:
RunResult
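To make the interplay of seeds and max_episodes concrete, here is a minimal, self-contained sketch of a per-episode loop. It does not use rlmesh. CountdownEnv, plan_episode_seeds, and run_sketch are hypothetical names, the reset/step signatures are simplified, and two behaviours are assumptions rather than documented facts: a single unseeded episode runs when neither argument is given, and episodes beyond the supplied seeds run unseeded.

```python
from typing import Callable, List, Optional, Sequence, Tuple


class CountdownEnv:
    """Hypothetical toy env: an episode lasts (seed % 3) + 1 steps, each worth reward 1.0."""

    def reset(self, seed: Optional[int] = None) -> int:
        self._steps_left = (seed or 0) % 3 + 1
        return self._steps_left  # observation = steps remaining

    def step(self, action: object) -> Tuple[int, float, bool]:
        self._steps_left -= 1
        return self._steps_left, 1.0, self._steps_left == 0


def plan_episode_seeds(
    seeds: Optional[Sequence[int]], max_episodes: Optional[int]
) -> List[Optional[int]]:
    """Return one seed (or None) per episode to run."""
    seed_list = list(seeds or [])
    if max_episodes is None:
        # seeds sets the episode count. ASSUMPTION: one unseeded episode if no seeds.
        count = len(seed_list) if seed_list else 1
    else:
        # max_episodes, when given, decides the count instead.
        count = max_episodes
    # ASSUMPTION: episodes past the end of seeds run unseeded.
    return [seed_list[i] if i < len(seed_list) else None for i in range(count)]


def run_sketch(
    predict: Callable[[object], object],
    env: CountdownEnv,
    *,
    seeds: Optional[Sequence[int]] = None,
    max_episodes: Optional[int] = None,
) -> List[float]:
    """Drive predict against env and return the total reward of each episode."""
    episode_returns = []
    for seed in plan_episode_seeds(seeds, max_episodes):
        observation = env.reset(seed=seed)
        total, done = 0.0, False
        while not done:
            observation, reward, done = env.step(predict(observation))
            total += reward
        episode_returns.append(total)
    return episode_returns


episode_returns = run_sketch(lambda observation: 0, CountdownEnv(), seeds=[0, 1, 2])
mean_reward = sum(episode_returns) / len(episode_returns)
print(episode_returns, mean_reward)  # [1.0, 2.0, 3.0] 2.0
```

The real loop also resolves an adapter and handles instruction, close_env, and token, all of which this sketch leaves out.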
- serve(address, *, token='', options=None)
Host this model as an endpoint (blocking).
A spec’d model resolves its adapter per route from the env contract the configure_route handshake delivers, then applies it around predict; a spec-less / DELEGATED model serves its own predict directly.
- Parameters:
address (str)
token (str)
options (ServeOptions | None)
- Return type:
None
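The phrase "applies it around predict" can be pictured as a wrapper that converts the env's observation into the model's input, calls predict, and converts the output back into an env action. The sketch below is only an illustration of that shape. ToyAdapter and bind_route are hypothetical names, and the real adapter interface is not described in this section.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ToyAdapter:
    """Hypothetical adapter: bridges one env's formats and the model's formats."""

    to_model_input: Callable[[object], object]
    to_env_action: Callable[[object], object]


def bind_route(
    predict: Callable[[object], object], adapter: Optional[ToyAdapter] = None
) -> Callable[[object], object]:
    """Return the callable that would serve one route."""
    if adapter is None:
        # Spec-less case: the model's own predict is served directly.
        return predict

    def adapted_predict(env_observation: object) -> object:
        model_input = adapter.to_model_input(env_observation)
        return adapter.to_env_action(predict(model_input))

    return adapted_predict


def predict(x: float) -> float:
    return 2.0 * x


# One predict, two routes: each route gets the adapter that fits its env.
dict_env_adapter = ToyAdapter(
    to_model_input=lambda obs: obs["state"],
    to_env_action=lambda out: {"action": out},
)
routes = {
    "dict-env": bind_route(predict, dict_env_adapter),
    "plain-env": bind_route(predict),
}
print(routes["dict-env"]({"state": 3.0}))  # {'action': 6.0}
print(routes["plain-env"](3.0))  # 6.0
```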
Concrete Models
Concrete backend model classes inherit ModelBase and only change value conversion:
| Class | Import | Observation type | Action encoding |
|---|---|---|---|
| Native model | | RLMesh-native values and primitives | RLMesh-native values |
| NumPy model | | NumPy arrays, primitives, and containers | NumPy arrays and primitives |
| Torch model | | Torch tensors, primitives, and containers | Torch tensors and primitives |
| JAX model | | JAX arrays, primitives, and containers | JAX arrays and primitives |
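As a rough picture of what "only change value conversion" could mean, the sketch below decodes array leaves inside nested containers, calls the prediction function, and encodes the returned arrays. It is not the rlmesh implementation. The standard-library array.array stands in for a backend array type (NumPy, Torch, or JAX), and WireArray, map_leaves, and with_backend are hypothetical names invented for this example.

```python
from array import array
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class WireArray:
    """Hypothetical stand-in for a framework-neutral array value."""

    typecode: str
    payload: bytes


def map_leaves(value: object, convert_leaf: Callable[[object], object]) -> object:
    """Rebuild dicts, lists, and tuples, converting every non-container leaf."""
    if isinstance(value, dict):
        return {key: map_leaves(item, convert_leaf) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return type(value)(map_leaves(item, convert_leaf) for item in value)
    return convert_leaf(value)


def decode_leaf(leaf: object) -> object:
    # Arrays become backend arrays; primitives pass through unchanged.
    return array(leaf.typecode, leaf.payload) if isinstance(leaf, WireArray) else leaf


def encode_leaf(leaf: object) -> object:
    return WireArray(leaf.typecode, leaf.tobytes()) if isinstance(leaf, array) else leaf


def with_backend(predict_fn: Callable[[object], object]) -> Callable[[object], object]:
    """Wrap predict_fn so it sees decoded observations and returns encoded actions."""

    def wrapped(wire_observation: object) -> object:
        observation = map_leaves(wire_observation, decode_leaf)
        return map_leaves(predict_fn(observation), encode_leaf)

    return wrapped


def predict_fn(observation: dict) -> array:
    # The model works purely with backend arrays and primitives.
    return array("d", [sum(observation["position"]) * observation["gain"]])


wire_observation = {
    "position": WireArray("d", array("d", [1.0, 2.0]).tobytes()),
    "gain": 2,
}
wire_action = with_backend(predict_fn)(wire_observation)
print(array(wire_action.typecode, wire_action.payload).tolist())  # [6.0]
```

Swapping decode_leaf and encode_leaf for another pair is the only change needed to target a different array type, which is the sense in which a backend differs from its siblings here.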