Multiple Endpoints

Run more than one environment endpoint, then connect one evaluator to all of them.

The runnable evaluator is examples/python/quickstart/eval_many.py.

Start Two Servers

Each environment is served by rlmesh.EnvServer. The Gymnasium server wraps a registered environment:

import gymnasium as gym
from rlmesh import EnvServer

env = gym.make(args.env_id)
server = EnvServer(env, args.address)
server.serve()

The custom server wraps a plain Python object that exposes the same environment interface:

import rlmesh

server = rlmesh.EnvServer(CounterEnv(), args.address)
server.serve()
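The real CounterEnv lives in serve.py and is not reproduced here. As a rough sketch of what "a plain Python object" could look like, the class below assumes Gymnasium-style reset and step signatures. The class body, the target parameter, and the reward rule are all hypothetical. This section does not say what else EnvServer requires (for example action and observation space declarations, which the evaluator's env.action_space.sample() call relies on), so treat this as an illustration of the shape only.

```python
from typing import Any, Optional


class CounterEnv:
    """Hypothetical stand-in: adds each action to a counter until a target is hit."""

    def __init__(self, target: int = 5) -> None:
        self.target = target
        self.count = 0

    def reset(self, *, seed: Optional[int] = None, options: Optional[dict] = None):
        # Gymnasium-style reset: return (observation, info).
        self.count = 0
        return self.count, {}

    def step(self, action: Any):
        # Gymnasium-style step: (obs, reward, terminated, truncated, info).
        self.count += int(action)
        terminated = self.count >= self.target
        reward = 1.0 if terminated else 0.0
        return self.count, reward, terminated, False, {}

    def close(self) -> None:
        # Nothing to release in this toy environment.
        pass
```

Because the object has no Gymnasium base class, it can be exercised directly in a REPL before it is handed to a server.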

Terminal one owns Gymnasium's CartPole-v1, and terminal two owns the small custom CounterEnv:

uv run python examples/python/quickstart/serve_gymnasium.py --address 127.0.0.1:5555
uv run python examples/python/quickstart/serve.py --address 127.0.0.1:5556

Evaluate Both

eval_many.py opens a RemoteEnv per address and runs the same sampled-action loop against each endpoint:

def evaluate(address: str, max_steps: int) -> str:
    from rlmesh.numpy import RemoteEnv

    env = RemoteEnv(address)
    try:
        lines = [f"{address}: connected"]
        obs, info = env.reset(seed=0)
        for step in range(1, max_steps + 1):
            action = env.action_space.sample()
            obs, reward, term, trunc, info = env.step(action)
            lines.append(f"{address}: step={step} reward={reward:.3f}")
            if term or trunc:
                lines.append(f"{address}: episode complete")
                break
        else:
            lines.append(f"{address}: stopped after {max_steps} steps")
        return "\n".join(lines)
    finally:
        env.close()
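The loop above relies on Python's for/else: the else branch runs only when the loop finishes without hitting break. A minimal sketch of both outcomes, using a hypothetical in-process stub instead of a RemoteEnv (StubEnv and run_loop are illustrative names, not part of rlmesh):

```python
class StubEnv:
    """Hypothetical local stand-in that terminates after `episode_len` steps."""

    def __init__(self, episode_len: int) -> None:
        self.episode_len = episode_len
        self.steps = 0

    def reset(self, seed=None):
        self.steps = 0
        return 0, {}

    def step(self, action):
        self.steps += 1
        terminated = self.steps >= self.episode_len
        return self.steps, 1.0, terminated, False, {}


def run_loop(env, max_steps: int) -> str:
    env.reset(seed=0)
    for step in range(1, max_steps + 1):
        _obs, _reward, term, trunc, _info = env.step(0)
        if term or trunc:
            outcome = f"episode complete at step {step}"
            break
    else:
        # Reached only when the loop was never broken out of.
        outcome = f"stopped after {max_steps} steps"
    return outcome


print(run_loop(StubEnv(episode_len=3), max_steps=10))
print(run_loop(StubEnv(episode_len=10), max_steps=3))
```

The first call ends because the episode terminates; the second ends because the step budget runs out first.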

The addresses come from the command line, and each one is evaluated on its own thread. Reports print in the order the addresses were given:

from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=len(args.addresses)) as executor:
    futures = [
        executor.submit(evaluate, address, args.max_steps) for address in args.addresses
    ]
    for future in futures:
        print(future.result())
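Two properties of this pattern are worth seeing in isolation: iterating the futures list yields reports in submission order regardless of which worker finishes first, and future.result() re-raises any exception from the worker. The sketch below uses a hypothetical fake_evaluate that sleeps instead of stepping an environment; the per-address try/except is an addition for illustration and is not claimed to be what eval_many.py does.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_evaluate(address: str, delay: float) -> str:
    # Hypothetical stand-in for evaluate(): sleeps instead of stepping an env.
    time.sleep(delay)
    if address.endswith(":0"):
        raise ConnectionError("unreachable")
    return f"{address}: done"


def evaluate_all(jobs):
    """Run each (address, delay) job on its own thread; report in input order."""
    reports = []
    with ThreadPoolExecutor(max_workers=len(jobs)) as executor:
        futures = [
            (address, executor.submit(fake_evaluate, address, delay))
            for address, delay in jobs
        ]
        for address, future in futures:
            try:
                reports.append(future.result())
            except Exception as exc:  # result() re-raises the worker's exception
                reports.append(f"{address}: failed ({exc})")
    return reports


# The slowest job is submitted first and is still reported first.
reports = evaluate_all(
    [("127.0.0.1:5555", 0.2), ("127.0.0.1:5556", 0.0), ("127.0.0.1:0", 0.0)]
)
print("\n".join(reports))
```

ThreadPoolExecutor raises ValueError when max_workers is 0, so this sketch assumes at least one job is supplied.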

Run it in terminal three:

uv run python examples/python/quickstart/eval_many.py \
  127.0.0.1:5555 \
  127.0.0.1:5556

That is one evaluator running across multiple environment runtimes, locally. The client shape is the same for every endpoint, whether the server wraps a Gymnasium environment or a custom object.