Mirror Neuron Documents

Services and Health Checks

Documentation for MirrorNeuron's native Redis-backed service registry and health check layer.

MirrorNeuron has a native Redis-backed service registry and generic health check layer. It is inspired by Nomad's service and check blocks: jobs declare services, checks decide whether service instances are discoverable, and registration follows job or agent lifecycle.

Service discovery remains generic: blueprints and skills declare service endpoints and health checks for Ollama, vLLM, vector databases, and provider-specific services. Local Docker Model Runner LLMs are managed separately with mn model; see Model Runtime.

Design Concept

Service support has three parts:

  • services and nodes[].services declare what a job or agent provides
  • required_services and nodes[].requires_services declare what must exist before the job starts or before a node can be selected
  • health checks mark service instances passing, warning, or critical

Only passing services are returned by discovery by default. Critical services stay in the registry for inspection but are not normal routing targets.
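
The default discovery rule can be illustrated with a minimal Python sketch. The `ServiceInstance` class and `discover` function are hypothetical names used only for illustration; they are not part of the MirrorNeuron SDK.

```python
from dataclasses import dataclass


@dataclass
class ServiceInstance:
    # Hypothetical in-memory stand-in for a registry entry.
    name: str
    instance_id: str
    status: str  # "passing", "warning", or "critical"


def discover(instances, name, include_all=False):
    """Return instances of `name`; only passing ones unless include_all is set."""
    matches = [inst for inst in instances if inst.name == name]
    if include_all:
        return matches
    return [inst for inst in matches if inst.status == "passing"]


registry = [
    ServiceInstance("ollama", "ollama-a", "passing"),
    ServiceInstance("ollama", "ollama-b", "critical"),
    ServiceInstance("vllm", "vllm-a", "warning"),
]

# The critical instance stays in the registry but is not a routing target.
routable = discover(registry, "ollama")
everything = discover(registry, "ollama", include_all=True)
```

Here `routable` holds only `ollama-a`, while `everything` also includes the critical `ollama-b` instance for inspection.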

Manifest Fields

Top-level fields:

{
  "services": [],
  "required_services": []
}

Node-level fields:

{
  "nodes": [
    {
      "node_id": "agent_api",
      "services": [],
      "requires_services": []
    }
  ]
}

Use top-level required_services for external or cluster-wide requirements that must pass before the job starts. Use node-level requires_services when placement should target a node that already has a healthy matching local service.
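
As a rough illustration of the node-level case, the sketch below picks nodes whose local passing services cover every requirement. It assumes each `requires_services` entry is a plain service name, which the manifest examples here do not confirm; `eligible_nodes` and the node names are hypothetical.

```python
def eligible_nodes(requires_services, services_by_node):
    """Return nodes whose local passing services cover every required name.

    Assumption: requirements are plain service names, and each node maps to
    a list of (service_name, status) pairs.
    """
    required = set(requires_services)
    return sorted(
        node
        for node, services in services_by_node.items()
        if required <= {name for name, status in services if status == "passing"}
    )


services_by_node = {
    "node-a": [("vllm", "passing"), ("qdrant", "critical")],
    "node-b": [("vllm", "passing"), ("qdrant", "passing")],
}

both = eligible_nodes(["vllm", "qdrant"], services_by_node)
vllm_only = eligible_nodes(["vllm"], services_by_node)
```

With these inputs only `node-b` qualifies when both services are required, because the `qdrant` instance on `node-a` is critical.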

Service Declaration

{
  "name": "ollama",
  "id": "ollama-local",
  "address": "${config.ollama.host}",
  "port": "${config.ollama.port}",
  "tags": ["llm", "local"],
  "meta": {
    "model_family": "qwen"
  },
  "provider": "mirror_neuron",
  "origin": "external",
  "checks": [
    {
      "name": "http-health",
      "type": "http",
      "url": "${config.ollama.api_base}/api/tags",
      "method": "GET",
      "expected_status": 200,
      "timeout_ms": 2000,
      "interval_ms": 10000,
      "required": true,
      "failures_before_critical": 1
    }
  ]
}

Supported service fields:

Field      Meaning
name       Required service name.
id         Optional stable instance id. Defaults from job, agent, and service name.
address    Host or IP. Supports templates.
port       Explicit port. Supports templates.
tags       Discovery filters.
meta       JSON metadata stored with the service.
provider   mirror_neuron in v1.
origin     internal for runtime registered services, external for dependencies outside the job.
checks     HTTP, TCP, script, or gRPC check declarations.

Templates supported in string fields include:

  • ${config.llm.api_base}
  • ${env.MN_LLM_API_BASE}
  • ${node}
  • ${job_id}
  • ${agent_id}
  • ${service.address}
  • ${service.port}

Blueprint config comes from config/default.json, config/overwrite.json, and runtime overrides when available.
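
To make the template syntax concrete, here is a minimal sketch of how a `${...}` placeholder could be expanded against a nested context. The `resolve` helper and the context layout are assumptions for illustration; the actual resolver may differ in details such as error handling.

```python
import re

# Matches ${dotted.path} placeholders.
_TEMPLATE = re.compile(r"\$\{([^}]+)\}")


def resolve(value, context):
    """Replace each ${a.b.c} in `value` with context["a"]["b"]["c"]."""

    def lookup(match):
        current = context
        for part in match.group(1).split("."):
            current = current[part]  # raises KeyError for unknown paths
        return str(current)

    return _TEMPLATE.sub(lookup, value)


# Hypothetical context; real values would come from the blueprint config,
# the environment, and the runtime.
context = {
    "config": {"ollama": {"host": "127.0.0.1", "port": 11434,
                          "api_base": "http://127.0.0.1:11434"}},
    "env": {"MN_LLM_API_BASE": "http://localhost:8000/v1"},
    "node": "mirror_neuron@192.168.4.20",
}

url = resolve("${config.ollama.api_base}/api/tags", context)
port = resolve("${config.ollama.port}", context)
```

In this sketch an unknown path raises `KeyError` instead of silently producing an empty string, which keeps misconfigured endpoints visible.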

Check Types

HTTP:

{
  "type": "http",
  "path": "/health",
  "address": "127.0.0.1",
  "port": 11434,
  "method": "GET",
  "expected_status": 200,
  "contains": "ok"
}

TCP:

{
  "type": "tcp",
  "address": "127.0.0.1",
  "port": 6379
}
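
Conceptually, a TCP check only needs to confirm that a connection can be opened. The sketch below shows that idea in Python; `tcp_check` is a hypothetical helper rather than the runtime's implementation, and the `timeout_ms` default is borrowed from the service declaration example.

```python
import socket


def tcp_check(address, port, timeout_ms=2000):
    """Return "passing" if a TCP connection opens within the timeout."""
    try:
        with socket.create_connection((address, port), timeout=timeout_ms / 1000):
            return "passing"
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return "critical"
```

For example, `tcp_check("127.0.0.1", 6379)` would report whether something is listening on the local Redis port.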

Script:

{
  "type": "script",
  "command": ["python3", "payloads/check_model_cache.py"],
  "timeout_ms": 5000
}

Script commands run without shell expansion unless the blueprint explicitly invokes a shell as the command. Validation rejects single-string commands that contain shell metacharacters.
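
The difference between an argv list and a shell string can be seen in a short Python sketch. `script_check` is a hypothetical helper, and mapping exit code 0 to passing and everything else to critical is an assumption, not documented behavior.

```python
import subprocess
import sys


def script_check(command, timeout_ms=5000):
    """Run `command` as an argv list, never through a shell."""
    if not isinstance(command, list) or not command:
        raise ValueError("command must be a non-empty argv list")
    try:
        result = subprocess.run(
            command,
            shell=False,  # arguments are passed literally, no expansion
            capture_output=True,
            timeout=timeout_ms / 1000,
        )
    except (subprocess.TimeoutExpired, OSError):
        return "critical"
    # Assumption: exit code 0 means the check passed.
    return "passing" if result.returncode == 0 else "critical"


# "$HOME" reaches the child process unchanged because no shell expands it.
literal = script_check(
    [sys.executable, "-c",
     "import sys; sys.exit(0 if sys.argv[1] == '$HOME' else 1)", "$HOME"]
)
```

Because the argument list goes straight to the process, metacharacters such as `$`, `;`, or `|` have no special meaning.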

gRPC:

{
  "type": "grpc",
  "address": "127.0.0.1",
  "port": 50051,
  "service": "grpc.health.v1.Health"
}

Validation And Preflight

mn blueprint validate <bundle> validates service declarations after manifest/schema checks and before input validation. mn blueprint run --folder <build> runs the same local preflight, and core repeats service preflight before direct runtime starts.

Failed required services stop the job before agents launch. A forced run can skip service preflight, and job metadata records the skipped check.
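
The gating described above can be sketched as a small decision function. All names here (`preflight`, `run_check`, the `service_preflight` metadata key) are hypothetical, and treating anything other than passing as a failure is an assumption.

```python
def preflight(required_services, run_check, force=False):
    """Decide whether a job may start, based on its required services.

    `run_check` is any callable that takes a service declaration and
    returns "passing", "warning", or "critical".
    """
    if force:
        # A forced run skips the checks but records that it did so.
        return {"start": True, "failed": [],
                "metadata": {"service_preflight": "skipped"}}
    failed = [svc["name"] for svc in required_services
              if run_check(svc) != "passing"]
    return {"start": not failed, "failed": failed,
            "metadata": {"service_preflight": "ran"}}


required = [{"name": "ollama"}, {"name": "qdrant"}]
statuses = {"ollama": "passing", "qdrant": "critical"}

result = preflight(required, lambda svc: statuses[svc["name"]])
forced = preflight(required, lambda svc: statuses[svc["name"]], force=True)
```

In this example `result` blocks the start and names `qdrant` as the failing dependency, while `forced` allows the start and marks the preflight as skipped.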

Run service checks directly:

mn service check /path/to/bundle
mn service check /path/to/bundle --output json

Discovery Commands

List passing services:

mn service list

Include warning and critical services:

mn service list --all

Resolve one service:

mn service resolve ollama --tag llm

Filter by node:

mn service resolve vllm --node mirror_neuron@192.168.4.20

Blueprint Web UI Services

Blueprints keep using their existing config.web_ui contract. For live/service blueprints with web_ui.output.adapter: "gradio", launch preparation injects a runtime-managed web_ui_dashboard HostLocal agent and registers its dashboard as a service:

{
  "name": "blueprint-web-ui",
  "tags": ["web_ui", "blueprint", "<blueprint_id>", "gradio"],
  "meta": {
    "run_id": "<run_id>",
    "blueprint_id": "<blueprint_id>",
    "url": "http://localhost:58000",
    "adapter": "gradio"
  }
}

The generated service reserves an explicit HTTP port from MN_BLUEPRINT_WEB_UI_PORT_START/MN_BLUEPRINT_WEB_UI_PORT_END and includes an HTTP readiness check. Discovery returns it as passing only after the Gradio dashboard is reachable. Runtime dashboards read live events through the mn-api run events endpoint when the run store is outside the Core container, and fall back to events.jsonl when no API event source is configured. The dashboard still writes ui.json and web_ui.json under the run store for older OtterDesk and CLI consumers, but the service registry is the authoritative live-service source.

Runtime Behavior

  • job-level services register when the job starts
  • agent-level services register when the agent starts
  • the service monitor refreshes checks periodically
  • failed checks mark instances critical after failures_before_critical
  • discovery hides non-passing instances by default
  • agent-scoped services deregister when an agent stops, is rescheduled, or the job is cancelled
  • deployment metadata is attached to service instances so canary or candidate versions can be hidden until promotion
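
The `failures_before_critical` threshold from the list above can be pictured as a small counter. This sketch assumes that failures are counted consecutively, that the status is left unchanged below the threshold, and that a single success resets the count; `CheckState` is a hypothetical name.

```python
class CheckState:
    """Track one check's status across periodic monitor runs."""

    def __init__(self, failures_before_critical=1):
        self.failures_before_critical = failures_before_critical
        self.consecutive_failures = 0
        self.status = "passing"

    def record(self, ok):
        """Record one check result and return the resulting status."""
        if ok:
            # Assumption: one success clears the failure streak.
            self.consecutive_failures = 0
            self.status = "passing"
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failures_before_critical:
                self.status = "critical"
        return self.status


state = CheckState(failures_before_critical=3)
history = [state.record(ok) for ok in (False, False, False, True)]
```

With a threshold of 3 the instance stays discoverable through two failed runs, turns critical on the third, and returns to passing after the next success. With the threshold of 1 used in the service declaration example, the first failure is enough.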

Important Code

Area                                  Files
Manifest service shape                MirrorNeuron/lib/mirror_neuron/service_spec.ex
One-shot checks                       MirrorNeuron/lib/mirror_neuron/service_check.ex
Preflight                             MirrorNeuron/lib/mirror_neuron/service_preflight.ex
Registry                              MirrorNeuron/lib/mirror_neuron/service_registry.ex
Periodic monitor                      MirrorNeuron/lib/mirror_neuron/service_monitor.ex
Job registration and deregistration   MirrorNeuron/lib/mirror_neuron/runtime/job_coordinator.ex
Redis storage                         MirrorNeuron/lib/mirror_neuron/persistence/redis_store.ex
Scheduler node-scoped requirements    MirrorNeuron/lib/mirror_neuron/scheduler.ex
CLI commands                          mn-cli/mn_cli/libs/service_cmds.py
Blueprint validation                  mn-python-sdk/mn_sdk/blueprint_validation.py
SDK client                            mn-python-sdk/mn_sdk/client.py
