Mirror Neuron Documents

Resources And Devices

MirrorNeuron supports a stronger resource model for scheduling AI workers across mixed machines. The model is inspired by Nomad's resource, device, network, and volume concepts, but v1 covers scheduling, allocation metadata, and runtime environment hints only.

It does not enforce cgroups, mount host paths, or isolate device access at the OS level.

Design Concept

Each agent can request:

  • scalar capacity: CPU, memory, disk, generic GPU count
  • rich devices: CUDA, Metal, ROCm, vendor, memory, capabilities, device IDs
  • explicit ports
  • host volumes
  • runtime drivers such as host_local or openshell

The scheduler compares those requests against node inventory and active placements. A successful placement records concrete allocation metadata on the job.
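As a rough illustration of that comparison for scalar capacity, the sketch below subtracts what active placements already hold from a node's totals and checks whether a new request still fits. The function names and the dictionary layout of the inventory are assumptions made for this example; the real matching logic lives in the scheduler and may differ.

```python
# Scalar keys as they appear in the manifest examples in this document.
SCALARS = ("cpu_cores", "memory_mb", "disk_mb")


def free_capacity(node_totals, active_placements):
    """Subtract capacity held by active placements from the node totals."""
    free = {key: node_totals.get(key, 0) for key in SCALARS}
    for placement in active_placements:
        for key in SCALARS:
            free[key] -= placement.get(key, 0)
    return free


def fits(request, node_totals, active_placements):
    """Return True if every requested scalar fits in the remaining capacity."""
    free = free_capacity(node_totals, active_placements)
    return all(request.get(key, 0) <= free[key] for key in SCALARS)
```

A request that omits a scalar is treated here as asking for zero of it, which is an assumption of the sketch rather than documented behavior.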

Resource Request Shape

{
  "nodes": [
    {
      "node_id": "gpu_worker",
      "agent_type": "executor",
      "resources": {
        "cpu_cores": 2,
        "memory_mb": 8192,
        "disk_mb": 20480,
        "devices": [
          {
            "kind": "gpu",
            "driver": "cuda",
            "vendor": "nvidia",
            "min_memory_mb": 16000,
            "capabilities": ["fp16"],
            "count": 1
          }
        ],
        "ports": [
          {
            "label": "api",
            "port": 8088,
            "protocol": "http"
          }
        ],
        "volumes": [
          {
            "name": "models",
            "source": "/mnt/models",
            "target": "/models",
            "mode": "ro",
            "type": "host"
          }
        ],
        "runtime_driver": "host_local"
      }
    }
  ]
}

Existing manifests remain valid. Legacy gpu_count is treated as a generic GPU device request:

{
  "resources": {
    "gpu_count": 1
  }
}
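One way to picture the legacy handling is a small normalization step that turns gpu_count into a generic device request. This is a hypothetical sketch, not the code in resource_spec.ex. In particular, letting an explicit devices list take precedence over gpu_count is an assumption of this example.

```python
def normalize_devices(resources):
    """Return the device requests for a resources block.

    Assumption: if "devices" is present it wins; otherwise a legacy
    "gpu_count" becomes one generic GPU request with that count.
    """
    devices = [dict(device) for device in resources.get("devices", [])]
    gpu_count = resources.get("gpu_count", 0)
    if gpu_count and not devices:
        devices.append({"kind": "gpu", "count": gpu_count})
    return devices
```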

Device Requests

Field            Meaning
kind or type     Usually gpu; either may be used.
count            Number of matching devices. Defaults to 1.
vendor           Vendor filter such as nvidia, apple, or amd.
driver           Driver/runtime capability such as cuda, metal, or rocm.
min_memory_mb    Minimum memory on one device.
capabilities     Required capability labels.
ids              Optional exact device IDs.

Examples:

CUDA only:

{
  "resources": {
    "devices": [
      {
        "kind": "gpu",
        "driver": "cuda",
        "count": 1
      }
    ]
  }
}

Apple Metal:

{
  "resources": {
    "devices": [
      {
        "kind": "gpu",
        "driver": "metal",
        "vendor": "apple",
        "count": 1
      }
    ]
  }
}

Large memory GPU:

{
  "resources": {
    "devices": [
      {
        "kind": "gpu",
        "min_memory_mb": 24000,
        "count": 1
      }
    ]
  }
}
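The request fields above can be read as filters over a node's device inventory. The following sketch shows one plausible way to apply them. The inventory field names (id, kind, vendor, driver, memory_mb, capabilities) and both function names are assumptions for illustration; the actual inventory shape is defined by the scheduler and resource modules.

```python
def device_matches(request, device):
    """Return True if one inventory device satisfies one device request."""
    kind = request.get("kind", request.get("type"))  # either key may be used
    if kind and device.get("kind") != kind:
        return False
    for field in ("vendor", "driver"):
        if field in request and device.get(field) != request[field]:
            return False
    if device.get("memory_mb", 0) < request.get("min_memory_mb", 0):
        return False
    required = set(request.get("capabilities", []))
    if not required <= set(device.get("capabilities", [])):
        return False
    ids = request.get("ids")
    if ids and device.get("id") not in ids:
        return False
    return True


def select_devices(request, inventory, in_use):
    """Pick `count` free matching devices, or None if the node has too few."""
    count = request.get("count", 1)  # count defaults to 1
    free = [
        device for device in inventory
        if device["id"] not in in_use and device_matches(request, device)
    ]
    return free[:count] if len(free) >= count else None
```

Passing the set of device IDs held by active placements as in_use is how this sketch avoids double-booking a device.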

Ports

Ports are explicit in v1. The scheduler rejects placements that would reserve the same port on the same node.

{
  "resources": {
    "ports": [
      {
        "label": "metrics",
        "port": 9100,
        "protocol": "http"
      }
    ]
  }
}

Supported protocols are tcp, udp, http, and grpc.
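The same-node conflict rule can be sketched as a set lookup against the ports already reserved by active placements. The function name is hypothetical, and comparing by port number alone (ignoring protocol) is an assumption of this example.

```python
def port_conflicts(requested_ports, reserved_ports):
    """Return requested port numbers already reserved on the node, sorted."""
    return sorted(
        {port["port"] for port in requested_ports if port["port"] in reserved_ports}
    )
```

A placement would be rejected on a node whenever this list is non-empty.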

Volumes

Volumes are host-path requirements in v1. The scheduler only places a job on a node that advertises or has the requested absolute source path. Core records the allocation and injects environment hints, but it does not mount the path automatically.

{
  "resources": {
    "volumes": [
      {
        "name": "cache",
        "source": "/var/mn-cache",
        "target": "/cache",
        "mode": "rw",
        "type": "host"
      }
    ]
  }
}

Supported modes are ro and rw. The only supported type is host.
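A minimal sketch of the placement check, assuming a node either advertises a set of host paths or can be probed on the local filesystem. The function name and the advertised_paths argument are hypothetical.

```python
import os


def node_satisfies_volumes(volumes, advertised_paths):
    """Return True if every requested source is advertised or exists locally."""
    for volume in volumes:
        source = volume["source"]
        if source in advertised_paths:
            continue
        if os.path.isabs(source) and os.path.exists(source):
            continue
        return False
    return True
```

This only decides eligibility. Consistent with v1, nothing here mounts the path.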

Runtime Environment Hints

When an agent starts, its allocation metadata is passed into the runtime context and exposed as safe environment hints:

Env var                    Meaning
MN_ALLOCATION_JSON         Full JSON allocation.
MN_ALLOCATED_DEVICE_IDS    Comma-separated selected device IDs.
CUDA_VISIBLE_DEVICES       Selected CUDA device indices.
MN_GPU_DRIVER              Selected GPU drivers.
MN_PORT_<LABEL>            Reserved explicit port by label.
MN_VOLUME_<NAME>           Allocated host volume source.
MN_VOLUME_<NAME>_TARGET    Requested target path.

Worker code should use these hints when selecting devices, ports, and model/cache paths.
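A worker might collect these hints at startup with something like the sketch below. The helper name and the shape of the returned dictionary are assumptions for this example, as is the convention that the label and name parts of the variable names are uppercased. The structure inside MN_ALLOCATION_JSON is not specified here, so the sketch only parses it.

```python
import json
import os


def read_allocation_hints(env=None):
    """Collect MirrorNeuron environment hints into one dictionary."""
    env = os.environ if env is None else env
    ports, volumes = {}, {}
    for key, value in env.items():
        if key.startswith("MN_PORT_"):
            ports[key[len("MN_PORT_"):].lower()] = int(value)
        elif key.startswith("MN_VOLUME_") and not key.endswith("_TARGET"):
            name = key[len("MN_VOLUME_"):].lower()
            volumes[name] = {"source": value, "target": env.get(key + "_TARGET")}
    return {
        "allocation": json.loads(env.get("MN_ALLOCATION_JSON", "{}")),
        "device_ids": [d for d in env.get("MN_ALLOCATED_DEVICE_IDS", "").split(",") if d],
        "cuda_visible_devices": env.get("CUDA_VISIBLE_DEVICES"),
        "gpu_driver": env.get("MN_GPU_DRIVER"),
        "ports": ports,
        "volumes": volumes,
    }
```

Because v1 does not mount volumes, a worker using this helper would typically read model or cache files from the source path and treat target only as the requested location.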

Inspect Resources

mn resource list

The response includes per-node scalar totals, combined cluster totals, device inventory, GPU memory totals, runtime drivers, and host path information when available.

Set coarse local resource limits:

mn resource set --cpu 75 --memory 75 --gpu 100 --disk 75

Validation

mn blueprint validate rejects:

  • malformed devices, ports, or volumes
  • negative memory, count, or scalar values
  • duplicate port labels
  • duplicate volume names
  • port numbers outside 1..65535
  • relative volume source or target paths
  • unsupported volume modes or protocols
  • non-string runtime_driver
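The rules above can be approximated client-side before submitting a blueprint. The sketch below is an illustrative re-implementation, not the logic in manifest.ex or the SDK helpers. The function name, the error messages, and the choice to check protocol and mode only when they are present are all assumptions.

```python
SCALAR_KEYS = ("cpu_cores", "memory_mb", "disk_mb", "gpu_count")
SUPPORTED_PROTOCOLS = {"tcp", "udp", "http", "grpc"}
SUPPORTED_MODES = {"ro", "rw"}


def _is_non_negative(value):
    return isinstance(value, (int, float)) and value >= 0


def validate_resources(resources):
    """Return a list of problems found; an empty list means no rule fired."""
    errors = []

    for key in SCALAR_KEYS:
        if key in resources and not _is_non_negative(resources[key]):
            errors.append(f"{key} must be a non-negative number")

    for device in resources.get("devices", []):
        for key in ("count", "min_memory_mb"):
            if key in device and not _is_non_negative(device[key]):
                errors.append(f"device {key} must be a non-negative number")

    seen_labels = set()
    for port in resources.get("ports", []):
        label, number = port.get("label"), port.get("port")
        if label in seen_labels:
            errors.append(f"duplicate port label: {label}")
        seen_labels.add(label)
        if not isinstance(number, int) or not 1 <= number <= 65535:
            errors.append(f"port outside 1..65535: {number}")
        if "protocol" in port and port["protocol"] not in SUPPORTED_PROTOCOLS:
            errors.append(f"unsupported protocol: {port['protocol']}")

    seen_names = set()
    for volume in resources.get("volumes", []):
        name = volume.get("name")
        if name in seen_names:
            errors.append(f"duplicate volume name: {name}")
        seen_names.add(name)
        for key in ("source", "target"):
            if not str(volume.get(key, "")).startswith("/"):
                errors.append(f"volume {key} must be an absolute path")
        if "mode" in volume and volume["mode"] not in SUPPORTED_MODES:
            errors.append(f"unsupported volume mode: {volume['mode']}")

    driver = resources.get("runtime_driver")
    if driver is not None and not isinstance(driver, str):
        errors.append("runtime_driver must be a string")

    return errors
```

Running mn blueprint validate remains the authoritative check; a helper like this is only useful for fast feedback while editing a manifest.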

Important Code

Area                                 Files
Resource shape and env hints         MirrorNeuron/lib/mirror_neuron/resource_spec.ex
Scheduler matching and allocation    MirrorNeuron/lib/mirror_neuron/scheduler.ex
Node inventory                       MirrorNeuron/lib/mirror_neuron/resource.ex
Admission and limits                 MirrorNeuron/lib/mirror_neuron/resource_admission.ex
Runtime agent context                MirrorNeuron/lib/mirror_neuron/runtime/job_coordinator.ex
Manifest validation                  MirrorNeuron/lib/mirror_neuron/manifest.ex
CLI commands                         mn-cli/mn_cli/libs/resource_cmds.py
SDK validation helpers               mn-python-sdk/mn_sdk/blueprint_validation.py

V1 Limits

  • no dynamic port allocation
  • no automatic volume mounting
  • no hard device isolation
  • no global executor lease balancer yet
  • active placement metadata is used to avoid double-booking devices and ports, but host OS enforcement remains the runner's responsibility
