# MirrorNeuron API [#mirrorneuron-api]

This document describes the read and control APIs that the CLI tools currently consume.

The goal is to keep these response shapes stable enough to support future tools such as:

* terminal monitors
* lightweight web dashboards
* automation hooks
* external operational scripts

## HTTP REST API (New) [#http-rest-api-new]

MirrorNeuron runs an embedded HTTP server (powered by Bandit and Plug) that exposes a RESTful API. The design follows resource-oriented principles similar to Apache Airflow's REST API, but it is simpler, JSON-first, and tightly coupled to MirrorNeuron's lightweight multi-agent engine.

By default, the API binds to port `4000`. You can change this using the `MN_API_PORT` environment variable.

### Base URL [#base-url]

`/api/v1`
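
As a minimal sketch of how a client might assemble the base URL, the helper below reads `MN_API_PORT` and falls back to the documented default of `4000`. The function name `api_base_url` and the `localhost` host are illustrative assumptions (the host mirrors the curl examples in this document), not part of MirrorNeuron.

```python
import os


def api_base_url(env=None, host="localhost"):
    """Return the API base URL, honoring MN_API_PORT when it is set.

    `host` is an assumption for local use; adjust it for remote nodes.
    """
    env = os.environ if env is None else env
    port = env.get("MN_API_PORT", "4000")  # 4000 is the documented default
    return f"http://{host}:{port}/api/v1"
```

Passing an explicit mapping instead of `os.environ` keeps the helper easy to exercise in scripts and tests.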

### Endpoints Overview [#endpoints-overview]

| Method   | Endpoint                                     | Description                                                                        |
| -------- | -------------------------------------------- | ---------------------------------------------------------------------------------- |
| GET      | `/api/v1/health`                             | Simple liveness check                                                              |
| GET      | `/api/v1/system/summary`                     | Returns cluster nodes and active job overview                                      |
| POST     | `/api/v1/jobs`                               | Submits a new job by providing a manifest in JSON format                           |
| GET      | `/api/v1/jobs`                               | Lists jobs (supports `limit` and `include_terminal` query parameters)              |
| GET      | `/api/v1/jobs/:job_id`                       | Returns detailed state of a running/completed job                                  |
| POST     | `/api/v1/jobs/:job_id/cancel`                | Cancels a running job                                                              |
| POST     | `/api/v1/jobs/cleanup`                       | Clears finished/cancelled jobs from the datastore                                  |
| GET      | `/api/v1/jobs/:job_id/events`                | Returns raw event history for a job                                                |
| GET      | `/api/v1/jobs/:job_id/workflow-progress`     | Returns normalized progress, failure, trace, and observability summary             |
| GET      | `/api/v1/runs/:run_id/artifacts`             | Lists run artifacts with stable IDs, size, hash, content type, and URL             |
| GET      | `/api/v1/runs/:run_id/timeline`              | Returns `mn.timeline.v1` timeline records                                          |
| GET      | `/api/v1/runs/:run_id/observability-summary` | Returns compact `mn.observability_summary.v1` run summary                          |
| POST     | `/api/v1/bundles/:bundle_id/reload`          | Manually reload a registered job bundle                                            |
| GET      | `/api/v1/resource`                           | Core resource totals and configured CPU/GPU/memory/disk limits                     |
| POST/PUT | `/api/v1/resource`                           | Set CPU/GPU/memory/disk limits                                                     |
| POST     | `/api/v1/schedules`                          | Create a runtime schedule from manifest JSON, payloads, or an uploaded bundle path |
| GET      | `/api/v1/schedules`                          | List schedules, optionally filtered by kind or status                              |
| GET      | `/api/v1/schedules/:schedule_id`             | Get one schedule                                                                   |
| PATCH    | `/api/v1/schedules/:schedule_id`             | Update schedule attributes                                                         |
| POST     | `/api/v1/schedules/:schedule_id/pause`       | Pause a schedule                                                                   |
| POST     | `/api/v1/schedules/:schedule_id/resume`      | Resume a schedule                                                                  |
| DELETE   | `/api/v1/schedules/:schedule_id`             | Delete a schedule                                                                  |
| POST     | `/api/v1/schedules/:schedule_id/dispatch`    | Dispatch a schedule immediately                                                    |
| POST     | `/api/v1/events`                             | Emit a runtime trigger event                                                       |
| GET      | `/api/v1/events`                             | List recent runtime trigger events                                                 |

### Design Decisions & Differences from Airflow [#design-decisions--differences-from-airflow]

* **Simplicity over Ceremony**: Airflow's REST API is heavy and enterprise-oriented. MirrorNeuron's API is lean, uses standard query parameters, and maps directly to internal monitor boundaries.
* **Explicit Status Fields**: The `status` field drives logic directly (e.g., `pending`, `queued`, `running`, `completed`, `failed`, `cancelled`).
* **Control Plane Separation**: The HTTP layer is only a translation boundary into internal Elixir primitives and performs no business logic itself.

## gRPC And SDK Operator Surfaces [#grpc-and-sdk-operator-surfaces]

The CLI and Python SDK primarily use gRPC. The gRPC server exposes JSON-safe methods for the newer orchestration features:

| Area                  | Surface                                                                                     |
| --------------------- | ------------------------------------------------------------------------------------------- |
| Reconciliation        | `ReconcileNode`                                                                             |
| Drain and maintenance | `DrainNode`, `CancelNodeDrain`, `SetNodeMaintenance`, `GetNodeDrainStatus`                  |
| Services              | `ListServices`, `ResolveService`, `CheckServices`                                           |
| Resources             | resource get/set methods                                                                    |
| Deployments           | deploy, update, list, status, promote, rollback, pause, resume, fail methods                |
| Schedules and events  | create, update, list, status, pause, resume, delete, dispatch, emit, and list event methods |

Implementation entry points:

* `MirrorNeuron/lib/mirror_neuron.ex`
* `MirrorNeuron/lib/mirror_neuron_grpc/server.ex`
* `mn-python-sdk/mn_sdk/client.py`
* `mn-cli/mn_cli/main.py`

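### API Examples [#api-examples]

The examples in this section use the HTTP REST API endpoints listed in the Endpoints Overview, not the gRPC surface.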

#### 1. System Health [#1-system-health]

```bash
curl -s http://localhost:4000/api/v1/health
```

**Response (200 OK):**

```json
{
  "status": "ok"
}
```
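
The same liveness check could be scripted with the Python standard library. This is a hedged sketch: `is_healthy` and its `opener` parameter are hypothetical names introduced here so the function can be exercised without a running server, and it treats any network or parse error as "not healthy".

```python
import json
import urllib.request


def is_healthy(base_url="http://localhost:4000/api/v1",
               opener=urllib.request.urlopen, timeout=2.0):
    """Return True when GET /health answers with {"status": "ok"}."""
    try:
        with opener(f"{base_url}/health", timeout=timeout) as response:
            body = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # URLError is an OSError; ValueError covers malformed JSON or bad UTF-8.
        return False
    return isinstance(body, dict) and body.get("status") == "ok"
```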

#### 2. System Summary [#2-system-summary]

```bash
curl -s http://localhost:4000/api/v1/system/summary
```

**Response (200 OK):**

```json
{
  "nodes": [
    {
      "name": "mn1@192.168.4.183",
      "connected_nodes": ["mn1@192.168.4.183"],
      "self?": true,
      "scheduler_hint": "cluster_member",
      "executor_pools": {
        "default": { "capacity": 2, "available": 1, "in_use": 1, "queued": 0, "active": 1 }
      }
    }
  ],
  "jobs": [
    {
      "job_id": "prime_sweep_40_workers-...",
      "status": "running"
    }
  ]
}
```
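
A dashboard or script might reduce this response to per-pool totals. The sketch below assumes each node entry reports only its own executor pools, so summing across nodes does not double count; that assumption is not confirmed by this document. The function name `pool_utilization` is illustrative.

```python
def pool_utilization(summary):
    """Sum capacity, in_use, and queued per pool name across all nodes."""
    totals = {}
    for node in summary.get("nodes", []):
        for pool_name, stats in node.get("executor_pools", {}).items():
            pool = totals.setdefault(
                pool_name, {"capacity": 0, "in_use": 0, "queued": 0}
            )
            for key in pool:
                # Missing counters are treated as zero.
                pool[key] += stats.get(key, 0)
    return totals
```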

#### 3. Submit a Job [#3-submit-a-job]

Provide a fully resolved JSON manifest.

```bash
curl -X POST http://localhost:4000/api/v1/jobs \
  -H "Content-Type: application/json" \
  -d '{
    "manifest_version": "1.0",
    "graph_id": "simple",
    "entrypoints": ["router"],
    "nodes": [
      {
        "node_id": "router",
        "agent_type": "router",
        "role": "root_coordinator"
      }
    ]
  }'
```

**Response (201 Created):**

```json
{
  "id": "simple-12345...",
  "status": "pending"
}
```
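
For callers that prefer Python over curl, the request above could be built with `urllib.request`. This is only a sketch: `build_submit_request` is a hypothetical helper, and it constructs the request without sending it so the shape can be inspected first.

```python
import json
import urllib.request


def build_submit_request(manifest, base_url="http://localhost:4000/api/v1"):
    """Build (but do not send) a POST /jobs request carrying a JSON manifest."""
    return urllib.request.Request(
        f"{base_url}/jobs",
        data=json.dumps(manifest).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it with `urllib.request.urlopen(request)` should yield the `201 Created` body shown above when the manifest is accepted.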

#### 4. List Jobs [#4-list-jobs]

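Accepts these query parameters:

* `limit`: maximum number of jobs to return, e.g. `limit=20` (default: unlimited)
* `include_terminal`: whether to include jobs in a terminal status, e.g. `include_terminal=false` (default: `true`)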

```bash
curl -s "http://localhost:4000/api/v1/jobs?limit=5"
```

**Response (200 OK):**

```json
{
  "data": [
    {
      "job_id": "prime_sweep_40_workers-...",
      "graph_id": "prime_sweep_40_workers",
      "status": "completed",
      "submitted_at": "2026-03-28T11:00:00.000Z",
      "updated_at": "2026-03-28T11:00:12.000Z"
    }
  ]
}
```
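
The query parameters can be encoded programmatically. The sketch below uses `urllib.parse.urlencode` and serializes the boolean as lowercase `true`/`false` to match the documented form; `list_jobs_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def list_jobs_url(base_url="http://localhost:4000/api/v1",
                  limit=None, include_terminal=None):
    """Build the GET /jobs URL, omitting parameters left at their defaults."""
    params = {}
    if limit is not None:
        params["limit"] = int(limit)
    if include_terminal is not None:
        params["include_terminal"] = "true" if include_terminal else "false"
    query = urlencode(params)
    return f"{base_url}/jobs" + (f"?{query}" if query else "")
```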

#### 5. Get Job Details [#5-get-job-details]

```bash
curl -s http://localhost:4000/api/v1/jobs/prime_sweep_40_workers-...
```

**Response (200 OK):**

```json
{
  "job": { ... },
  "summary": { ... },
  "agents": [ ... ],
  "recent_events": [ ... ],
  "sandboxes": [ ... ]
}
```

#### 6. Cancel Job [#6-cancel-job]

```bash
curl -X POST http://localhost:4000/api/v1/jobs/prime_sweep_40_workers-.../cancel
```

**Response (200 OK):**

```json
{
  "status": "cancelled",
  "job_id": "prime_sweep_40_workers-..."
}
```

#### 7. Cleanup Jobs [#7-cleanup-jobs]

Clears finished, failed, and cancelled jobs from the datastore. Add `?all=true` to forcibly clear all jobs including currently running ones.

```bash
curl -X POST http://localhost:4000/api/v1/jobs/cleanup
```

**Response (200 OK):**

```json
{
  "deleted_count": 2,
  "deleted_jobs": ["job_1", "job_2"]
}
```

#### 8. Job Events [#8-job-events]

```bash
curl -s http://localhost:4000/api/v1/jobs/prime_sweep_40_workers-.../events
```

**Response (200 OK):**

```json
{
  "data": [
    {
      "timestamp": "2026-03-28T11:00:04.000Z",
      "type": "sandbox_job_completed",
      "agent_id": "prime_worker_0001",
      "payload": { ... }
    }
  ]
}
```
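
Because the event history is raw, clients usually summarize it. As an illustrative sketch (the function name is an assumption), the helper below counts events by `type` from a parsed response of the shape shown above.

```python
from collections import Counter


def event_type_counts(events_response):
    """Count events per `type` in a parsed /jobs/:job_id/events response."""
    events = events_response.get("data", [])
    # Events without a type are grouped under "unknown" rather than dropped.
    return Counter(event.get("type", "unknown") for event in events)
```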

#### 9. Reload Bundle [#9-reload-bundle]

Forces a re-scan and reload of a registered bundle, computing its fingerprint and updating it in memory if any changes occurred.

```bash
curl -X POST http://localhost:4000/api/v1/bundles/prime_sweep_40_workers/reload
```

**Response (200 OK):**

```json
{
  "bundle_id": "prime_sweep_40_workers",
  "changed": true,
  "reloaded": true,
  "previous_fingerprint": "a1b2c3d4...",
  "current_fingerprint": "e5f6g7h8...",
  "reason": "api_request",
  "message": "Bundle reloaded successfully",
  "timestamp": "2026-03-28T11:00:00.000Z"
}
```

## Public Elixir API [#public-elixir-api]

These functions are exposed from [MirrorNeuron](../MirrorNeuron/lib/mirror_neuron.ex).

### Job execution [#job-execution]

#### `MirrorNeuron.validate_manifest(input)` [#mirrorneuronvalidate_manifestinput]

Validates a job bundle folder.

Input:

* `input :: String.t()` path to a job folder

Return:

* `{:ok, bundle}`
* `{:error, reason}`

The bundle includes:

* `root_path`
* `manifest_path`
* `payloads_path`
* `manifest`

#### `MirrorNeuron.run_manifest(input, opts \\ [])` [#mirrorneuronrun_manifestinput-opts--]

Submits a job bundle for execution.

Important options:

* `await: boolean`
* `timeout: integer | :infinity`
* `json: boolean`
* `job_bundle: bundle` (internal/advanced path)

Return:

* `{:ok, job_id}` when `await: false`
* `{:ok, job_id, job}` when `await: true`
* `{:error, reason}`

#### `MirrorNeuron.wait_for_job(job_id, timeout \\ :infinity)` [#mirrorneuronwait_for_jobjob_id-timeout--infinity]

Waits for terminal status:

* `completed`
* `failed`
* `cancelled`

Return:

* `{:ok, job_map}`
* `{:error, reason}`
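
Clients outside the BEAM, such as scripts that talk to the HTTP API, can approximate this behavior by polling until one of the terminal statuses appears. The sketch below is not how `MirrorNeuron.wait_for_job/2` is implemented; `wait_for_terminal` and the `fetch_status` callable are hypothetical, and the polling interval is an arbitrary choice.

```python
import time

# Terminal statuses as listed for MirrorNeuron.wait_for_job/2.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_terminal(fetch_status, timeout=None, interval=1.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll `fetch_status()` until it returns a terminal status.

    `timeout=None` waits indefinitely, loosely mirroring `:infinity`.
    Raises TimeoutError when the deadline passes first.
    """
    deadline = None if timeout is None else clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if deadline is not None and clock() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout} seconds")
        sleep(interval)
```

In practice `fetch_status` would wrap a `GET /api/v1/jobs/:job_id` call and return the job's `status` field.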

### Inspection [#inspection]

#### `MirrorNeuron.inspect_job(job_id)` [#mirrorneuroninspect_jobjob_id]

Reads the persisted job record from Redis.

Return:

* `{:ok, job_map}`
* `{:error, reason}`

Typical job fields:

```json
{
  "job_id": "prime_sweep_40_workers-...",
  "graph_id": "prime_sweep_40_workers",
  "job_name": null,
  "status": "completed",
  "submitted_at": "2026-03-28T11:00:00.000Z",
  "updated_at": "2026-03-28T11:00:12.000Z",
  "placement_policy": "local",
  "recovery_policy": "local_restart",
  "root_agent_ids": ["dispatcher"],
  "result": {},
  "manifest_ref": {
    "graph_id": "prime_sweep_40_workers",
    "manifest_version": "1.0",
    "manifest_path": "/abs/path/manifest.json",
    "job_path": "/abs/path/job-folder"
  }
}
```

#### `MirrorNeuron.inspect_agents(job_id)` [#mirrorneuroninspect_agentsjob_id]

Reads persisted agent snapshots.

Return:

* `{:ok, [agent_snapshot]}`
* `{:error, reason}`

Typical agent fields:

```json
{
  "agent_id": "prime_worker_0001",
  "agent_type": "executor",
  "type": "map",
  "assigned_node": "mn1@192.168.4.183",
  "processed_messages": 1,
  "mailbox_depth": 0,
  "current_state": {
    "runs": 1,
    "last_result": {
      "sandbox_name": "mirror-neuron-job-...",
      "lease": {
        "lease_id": "...",
        "pool": "default",
        "slots": 1
      }
    }
  },
  "metadata": {
    "paused": false,
    "outbound_edges": ["aggregator"]
  }
}
```

#### `MirrorNeuron.events(job_id)` [#mirrorneuroneventsjob_id]

Reads the Redis-backed append-only event list for the job.

Return:

* `{:ok, [event]}`
* `{:error, reason}`

Typical event fields:

```json
{
  "timestamp": "2026-03-28T11:00:04.000Z",
  "type": "sandbox_job_completed",
  "agent_id": "prime_worker_0001",
  "payload": {
    "sandbox_name": "mirror-neuron-job-...",
    "exit_code": 0,
    "pool": "default"
  }
}
```

#### `MirrorNeuron.inspect_nodes()` [#mirrorneuroninspect_nodes]

Returns cluster node summaries with executor pool stats.

Return:

* `[%{...}]`

Typical fields:

```json
[
  {
    "name": "mn1@192.168.4.183",
    "connected_nodes": ["mn1@192.168.4.183", "mn2@192.168.4.35"],
    "self?": true,
    "scheduler_hint": "cluster_member",
    "executor_pools": {
      "default": {
        "capacity": 2,
        "available": 1,
        "in_use": 1,
        "queued": 0,
        "active": 1
      }
    }
  }
]
```

### Control [#control]

#### `MirrorNeuron.pause(job_id)` [#mirrorneuronpausejob_id]

#### `MirrorNeuron.resume(job_id)` [#mirrorneuronresumejob_id]

#### `MirrorNeuron.cancel(job_id)` [#mirrorneuroncanceljob_id]

#### `MirrorNeuron.send_message(job_id, agent_id, message)` [#mirrorneuronsend_messagejob_id-agent_id-message]

`pause/1`, `resume/1`, `cancel/1`, and `send_message/3` are the control-plane mutation APIs currently used by the main CLI.

## Monitor API [#monitor-api]

These functions are implemented in [monitor.ex](../MirrorNeuron/lib/mirror_neuron/monitor.ex) and are intended as the stable read model for operational tooling.

### `MirrorNeuron.list_jobs(opts \\ [])` [#mirrorneuronlist_jobsopts--]

Returns enriched job summaries.

Supported options:

* `limit: integer`
* `include_terminal: boolean`

Return:

* `{:ok, [job_summary]}`
* `{:error, reason}`

`job_summary` includes:

* `job_id`
* `graph_id`
* `job_name`
* `status`
* `submitted_at`
* `updated_at`
* `placement_policy`
* `recovery_policy`
* `executor_count`
* `active_executors`
* `nodes`
* `sandbox_names`
* `last_event`

### `MirrorNeuron.job_details(job_id, opts \\ [])` [#mirrorneuronjob_detailsjob_id-opts--]

Returns the full monitor detail view for one job.

Supported options:

* `event_limit: integer` default `25`

Return:

* `{:ok, details}`
* `{:error, reason}`

`details` includes:

* `job`
* `summary`
* `agents`
* `sandboxes`
* `recent_events`

Each agent entry includes:

* `agent_id`
* `agent_type`
* `type`
* `assigned_node`
* `status`
* `running?`
* `processed_messages`
* `mailbox_depth`
* `paused?`
* `last_error`
* `sandbox_name`
* `lease`
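
A monitoring tool might use these agent fields to highlight entries that deserve a closer look. The sketch below assumes the details were obtained as JSON with the field names listed above; the function name and the mailbox depth threshold are illustrative choices, not values defined by MirrorNeuron.

```python
def agents_needing_attention(details, max_mailbox_depth=100):
    """Return (agent_id, reasons) pairs for agents that look unhealthy."""
    flagged = []
    for agent in details.get("agents", []):
        reasons = []
        if agent.get("last_error"):
            reasons.append("last_error")
        if agent.get("paused?"):
            reasons.append("paused")
        # A missing or null mailbox_depth is treated as zero.
        if (agent.get("mailbox_depth") or 0) > max_mailbox_depth:
            reasons.append("mailbox_depth")
        if reasons:
            flagged.append((agent.get("agent_id"), reasons))
    return flagged
```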

### `MirrorNeuron.cluster_overview(opts \\ [])` [#mirrorneuroncluster_overviewopts--]

Convenience call that combines:

* `MirrorNeuron.inspect_nodes/0`
* `MirrorNeuron.list_jobs/1`

Return:

* `{:ok, %{"nodes" => [...], "jobs" => [...]}}`
* `{:error, reason}`

## Redis persistence keys [#redis-persistence-keys]

The current monitor API is backed by these Redis structures in [redis_store.ex](../MirrorNeuron/lib/mirror_neuron/persistence/redis_store.ex).

Namespace prefix:

* `mirror_neuron` by default
* configurable through `:redis_namespace`

Key shapes:

* `mirror_neuron:jobs`
  * Redis set of known job ids
* `mirror_neuron:job:<job_id>`
  * JSON job record
* `mirror_neuron:job:<job_id>:events`
  * Redis list of JSON events
* `mirror_neuron:job:<job_id>:agents`
  * Redis set of agent ids
* `mirror_neuron:job:<job_id>:agent:<agent_id>`
  * JSON agent snapshot

Pub/Sub channel:

* `mirror_neuron:channel:events:<job_id>`

This event channel is written today but not yet consumed by the terminal monitor. It is the best candidate for future live dashboards.
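
For low-level tooling, the key shapes above can be generated rather than hand-typed. This sketch assumes that a custom `:redis_namespace` simply replaces the `mirror_neuron` prefix, which this document implies but does not spell out; `redis_keys` is a hypothetical helper.

```python
def redis_keys(job_id, agent_id=None, namespace="mirror_neuron"):
    """Return the Redis key and channel names used for one job."""
    keys = {
        "jobs": f"{namespace}:jobs",                       # set of job ids
        "job": f"{namespace}:job:{job_id}",                # JSON job record
        "events": f"{namespace}:job:{job_id}:events",      # list of JSON events
        "agents": f"{namespace}:job:{job_id}:agents",      # set of agent ids
        "events_channel": f"{namespace}:channel:events:{job_id}",  # Pub/Sub
    }
    if agent_id is not None:
        # JSON agent snapshot for a single agent.
        keys["agent"] = f"{namespace}:job:{job_id}:agent:{agent_id}"
    return keys
```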

## Failure Model [#failure-model]

Job details, workflow progress, failure events, and compact summaries expose a shared `failure` object using `mn.error.v1`. Runtime events use `error: mn.error.v1`; SDK/API progress responses normalize that into top-level `failure` plus step and agent `failure` fields. Legacy `reason` and `status_reason` remain for compatibility and should be treated as display strings derived from `failure.desc` when available.
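
A client that needs a single display string could follow the precedence described above: prefer `failure.desc`, then fall back to the legacy fields. The sketch below is illustrative; the order between `reason` and `status_reason` is an assumption, and `failure_text` is a hypothetical name.

```python
def failure_text(record):
    """Return a human-readable failure string, or None if nothing is set."""
    failure = record.get("failure")
    if isinstance(failure, dict) and failure.get("desc"):
        return failure["desc"]
    # Legacy compatibility fields; their relative order here is an assumption.
    return record.get("reason") or record.get("status_reason")
```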

Compact job details and workflow progress also expose `trace_id` and `observability_summary` when a run store exists. The summary uses `mn.observability_summary.v1` and includes status, duration, trace id, event/log/error/warning/timeline/artifact counts, retry count, failed step or agent when known, resource peaks, token totals, and artifact links. The normalized execution timeline is available from `/api/v1/runs/:run_id/timeline` and from the run stream when the `timeline` channel is requested.

Run artifact listings include observability artifacts with stable IDs and download metadata:

* `events_jsonl`, `logs_jsonl`, `errors_jsonl`, `timeline_jsonl`, `timeline_json`, `observability_summary_json`
* Rotated segments such as `events_jsonl_001`, `logs_jsonl_001`, `errors_jsonl_001`

Each artifact entry includes size, SHA-256 hash, content type, and URL. Clients should link to `errors.jsonl`, `events.jsonl`, `logs.jsonl`, and `timeline.jsonl` instead of embedding large log blobs in job detail views.
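
When listing artifacts, it can help to group rotated segments with their base artifact. The sketch below splits an artifact ID into a base name and an optional segment number, assuming rotated IDs always end in an underscore followed by digits, as in the examples above. The helper name is hypothetical.

```python
import re

# Matches IDs such as "events_jsonl_001": a base name, "_", then digits.
_ROTATED = re.compile(r"^(?P<base>.+)_(?P<segment>\d+)$")


def split_artifact_id(artifact_id):
    """Return (base_id, segment_number) or (artifact_id, None) if unrotated."""
    match = _ROTATED.match(artifact_id)
    if match:
        return match.group("base"), int(match.group("segment"))
    return artifact_id, None
```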

## Terminal CLI [#terminal-cli]

The user-facing CLI is `mn`.

Common commands:

```bash
mn node list
mn job status <job_id>
mn job monitor <job_id>
mn job cancel <job_id>
```

The CLI uses the Python SDK over gRPC for most control paths. See [CLI Reference](cli.md).

## Stability guidance [#stability-guidance]

For future tools, prefer consuming:

1. `MirrorNeuron.list_jobs/1`
2. `MirrorNeuron.job_details/2`
3. `MirrorNeuron.cluster_overview/1`

Avoid coupling directly to raw Redis keys unless you are building low-level operational tooling.