Operation module stable¶
Purpose & Scope¶
The Operation module models one execution of an episodic operational task: bakeout, calibration sweep, optical alignment, beam-mode change, recovery procedure, ID maintenance, KB switching. Operators register a Procedure, start it, append per-step records (setpoint applied, action performed, check verified), then close it via complete, abort, or truncate. Both instrument-level and facility-envelope procedures share this aggregate.
A Procedure is distinct from a Run: a Run executes one Plan against a Subject through the experiment lifecycle (ISA-88 batch lens); a Procedure executes one episodic task that may or may not be bound to a Run (ISA-106 lens). When parent_run_id is set, the Procedure is a Phase-of-Run (calibration sweep invoked mid-Run); when None, it stands alone (bakeout run between Runs).
Out of scope
- Held / Resumed transitions. No pause-and-resume cycle today. The pilot will surface whether operators need it; the additive-state pattern keeps the door open.
- Verifying as a first-class FSM state. Per-step Check happens inside Running synchronously; the standards corpus does not bless a separate Verifying state.
- Per-kind payload validation at the API. The step
payloadbody isdict[str, Any]today; per-kind Pydantic models land once pilot vocabulary settles. - Asset-existence verification at register time.
target_asset_idsis taken at face value; existence and decommission-state gating runs at start-procedure time.
Aggregates¶
| Name | Identity | State summary | FSM |
|---|---|---|---|
Procedure |
id: UUID (opaque) |
name, kind, target asset ids, status, optional parent_run_id, optional steps_logbook_id, optional capability_id |
yes |
Value Objects¶
| Name | Shape | Where used |
|---|---|---|
ProcedureName |
trimmed bounded text, 1-200 chars | Procedure.name |
ProcedureAbortReason |
trimmed bounded text, 1-500 chars; decider-input only | abort_procedure body |
ProcedureTruncateReason |
trimmed bounded text, 1-500 chars; decider-input only | truncate_procedure body |
ProcedureStatus |
closed StrEnum {Defined, Running, Completed, Aborted, Truncated} |
Procedure.status |
StepKind |
closed Literal["setpoint", "action", "check"] |
per-step entry rows |
Procedure.kind is a bare str (1-50 chars, validated at the decider) rather than a VO, mirroring the Supply.kind precedent: pilot vocabulary will settle and the field will graduate to a closed ProcedureKind StrEnum later. Documented starter vocabulary: bakeout, calibration, alignment, recovery, beam_mode_change, id_maintenance, kb_switching, optical_alignment, vacuum_regeneration.
FSM¶
stateDiagram-v2
[*] --> Defined: register_procedure
Defined --> Running: start_procedure
Running --> Completed: complete_procedure
Running --> Aborted: abort_procedure
Running --> Truncated: truncate_procedure
Completed --> [*]
Aborted --> [*]
Truncated --> [*]
| From | To | Command | Event |
|---|---|---|---|
[*] |
Defined |
register_procedure |
ProcedureRegistered |
Defined |
Running |
start_procedure |
ProcedureStarted |
Running |
Completed |
complete_procedure |
ProcedureCompleted |
Running |
Aborted |
abort_procedure |
ProcedureAborted |
Running |
Truncated |
truncate_procedure |
ProcedureTruncated |
Guards. Beyond the source-state check, each transition enforces:
start_procedure- Re-loads every target Asset and refuses to start if any are Decommissioned, raising
ProcedureAssetDecommissionedError. Bound Capability (whencapability_idis set) must listProcedurein itsexecutor_shapes, otherwiseProcedureCapabilityExecutorMismatchError. abort_procedure/truncate_procedurereasonis REQUIRED, trimmed, 1-500 chars.truncate_procedureaccepts an optionalinterrupted_at(operator's best guess at the actual interruption time); validated to be not later thannow.append_procedure_step- Status must be
Running. Appending to aDefined,Completed,Aborted, orTruncatedProcedure raisesProcedureStepsLogbookClosedError(the steps logbook is implicitly closed on every terminal).step_kindmust be one ofsetpoint,action,check. Producer-suppliedevent_iddeduplicates retries silently viaON CONFLICT (event_id) DO NOTHING.
Events¶
| Event | Payload sketch | When emitted |
|---|---|---|
ProcedureRegistered |
procedure_id, name, kind, target_asset_ids, parent_run_id?, capability_id?, occurred_at |
register_procedure accepted; status implicitly Defined. |
ProcedureStarted |
procedure_id, occurred_at |
start_procedure accepted (Defined → Running). |
ProcedureStepsLogbookOpened |
procedure_id, logbook_id, kind="steps", schema, occurred_at |
First append_procedure_step call for the Procedure (lazy open). |
ProcedureCompleted |
procedure_id, occurred_at |
complete_procedure accepted (Running → Completed). |
ProcedureAborted |
procedure_id, reason, occurred_at |
abort_procedure accepted (Running → Aborted). |
ProcedureTruncated |
procedure_id, reason, interrupted_at?, occurred_at |
truncate_procedure accepted (Running → Truncated). |
Per-step records (one row per setpoint, action, or check) write directly to the entries_operation_procedure_steps table via the StepStore port, NOT as events on the Procedure stream. No ProcedureStepsLogbookClosed event is emitted; the FSM terminal IS the close signal.
Slices¶
| Command | Category | REST | MCP tool | Idempotency |
|---|---|---|---|---|
RegisterProcedure |
NEW | POST /procedures |
register_procedure |
required |
StartProcedure |
MODIFIED | POST /procedures/{procedure_id}/start |
start_procedure |
none |
AppendProcedureStep |
MODIFIED | POST /procedures/{procedure_id}/steps |
append_procedure_step |
producer-supplied event_id per entry |
CompleteProcedure |
MODIFIED | POST /procedures/{procedure_id}/complete |
complete_procedure |
none |
AbortProcedure |
MODIFIED | POST /procedures/{procedure_id}/abort |
abort_procedure |
none |
TruncateProcedure |
MODIFIED | POST /procedures/{procedure_id}/truncate |
truncate_procedure |
none |
GetProcedure |
QUERY | GET /procedures/{procedure_id} |
get_procedure |
none |
ListProcedures |
QUERY | GET /procedures |
list_procedures |
none |
Errors per slice. Beyond Pydantic boundary 422s, each slice raises:
RegisterProcedureProcedureAlreadyExistsError,InvalidProcedureNameError,InvalidProcedureKindError,UnauthorizedStartProcedureProcedureNotFoundError,ProcedureCannotStartError,ProcedureAssetDecommissionedError,ProcedureCapabilityExecutorMismatchError,UnauthorizedAppendProcedureStepProcedureNotFoundError,ProcedureStepsLogbookClosedError,InvalidStepKindError,UnauthorizedCompleteProcedure/AbortProcedure/TruncateProcedureProcedureNotFoundError,ProcedureCannot<Verb>Error(single-source fromRunning),Unauthorized. Abort additionally raisesInvalidProcedureAbortReasonError; Truncate additionally raisesInvalidProcedureTruncateReasonErrorandInvalidProcedureInterruptedAtError.GetProcedureProcedureNotFoundErrorListProcedures- (boundary 422 only)
Storage & Projections¶
proj_operation_procedure_summary:
CREATE TABLE proj_operation_procedure_summary (
procedure_id UUID PRIMARY KEY,
name TEXT NOT NULL,
kind TEXT NOT NULL,
target_asset_ids UUID[] NOT NULL DEFAULT '{}',
parent_run_id UUID,
status TEXT NOT NULL CHECK (
status IN ('Defined', 'Running', 'Completed', 'Aborted', 'Truncated')
),
steps_logbook_id UUID,
registered_at TIMESTAMPTZ NOT NULL,
last_status_changed_at TIMESTAMPTZ,
last_status_reason TEXT,
interrupted_at TIMESTAMPTZ,
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX proj_operation_procedure_summary_keyset_idx
ON proj_operation_procedure_summary (registered_at, procedure_id);
CREATE INDEX proj_operation_procedure_summary_target_assets_gin_idx
ON proj_operation_procedure_summary USING GIN (target_asset_ids);
last_status_changed_at updates on every transition out of Defined; last_status_reason is populated by Aborted and Truncated only (Completed is happy-path, no reason). interrupted_at is Truncated-only and carries the operator's best guess at when the actual interruption happened (distinct from last_status_changed_at, which is when the truncate command was processed). steps_logbook_id is NULL until the first step is appended and is set by ProcedureStepsLogbookOpened independently of any lifecycle transition.
entries_operation_procedure_steps:
CREATE TABLE entries_operation_procedure_steps (
event_id uuid PRIMARY KEY,
procedure_id uuid NOT NULL,
logbook_id uuid NOT NULL,
actor_id uuid NOT NULL,
command_name text NOT NULL,
step_kind text NOT NULL,
payload jsonb NOT NULL,
sampled_at timestamptz NOT NULL,
occurred_at timestamptz NOT NULL,
correlation_id uuid NOT NULL,
causation_id uuid,
recorded_at timestamptz NOT NULL DEFAULT now()
);
CREATE INDEX entries_operation_procedure_steps_proc_sampled_idx
ON entries_operation_procedure_steps (procedure_id, sampled_at DESC);
CREATE INDEX entries_operation_procedure_steps_proc_kind_sampled_idx
ON entries_operation_procedure_steps (procedure_id, step_kind, sampled_at DESC);
CREATE INDEX entries_operation_procedure_steps_logbook_idx
ON entries_operation_procedure_steps (logbook_id);
CREATE INDEX entries_operation_procedure_steps_recorded_at_brin_idx
ON entries_operation_procedure_steps USING BRIN (recorded_at);
REVOKE UPDATE, DELETE, TRUNCATE ON entries_operation_procedure_steps FROM cora_app;
Polymorphic-with-discriminator: one row per step, with step_kind discriminating between setpoint, action, and check, and the per-kind body shape carried in the payload jsonb column. The table is append-only at the role level (UPDATE / DELETE / TRUNCATE revoked); event_id is the producer-supplied UUIDv7 idempotency key, so retrying a step submission with the same id is a silent no-op via ON CONFLICT (event_id) DO NOTHING. Three timestamps are recorded per entry: sampled_at (when the step physically happened in the field), occurred_at (when the handler processed the append), and recorded_at (when Postgres wrote the row).
Cross-Module boundaries¶
| Module | Relationship | What's exchanged |
|---|---|---|
Equipment |
reads-from | target_asset_ids references Asset aggregates. Existence and Decommissioned-lifecycle gating runs at start_procedure time via ProcedureStartContext, NOT at register-time. |
Run |
reads-from | Optional parent_run_id resolves the Phase-of-Run question: a Procedure with parent_run_id set is a Phase invoked mid-Run; None is a standalone Procedure. The Operation module does not validate the reference. |
Recipe |
reads-from | Optional capability_id binds a Procedure to the universal Capability template. The bound Capability must list Procedure in its executor_shapes, enforced at start_procedure. |
Examples¶
The four examples below follow the canonical Procedure path: register a calibration sweep targeting one Asset, start it, append one setpoint step + one check step, then complete it. The append_procedure_step slice carries producer-supplied event_id per entry for safe retries (Idempotency-Key is not used at this slice). For the REST/MCP equivalence, auth, and idempotency conventions these examples share, see Reading the examples on the Modules landing page.
Register a Procedure¶
POST /procedures
Content-Type: application/json
Idempotency-Key: 9a7d2c3e-4b1f-4f6a-8a2e-5c2c4f3a7b91
X-Principal-Id: 7b1f2d4e-2a3c-4d5e-8f9a-1b2c3d4e5f60
{
"name": "Beamline 35-BM rotary stage calibration sweep",
"kind": "calibration",
"target_asset_ids": ["c1f2d3c4-b5a6-4978-8869-7a6b5c4d3e2f"]
}
A successful call returns 201 Created with {"procedure_id": "<uuid>"}. The Procedure starts in Defined.
Start the Procedure¶
A successful call returns 204 No Content. Status moves to Running; the handler pre-loads each target Asset and refuses to start if any are Decommissioned.
Append a setpoint and a check step¶
POST /procedures/{procedure_id}/steps
Content-Type: application/json
X-Principal-Id: 7b1f2d4e-2a3c-4d5e-8f9a-1b2c3d4e5f60
{
"entries": [
{
"event_id": "0190f001-aaaa-7000-8000-000000000001",
"step_kind": "setpoint",
"payload": {
"channel": "rotary.theta",
"target_value": 90.0,
"units": "deg",
"ramp_rate": 5.0
},
"sampled_at": "2026-05-20T14:32:11Z"
},
{
"event_id": "0190f001-aaaa-7000-8000-000000000002",
"step_kind": "check",
"payload": {
"channel": "rotary.theta",
"expected": 90.0,
"actual": 89.998,
"tolerance": 0.01,
"passed": true
},
"sampled_at": "2026-05-20T14:32:18Z"
}
]
}
A successful call returns 200 OK with {"event_count": 2}. The first call also emits ProcedureStepsLogbookOpened on the Procedure stream (lazy open). Re-issuing the same event_id values silently dedupes via ON CONFLICT (event_id) DO NOTHING.
mcp.call_tool(
"append_procedure_step",
{
"procedure_id": "<uuid>",
"entries": [
{
"event_id": "0190f001-aaaa-7000-8000-000000000001",
"step_kind": "setpoint",
"payload": {
"channel": "rotary.theta",
"target_value": 90.0,
"units": "deg",
"ramp_rate": 5.0,
},
"sampled_at": "2026-05-20T14:32:11Z",
},
{
"event_id": "0190f001-aaaa-7000-8000-000000000002",
"step_kind": "check",
"payload": {
"channel": "rotary.theta",
"expected": 90.0,
"actual": 89.998,
"tolerance": 0.01,
"passed": True,
},
"sampled_at": "2026-05-20T14:32:18Z",
},
],
},
)
Returns the same response shape as the REST call.
Complete the Procedure¶
A successful call returns 204 No Content. Status moves to Completed; the steps logbook is implicitly closed (subsequent append_procedure_step calls return 409 Conflict).