Caution module stable¶
Purpose & Scope¶
The Caution module records operator-authored tribal-knowledge notes about equipment quirks and procedural gotchas. A Caution is the digital twin of the kind of warning that historically lived in a lab notebook or a Slack thread: "the hexapod stalls below 0.5 mm/s after thermal-soak finishes; wait 10 minutes or override the soak"; "do not skip the encoder home before tomography or the rotation center walks during long scans"; "channel 3 reads 0.2 mA low after the last calibration; subtract before plotting".
Cautions are lightweight and operator-owned. They do not gate work the way Safety Clearances do; instead they show up as banners at the moment work is about to start, so the operator about to launch a Run sees what the previous shift learned the hard way.
A Caution carries five roles:
- Identity for one operator note, stable across supersession chains. The Caution id is the internal opaque handle; there is no facility-minted external id today because cautions are internal operational artifacts.
- A lightweight finite lifecycle with a three-state machine: a Caution is Active when authored, can be Superseded by a newer Caution carrying revised text or workaround, or Retired when no longer applicable. There is no Approved / Review / Rejected ceremony; the author is the operator who saw the issue.
- A polymorphic target. A Caution attaches to either an Asset or a Procedure today. Run-level and Subject-level targets are deferred to a later iteration.
- Structured payload. Each Caution carries a closed Category (one of six), a closed Severity (one of three, ANSI Z535-downshifted), a free-text body, a REQUIRED free-text workaround, free tags, an optional expiry, and an opt-in flag for propagation down the Asset hierarchy.
- Non-blocking read-side surfaces. The Asset and Procedure detail pages eager-load the Active Cautions for the resource. GET /cautions lists them with status and target filters. Run.start queries Active Cautions covering the Run's scope and surfaces them as a banner on the response, but never refuses to start.
Out of scope
- Blocking authority. Cautions never refuse to start a Run or Procedure. Anything that needs to actually stop work belongs in the Safety module as a Clearance.
- In-place text edit. There is no update_caution slice. The edit path is supersede_caution, which atomically marks the parent Superseded and writes a new child Caution with revised text. Lineage and audit are preserved.
- Per-operator acknowledgement state. A Caution itself does not track which operator has acknowledged it. Acks live on the consumption event (the Run that consumed the caution at start time), not on the Caution aggregate.
- Promotion to a formal Clearance. A Warning-severity Caution that accumulates evidence is a candidate for promotion to a Safety Clearance, but today this is a hint only; there is no slice that performs the promotion.
- Additional target kinds. RunTarget and SubjectTarget are not modelled today. Runs are short-lived; a note about one is rarely actionable. Subject-level hazards overlap with the Subject module's own hazard field. Both are future-additive.
Aggregates¶
| Name | Identity | State summary | FSM |
|---|---|---|---|
| Caution | id: UUID | target, category, severity, text, workaround, author_actor_id, tags, expires_at?, propagate_to_children, status, parent_caution_id?, superseded_by_caution_id?, retired_reason? | yes |
parent_caution_id is populated only on a Caution that supersedes a prior one. superseded_by_caution_id is populated only on a Caution that has been replaced. The two pointers together form the lineage chain that downstream readers walk to find the current head of a quirk's history.
propagate_to_children is an explicit opt-in: when True, the projection walks Asset.parent_id downward at query time so a Caution on an Assembly surfaces on every Device under it.
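The propagation rule above can be illustrated with a minimal in-memory sketch. It is not the projection's actual query: the `CautionRow` shape, the `parent_of` map, and the function name are assumptions made for illustration. The sketch resolves coverage from the queried Asset upward through its ancestors, which yields the same result as walking downward from the Assembly.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class CautionRow:
    """Hypothetical slice of a projection row, reduced to what propagation needs."""
    caution_id: str
    asset_id: str
    propagate_to_children: bool
    status: str = "Active"


def cautions_covering_asset(
    asset_id: str,
    parent_of: Dict[str, Optional[str]],
    rows: List[CautionRow],
) -> List[CautionRow]:
    """Return Active Cautions attached to the asset or inherited from an ancestor."""
    # Follow parent_id upward to collect every ancestor of the queried asset.
    ancestors = []
    seen = {asset_id}
    current = parent_of.get(asset_id)
    while current is not None and current not in seen:
        ancestors.append(current)
        seen.add(current)
        current = parent_of.get(current)

    covering = []
    for row in rows:
        if row.status != "Active":
            continue
        direct = row.asset_id == asset_id
        # Only Cautions that opted in are inherited by descendants.
        inherited = row.propagate_to_children and row.asset_id in ancestors
        if direct or inherited:
            covering.append(row)
    return covering


parent_of = {"assembly-1": None, "device-a": "assembly-1", "device-b": "assembly-1"}
rows = [
    CautionRow("c1", "assembly-1", propagate_to_children=True),
    CautionRow("c2", "assembly-1", propagate_to_children=False),
    CautionRow("c3", "device-a", propagate_to_children=False),
]
print([r.caution_id for r in cautions_covering_asset("device-a", parent_of, rows)])
# prints ['c1', 'c3']
```

Caution c2 stays on the Assembly only, because it did not opt in to propagation.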
Value Objects¶
| Name | Shape | Where used |
|---|---|---|
| CautionText | trimmed string, 1–2000 chars | Caution.text (the issue description) |
| CautionWorkaround | trimmed string, 1–2000 chars; REQUIRED | Caution.workaround (the operator-actionable mitigation) |
| CautionTag | trimmed string, 1–50 chars | members of Caution.tags (free vocabulary for facility-specific drift) |
| CautionTarget | 2-arm discriminated union: AssetTarget(asset_id) \| ProcedureTarget(procedure_id) | Caution.target |
The workaround field is mandatory by aggregate invariant. A Caution with no actionable mitigation is just noise; the field's presence is the single strongest convention across operator-knowledge systems.
CautionTarget is fixed at registration. A supersede may revise text, workaround, severity, tags, expiry, and propagate-to-children, but not the target; retargeting via supersede confuses the read-side projection's "active cautions on Asset X" query. To move a caution to a different target, retire the original and register a new one.
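As a rough sketch of how these value objects might be expressed, the following uses frozen dataclasses. The class layout, the `_trimmed` helper, and the use of `ValueError` are assumptions for illustration; the module's real types and error classes may differ.

```python
from dataclasses import dataclass
from typing import Union
from uuid import UUID


def _trimmed(value: str, max_len: int, field: str) -> str:
    """Trim whitespace and enforce the 1..max_len length bound."""
    trimmed = value.strip()
    if not 1 <= len(trimmed) <= max_len:
        raise ValueError(f"{field} must be 1-{max_len} chars after trim")
    return trimmed


@dataclass(frozen=True)
class CautionText:
    value: str

    def __post_init__(self) -> None:
        # Frozen dataclasses need object.__setattr__ to store the trimmed value.
        object.__setattr__(self, "value", _trimmed(self.value, 2000, "text"))


@dataclass(frozen=True)
class CautionWorkaround:
    value: str

    def __post_init__(self) -> None:
        object.__setattr__(self, "value", _trimmed(self.value, 2000, "workaround"))


@dataclass(frozen=True)
class CautionTag:
    value: str

    def __post_init__(self) -> None:
        object.__setattr__(self, "value", _trimmed(self.value, 50, "tag"))


@dataclass(frozen=True)
class AssetTarget:
    asset_id: UUID


@dataclass(frozen=True)
class ProcedureTarget:
    procedure_id: UUID


# The two-arm union: a target is exactly one of these.
CautionTarget = Union[AssetTarget, ProcedureTarget]

target = AssetTarget(UUID("aaaa1111-2222-3333-4444-555555555555"))
print(CautionText("  Hexapod stalls below 0.5 mm/s  ").value)
```

Because dataclass equality compares both the class and the fields, an AssetTarget never equals a ProcedureTarget carrying the same UUID, which makes the "child target equals parent target" check a plain `==`.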
FSM¶
stateDiagram-v2
[*] --> Active: register_caution
Active --> Superseded: supersede_caution
Active --> Retired: retire_caution
Superseded --> [*]
Retired --> [*]
| From | To | Command | Event |
|---|---|---|---|
| (none) | Active | register_caution | CautionRegistered |
| Active | Superseded | supersede_caution | CautionSuperseded (parent) + CautionRegistered (child) |
| Active | Retired | retire_caution | CautionRetired |
Guards. Beyond the source-state check, each transition enforces:
- register_caution: text and workaround must each be 1–2000 chars after trim (workaround is REQUIRED, not nullable); each tag in tags must be 1–50 chars; target discriminator must be Asset or Procedure; if expires_at is set it must be in the future relative to occurred_at.
- supersede_caution: Parent must be Active; cannot supersede a Retired or already-Superseded parent (start a new Caution instead). Child Caution fields validated per register_caution. The child's target MUST equal the parent's target (preserving the read-side projection's target-stability invariant across lineage chains).
- retire_caution: Source must be Active. reason is a closed enum: Resolved (issue fixed), NoLongerApplies (situation changed), or WrongTarget (caution should never have been written for this target).
The authoring actor is carried on the event envelope (StoredEvent.principal_id); the aggregate state additionally denormalizes author_actor_id onto Caution for projection-query convenience ("cautions I authored"). Supersession and retirement actors live only on the envelope and may differ from the original author.
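The transitions and source-state guards above can be sketched as pure functions over an immutable state. This is an illustrative model, not the module's decider: the `CautionState` shape and the tuple-based target are assumptions, while the error names are the ones listed for the slices.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Optional, Tuple


class CautionStatus(Enum):
    ACTIVE = "Active"
    SUPERSEDED = "Superseded"
    RETIRED = "Retired"


class RetireReason(Enum):
    RESOLVED = "Resolved"
    NO_LONGER_APPLIES = "NoLongerApplies"
    WRONG_TARGET = "WrongTarget"


class CautionCannotSupersede(Exception):
    pass


class CautionCannotRetire(Exception):
    pass


class InvalidCautionSupersedeTarget(Exception):
    pass


@dataclass(frozen=True)
class CautionState:
    caution_id: str
    target: Tuple[str, str]  # (target_kind, target_id), simplified for the sketch
    status: CautionStatus = CautionStatus.ACTIVE
    parent_caution_id: Optional[str] = None
    superseded_by_caution_id: Optional[str] = None
    retired_reason: Optional[RetireReason] = None


def supersede(parent: CautionState, child_id: str, child_target: Tuple[str, str]):
    """Return (updated parent, new child); both terminal-state and retarget attempts fail."""
    if parent.status is not CautionStatus.ACTIVE:
        raise CautionCannotSupersede(parent.caution_id)
    if child_target != parent.target:
        raise InvalidCautionSupersedeTarget(parent.caution_id)
    new_parent = replace(
        parent, status=CautionStatus.SUPERSEDED, superseded_by_caution_id=child_id
    )
    child = CautionState(child_id, child_target, parent_caution_id=parent.caution_id)
    return new_parent, child


def retire(caution: CautionState, reason: str) -> CautionState:
    """Move an Active Caution to Retired; the reason must be one of the closed enum values."""
    if caution.status is not CautionStatus.ACTIVE:
        raise CautionCannotRetire(caution.caution_id)
    return replace(
        caution, status=CautionStatus.RETIRED, retired_reason=RetireReason(reason)
    )


parent = CautionState("c1", ("Asset", "a1"))
parent, child = supersede(parent, "c2", ("Asset", "a1"))
child = retire(child, "Resolved")
print(parent.status.value, child.status.value)  # prints Superseded Retired
```

Both Superseded and Retired are terminal here: any further supersede or retire on them raises, matching the state diagram.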
Events¶
| Event | Payload sketch | When emitted |
|---|---|---|
| CautionRegistered | caution_id, target, category, severity, text, workaround, author_actor_id, tags, expires_at?, propagate_to_children, parent_caution_id?, occurred_at | register_caution succeeds, or as the child genesis event in supersede_caution |
| CautionSuperseded | caution_id (parent), by_caution_id (child), occurred_at | supersede_caution succeeds, written to the parent stream |
| CautionRetired | caution_id, reason, occurred_at | retire_caution succeeds |
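To make the three payloads concrete, here is a hedged sketch of a handler that folds them into summary rows held in a dict. The `apply_event` name and the row layout are hypothetical, and leaving last_status_changed_at empty at registration is an assumption rather than documented behaviour.

```python
from typing import Any, Dict

Row = Dict[str, Any]


def apply_event(table: Dict[str, Row], event_type: str, payload: Dict[str, Any]) -> None:
    """Fold one Caution event into an in-memory stand-in for the summary table."""
    if event_type == "CautionRegistered":
        row = dict(payload)
        row["registered_at"] = row.pop("occurred_at")
        row["status"] = "Active"
        row.setdefault("parent_caution_id", None)  # set only on a superseding child
        row["superseded_by_caution_id"] = None
        row["retired_reason"] = None
        row["last_status_changed_at"] = None  # assumption: unset until a transition
        table[payload["caution_id"]] = row
    elif event_type == "CautionSuperseded":
        row = table[payload["caution_id"]]
        row["status"] = "Superseded"
        row["superseded_by_caution_id"] = payload["by_caution_id"]
        row["last_status_changed_at"] = payload["occurred_at"]
    elif event_type == "CautionRetired":
        row = table[payload["caution_id"]]
        row["status"] = "Retired"
        row["retired_reason"] = payload["reason"]
        row["last_status_changed_at"] = payload["occurred_at"]
    else:
        raise ValueError(f"unknown event type: {event_type}")


table: Dict[str, Row] = {}
apply_event(table, "CautionRegistered",
            {"caution_id": "c1", "severity": "Caution", "occurred_at": "2026-01-10T08:00:00Z"})
# A supersede emits two events: one on the parent stream, one on the child stream.
apply_event(table, "CautionSuperseded",
            {"caution_id": "c1", "by_caution_id": "c2", "occurred_at": "2026-02-01T09:00:00Z"})
apply_event(table, "CautionRegistered",
            {"caution_id": "c2", "severity": "Caution", "parent_caution_id": "c1",
             "occurred_at": "2026-02-01T09:00:00Z"})
apply_event(table, "CautionRetired",
            {"caution_id": "c2", "reason": "Resolved", "occurred_at": "2026-03-01T10:00:00Z"})
print(table["c1"]["status"], table["c2"]["status"])  # prints Superseded Retired
```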
Slices¶
| Command | Category | REST | MCP tool | Idempotency |
|---|---|---|---|---|
| RegisterCaution | NEW | POST /cautions | register_caution | required |
| SupersedeCaution | NEW | POST /cautions/{parent_caution_id}/supersede | supersede_caution | required |
| RetireCaution | MODIFIED | POST /cautions/{caution_id}/retire | retire_caution | none |
| GetCaution | QUERY | GET /cautions/{caution_id} | get_caution | none |
| ListCautions | QUERY | GET /cautions | list_cautions | none |
Errors per slice. Beyond Pydantic boundary 422s, each slice raises:
- RegisterCaution: CautionAlreadyExists, InvalidCautionText, InvalidCautionWorkaround, InvalidCautionTag, InvalidCautionExpiresAt, Unauthorized
- SupersedeCaution: CautionNotFound (parent), CautionCannotSupersede, InvalidCautionSupersedeTarget, plus every error RegisterCaution can raise on the child Caution fields, Unauthorized
- RetireCaution: CautionNotFound, CautionCannotRetire, Unauthorized
- GetCaution: CautionNotFound
- ListCautions: (boundary 422 only)
RegisterCaution and SupersedeCaution are wrapped by the Idempotency-Key header pattern for safe operator retry. RetireCaution is strict-not-idempotent: a second retire against an already-Retired Caution raises CautionCannotRetire rather than no-oping.
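The contrast between the two retry behaviours can be sketched as follows. The `IdempotentEndpoint` wrapper and the in-memory `statuses` dict are hypothetical stand-ins; a real Idempotency-Key implementation would also persist keys and typically compare request bodies, which this sketch omits.

```python
import uuid
from typing import Any, Callable, Dict


class CautionNotFound(Exception):
    pass


class CautionCannotRetire(Exception):
    pass


class IdempotentEndpoint:
    """Replay the stored response when the same Idempotency-Key is seen again."""

    def __init__(self, handler: Callable[[Dict[str, Any]], Dict[str, Any]]) -> None:
        self._handler = handler
        self._responses: Dict[str, Dict[str, Any]] = {}

    def __call__(self, idempotency_key: str, body: Dict[str, Any]) -> Dict[str, Any]:
        if idempotency_key in self._responses:
            return self._responses[idempotency_key]  # retry: no second write
        response = self._handler(body)
        self._responses[idempotency_key] = response
        return response


statuses: Dict[str, str] = {}  # caution_id -> status, standing in for the store


def register_caution(body: Dict[str, Any]) -> Dict[str, Any]:
    caution_id = str(uuid.uuid4())
    statuses[caution_id] = "Active"
    return {"caution_id": caution_id}


def retire_caution(caution_id: str, reason: str) -> None:
    """Strict: a second retire is an error, not a no-op."""
    if caution_id not in statuses:
        raise CautionNotFound(caution_id)
    if statuses[caution_id] != "Active":
        raise CautionCannotRetire(caution_id)
    statuses[caution_id] = "Retired"


register = IdempotentEndpoint(register_caution)
first = register("key-1", {"text": "stalls after soak", "workaround": "wait 10 min"})
retry = register("key-1", {"text": "stalls after soak", "workaround": "wait 10 min"})
retire_caution(first["caution_id"], "Resolved")
print(first == retry, len(statuses))  # prints True 1
```

A client retrying a retire therefore has to treat CautionCannotRetire as a signal to re-read the Caution rather than as a transport failure.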
Storage & Projections¶
One read-side table backs the Caution module.
CREATE TABLE proj_caution_summary (
caution_id UUID PRIMARY KEY,
target_kind TEXT NOT NULL CHECK (
target_kind IN ('Asset', 'Procedure')
),
target_id UUID NOT NULL,
category TEXT NOT NULL CHECK (
category IN ('Wear', 'Calibration', 'Wiring',
'OperationalWindow', 'InterlockQuirk', 'ProcedureGotcha')
),
severity TEXT NOT NULL CHECK (
severity IN ('Notice', 'Caution', 'Warning')
),
text TEXT NOT NULL,
workaround TEXT NOT NULL,
author_actor_id UUID NOT NULL,
tags TEXT[] NOT NULL DEFAULT '{}',
expires_at TIMESTAMPTZ,
propagate_to_children BOOLEAN NOT NULL DEFAULT FALSE,
status TEXT NOT NULL CHECK (
status IN ('Active', 'Superseded', 'Retired')
),
parent_caution_id UUID,
superseded_by_caution_id UUID,
retired_reason TEXT CHECK (
retired_reason IS NULL OR retired_reason IN (
'Resolved', 'NoLongerApplies', 'WrongTarget'
)
),
registered_at TIMESTAMPTZ NOT NULL,
last_status_changed_at TIMESTAMPTZ
);
The CHECK constraints encode the closed CautionStatus, CautionCategory, CautionSeverity, and CautionRetireReason enums at the row level. The (target_kind, target_id) pair is indexed so the Asset and Procedure detail views can fetch their Active Cautions in a single SELECT, and so the cross-module CautionLookup port can answer "which Active Cautions cover this Run's Subject / Assets / Procedures?" without loading aggregates.
GET /cautions/{id} reads from this projection with fold-on-read fallback for fields not yet projected. GET /cautions reads exclusively from the projection with filters on status, target_kind, target_id, category, severity, and tags, plus keyset pagination over (registered_at, caution_id).
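Keyset pagination over (registered_at, caution_id) can be illustrated in memory with tuple comparison. The `fetch_page` and `iterate_all` helpers are hypothetical and say nothing about the endpoint's actual cursor encoding; timestamps are assumed to be ISO-8601 UTC strings in one fixed format, so that string order matches time order.

```python
from typing import List, Optional, Tuple

# (registered_at, caution_id); caution_id breaks ties between equal timestamps.
Key = Tuple[str, str]


def fetch_page(keys: List[Key], after: Optional[Key], limit: int) -> List[Key]:
    """Return up to `limit` keys strictly greater than the `after` cursor."""
    ordered = sorted(keys)
    if after is not None:
        ordered = [k for k in ordered if k > after]
    return ordered[:limit]


def iterate_all(keys: List[Key], limit: int) -> List[Key]:
    """Page through every key, using the last key of each page as the next cursor."""
    seen: List[Key] = []
    cursor: Optional[Key] = None
    while True:
        page = fetch_page(keys, cursor, limit)
        if not page:
            return seen
        seen.extend(page)
        cursor = page[-1]


keys = [
    ("2026-01-10T08:00:00Z", "c3"),
    ("2026-01-10T08:00:00Z", "c1"),  # same timestamp as c3
    ("2026-01-09T12:00:00Z", "c2"),
    ("2026-02-01T09:00:00Z", "c4"),
    ("2026-02-02T09:00:00Z", "c5"),
]
print(fetch_page(keys, None, 2))
# prints [('2026-01-09T12:00:00Z', 'c2'), ('2026-01-10T08:00:00Z', 'c1')]
```

In Postgres the same predicate can be written as a row comparison on (registered_at, caution_id) against the cursor values, ordered by the same pair. Unlike OFFSET paging, rows inserted between requests do not shift later pages.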
The supersession lineage walks superseded_by_caution_id forward (to find the head of a chain) and parent_caution_id backward (to find the chain's root). Today this walk is client-side; a future projection could materialize a head_caution_id column when the rule-of-three trigger fires.
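A client-side walk along those two pointers might look like the sketch below. The row dicts mirror the projection's column names, but the helper functions and the cycle guard are illustrative additions, not part of the module.

```python
from typing import Dict, Optional

Row = Dict[str, Optional[str]]


def _walk(start: str, rows: Dict[str, Row], pointer: str) -> str:
    """Follow `pointer` from `start` until it is empty; return the last caution_id."""
    current = start
    visited = {current}
    while True:
        nxt = rows[current].get(pointer)
        if nxt is None:
            return current
        if nxt in visited:
            # Defensive only: a well-formed lineage chain has no cycles.
            raise ValueError(f"cycle detected at {nxt}")
        visited.add(nxt)
        current = nxt


def find_head(caution_id: str, rows: Dict[str, Row]) -> str:
    """Walk forward to the newest Caution in the chain."""
    return _walk(caution_id, rows, "superseded_by_caution_id")


def find_root(caution_id: str, rows: Dict[str, Row]) -> str:
    """Walk backward to the original Caution in the chain."""
    return _walk(caution_id, rows, "parent_caution_id")


rows = {
    "c1": {"parent_caution_id": None, "superseded_by_caution_id": "c2"},
    "c2": {"parent_caution_id": "c1", "superseded_by_caution_id": "c3"},
    "c3": {"parent_caution_id": "c2", "superseded_by_caution_id": None},
}
print(find_head("c1", rows), find_root("c3", rows))  # prints c3 c1
```

Each hop is one lookup, so the cost grows with chain length; that is the overhead a materialized head_caution_id column would remove.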
Cross-Module boundaries¶
| Module | Relationship | What's exchanged |
|---|---|---|
| Equipment | shared-id-with | AssetTarget.asset_id references the Asset the Caution attaches to; with propagate_to_children: true, the projection walks Asset.parent_id to surface the Caution on descendant Assets |
| Operation | shared-id-with | ProcedureTarget.procedure_id references the Procedure the Caution attaches to |
| Access | shared-id-with | Caution.author_actor_id references the Actor who first registered the Caution (or the chain's earliest ancestor) |
| Run | reads-from | Run.start calls CautionLookup.find_for_run(subject_id, asset_ids, procedure_ids) against proj_caution_summary; matching Active Cautions are returned as a banner on the response, never gate the start |
| (any) | writes-to via append_streams | supersede_caution writes CautionSuperseded to the parent stream and CautionRegistered to the child stream atomically in a single Postgres transaction; all-or-nothing, a ConcurrencyError on either stream rolls back the whole commit |
Target references are validated for UUID shape at the API boundary but not for existence at write time; the eventual-consistency stance lets a Caution be registered before its target Asset or Procedure exists in projection state.
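The all-or-nothing behaviour of the two-stream supersede write can be modelled with a small in-memory store. Only the names append_streams and ConcurrencyError come from this page; the method signature, the expected-version check, and the `InMemoryEventStore` class are assumptions, and the real guarantee comes from the Postgres transaction rather than from code like this.

```python
from typing import Dict, List, Tuple


class ConcurrencyError(Exception):
    pass


# One write per stream: (stream_id, expected_version, events_to_append)
StreamWrite = Tuple[str, int, List[dict]]


class InMemoryEventStore:
    def __init__(self) -> None:
        self._streams: Dict[str, List[dict]] = {}

    def version(self, stream_id: str) -> int:
        return len(self._streams.get(stream_id, []))

    def append_streams(self, writes: List[StreamWrite]) -> None:
        """Append to several streams, or to none if any version check fails."""
        # Phase 1: validate every stream before touching any of them.
        for stream_id, expected_version, _ in writes:
            if self.version(stream_id) != expected_version:
                raise ConcurrencyError(stream_id)
        # Phase 2: all checks passed, so commit every write.
        for stream_id, _, events in writes:
            self._streams.setdefault(stream_id, []).extend(events)


store = InMemoryEventStore()
store.append_streams([("caution-c1", 0, [{"type": "CautionRegistered"}])])

# Supersede: parent stream is at version 1, child stream must not exist yet.
store.append_streams([
    ("caution-c1", 1, [{"type": "CautionSuperseded", "by_caution_id": "c2"}]),
    ("caution-c2", 0, [{"type": "CautionRegistered", "parent_caution_id": "c1"}]),
])
print(store.version("caution-c1"), store.version("caution-c2"))  # prints 2 1
```

If a concurrent writer had already advanced the parent stream, the version check would fail before the child stream was created, so no orphan child could appear.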
Examples¶
The four examples below follow the canonical path for one Caution: register an Asset quirk, supersede it with a revised workaround, retire it once the underlying issue is fixed, and query the projection. The caller's principal becomes the author_actor_id at registration; subsequent supersede and retire actions carry the actor only on the event envelope. For the REST/MCP equivalence, auth, and idempotency conventions these examples share, see Reading the examples on the Modules landing page.
Register an Asset Caution¶
POST /cautions
Content-Type: application/json
Idempotency-Key: 9f6a3b1c-8e2d-4f5a-9b8c-1d2e3f4a5b6c
X-Principal-Id: 11111111-2222-3333-4444-555555555555
{
"target": {"target_kind": "Asset", "asset_id": "aaaa1111-2222-3333-4444-555555555555"},
"category": "OperationalWindow",
"severity": "Caution",
"text": "Hexapod stalls below 0.5 mm/s after thermal-soak completes; observed twice in cycle 2026-1.",
"workaround": "Wait 10 minutes after thermal-soak Completed event before commanding any move; or use stage_speed_override=1.0 mm/s for the first move.",
"tags": ["hexapod", "thermal", "post-soak"],
"expires_at": "2026-12-31T23:59:59Z",
"propagate_to_children": false
}
A successful call returns 201 Created with the newly-assigned caution_id. The Caution starts in Active state and surfaces immediately on the Asset's detail page and on Run.start banners for Runs that bind the Asset.
mcp.call_tool(
"register_caution",
{
"target": {"target_kind": "Asset", "asset_id": "aaaa1111-2222-3333-4444-555555555555"},
"category": "OperationalWindow",
"severity": "Caution",
"text": "Hexapod stalls below 0.5 mm/s after thermal-soak completes; observed twice in cycle 2026-1.",
"workaround": "Wait 10 minutes after thermal-soak Completed event before commanding any move; or use stage_speed_override=1.0 mm/s for the first move.",
"tags": ["hexapod", "thermal", "post-soak"],
"expires_at": "2026-12-31T23:59:59Z",
"propagate_to_children": False,
},
)
Supersede with a revised workaround¶
POST /cautions/9f6a3b1c-8e2d-4f5a-9b8c-1d2e3f4a5b6c/supersede
Content-Type: application/json
Idempotency-Key: 7c8d9e0f-1a2b-3c4d-5e6f-7a8b9c0d1e2f
X-Principal-Id: 22222222-3333-4444-5555-666666666666
{
"category": "OperationalWindow",
"severity": "Caution",
"text": "Hexapod stalls below 0.5 mm/s after thermal-soak completes (cycle 2026-1 + 2026-2 observed).",
"workaround": "Use the new stage_warmup_completed projection signal as the start gate, rather than the 10-minute wait. Falls back to wait if signal is absent.",
"tags": ["hexapod", "thermal", "post-soak", "warmup-signal"],
"propagate_to_children": false
}
Supersede atomically marks the parent Caution Superseded and creates a new child carrying the revised text and workaround. The child inherits the parent's target (an attempt to retarget raises InvalidCautionSupersedeTarget); other fields may change freely. The response carries the new child caution_id.
mcp.call_tool(
"supersede_caution",
{
"parent_caution_id": "9f6a3b1c-8e2d-4f5a-9b8c-1d2e3f4a5b6c",
"category": "OperationalWindow",
"severity": "Caution",
"text": "Hexapod stalls below 0.5 mm/s after thermal-soak completes (cycle 2026-1 + 2026-2 observed).",
"workaround": "Use the new stage_warmup_completed projection signal as the start gate, rather than the 10-minute wait. Falls back to wait if signal is absent.",
"tags": ["hexapod", "thermal", "post-soak", "warmup-signal"],
"propagate_to_children": False,
},
)
Retire when the underlying issue is fixed¶
POST /cautions/<child-caution-id>/retire
Content-Type: application/json
X-Principal-Id: 33333333-4444-5555-6666-777777777777
{"reason": "Resolved"}
reason is a closed enum: Resolved (the underlying defect was fixed and the workaround is no longer needed), NoLongerApplies (situation has changed; for example the Asset was replaced), or WrongTarget (the Caution should never have been written for this target). The Caution moves to Retired and stops appearing in Active queries and Run.start banners.
List Active Cautions on an Asset¶
GET /cautions?status=Active&target_kind=Asset&target_id=aaaa1111-2222-3333-4444-555555555555
X-Principal-Id: 11111111-2222-3333-4444-555555555555
Returns the page of Active Cautions sorted by severity (Warning, then Caution, then Notice) and registration time. The response shape matches the projection columns: each item carries caution_id, target, category, severity, text, workaround, tags, expires_at, registered_at, plus the supersession lineage pointers for chain navigation.
The same query without target_kind and target_id returns Active Cautions across all targets, paginated. The CautionLookup port that Run.start uses runs essentially the same query for the union of the Run's Subject id, Asset ids, and Procedure ids.
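A lookup of that kind, together with the severity-first ordering, might be sketched like this. The function body is hypothetical: only the find_for_run name and its three parameters appear on this page. Since only Asset and Procedure targets exist today, the sketch accepts subject_id but matches nothing against it, and ascending registration time within a severity is an assumption.

```python
from typing import Any, Dict, Iterable, List, Optional

SEVERITY_RANK = {"Warning": 0, "Caution": 1, "Notice": 2}


def find_for_run(
    rows: List[Dict[str, Any]],
    subject_id: Optional[str],
    asset_ids: Iterable[str],
    procedure_ids: Iterable[str],
) -> List[Dict[str, Any]]:
    """Return Active Cautions on any of the Run's Assets or Procedures, most severe first."""
    # subject_id is unused in this sketch: there is no Subject target kind today.
    wanted = {("Asset", a) for a in asset_ids} | {("Procedure", p) for p in procedure_ids}
    matches = [
        r for r in rows
        if r["status"] == "Active" and (r["target_kind"], r["target_id"]) in wanted
    ]
    # Assumption: earlier registrations come first within one severity level.
    return sorted(matches, key=lambda r: (SEVERITY_RANK[r["severity"]], r["registered_at"]))


rows = [
    {"caution_id": "c1", "target_kind": "Asset", "target_id": "a1",
     "severity": "Notice", "status": "Active", "registered_at": "2026-01-05T00:00:00Z"},
    {"caution_id": "c2", "target_kind": "Procedure", "target_id": "p1",
     "severity": "Warning", "status": "Active", "registered_at": "2026-02-01T00:00:00Z"},
    {"caution_id": "c3", "target_kind": "Asset", "target_id": "a1",
     "severity": "Caution", "status": "Retired", "registered_at": "2026-01-01T00:00:00Z"},
    {"caution_id": "c4", "target_kind": "Asset", "target_id": "a2",
     "severity": "Warning", "status": "Active", "registered_at": "2026-01-20T00:00:00Z"},
]
banner = find_for_run(rows, subject_id=None, asset_ids=["a1"], procedure_ids=["p1"])
print([r["caution_id"] for r in banner])  # prints ['c2', 'c1']
```

The result only feeds the banner; an empty or non-empty list never changes whether the Run starts.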