REVIEW ONLY · not shown in product — state 1 of 3 · happy path

C Crucible acme-fraud ·

/ · run launcher

SESSION · 7 runs · $61.40 / $250.00 mo · as of 14:07:52Z

          Live updates · connected
          
        

Target adapter ✓ selected

Fraud ✓

Transaction-fraud classifier adapter

· validated 2026-06-19 · owner: risk-eng

Code Agent

Autonomous code-generation agent

code_agent@3f9a · validated 2026-06-11 · owner: dev-tools

1 adapter disabled · Research Agent

Sealed specification · YAML

✓ sealed · valid

              spec.fraud.sealed.yaml · 12 lines
            

              
              · v3.2.1 · author: j.okafor · created 2026-06-21 14:02Z ·

spec:
target: fraud_adapter
sealed: true ✓
obligations:
  - held_out_tests.pass_rate >= 0.95
  - metamorphic.invariance == true ✓
  - differential.cross_family_drift <= 0.03
oracles: [held_out, metamorphic, differential, fuzz]
judge:
  enabled: true ✓
  weight: one_vote
# sealed at submit · sha 9f2a4c7b
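
As a rough illustration of how obligation lines like the three above could be checked against measured metrics, here is a minimal sketch. The parser, the function names (`parse_value`, `check_obligation`), and the metric values are all hypothetical; this is not Crucible's actual evaluator.

```python
import operator
import re

# Hypothetical parser for obligation strings such as
# "held_out_tests.pass_rate >= 0.95"; not Crucible's actual evaluator.
_OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq}
_PATTERN = re.compile(r"^\s*([\w.]+)\s*(>=|<=|==)\s*(\S+)\s*$")


def parse_value(raw: str):
    """Turn 'true'/'false' into bools and everything else into a float."""
    lowered = raw.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    return float(raw)


def check_obligation(obligation: str, metrics: dict) -> bool:
    """Return True when the measured metric satisfies the obligation."""
    match = _PATTERN.match(obligation)
    if match is None:
        raise ValueError(f"unrecognized obligation: {obligation!r}")
    name, op, raw = match.groups()
    return _OPS[op](metrics[name], parse_value(raw))


obligations = [
    "held_out_tests.pass_rate >= 0.95",
    "metamorphic.invariance == true",
    "differential.cross_family_drift <= 0.03",
]
# Made-up measurements, for illustration only.
metrics = {
    "held_out_tests.pass_rate": 0.97,
    "metamorphic.invariance": True,
    "differential.cross_family_drift": 0.05,
}
results = {ob: check_obligation(ob, metrics) for ob in obligations}
```

With these invented numbers the first two obligations hold and the drift obligation fails, since 0.05 exceeds the 0.03 limit.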

Evaluation budget

Rounds · operator entry

Dollar ceiling · operator entry

            workspace ceiling
            $25.00
            
            ✓ within policy · no approver required
          

≈  · hard stop at $25.00

RUN SUMMARY

Target: Fraud

Oracles: 4 + judge

Rounds: 48

Est. spend: $8.40 – $25.00

consumed · — / $25.00 (run not started)

1 / 5 Scoring rule: the LLM judge carries one of five votes and is adversarially gameable. The four independent oracles carry the rest.

✓ I acknowledge the LLM judge carries one of five votes and is adversarially gameable.

acknowledged 14:07:48Z by j.okafor

: ack.operator=j.okafor · ack.ts=14:07:48Z
ack.spec_sha=9f2a4c7b · ack.judge_weight=1/5
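
The one-of-five weighting acknowledged above can be sketched as a plain vote count. The page states only that the judge holds one of five votes and the four oracles hold the rest; the simple-majority pass rule and the `tally` helper below are assumptions made for illustration.

```python
# The four oracle names come from the spec's "oracles" list.
ORACLES = ("held_out", "metamorphic", "differential", "fuzz")


def tally(oracle_votes: dict, judge_vote: bool) -> dict:
    """Count one vote per oracle plus one vote for the LLM judge."""
    missing = [name for name in ORACLES if name not in oracle_votes]
    if missing:
        raise ValueError(f"missing oracle votes: {missing}")
    votes = [bool(oracle_votes[name]) for name in ORACLES] + [bool(judge_vote)]
    passes = sum(votes)
    return {
        "passes": passes,
        "total": len(votes),
        "judge_share": 1 / len(votes),
        # Assumption: a run passes on a simple majority of the five votes.
        "passed": passes > len(votes) / 2,
    }


# A gamed judge votes "pass", but three of the four oracles disagree.
outcome = tally(
    {"held_out": True, "metamorphic": False, "differential": False, "fuzz": False},
    judge_vote=True,
)
```

Under the assumed majority rule, a judge that has been gamed into voting "pass" cannot carry a run on its own: the example ends with two passing votes out of five.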

on start → /runs/:runId

SANDBOX ENVIRONMENT

runtime: python 3.12 · linux · docker
network: sealed (egress=deny)
image: crucible/sandbox:v1.4.2

REVIEW ONLY · not shown in product — state 2 of 3 · empty

C Crucible acme-fraud ·

/ · run launcher

            EST / RUN
            —
            as of 14:07:52Z
          
SESSION · 0 runs · $0.00 / $250.00 mo · as of 14:07:52Z

          Live updates · connected
          
        

START HERE Pick a target adapter, paste a sealed YAML spec, set a budget, and acknowledge the judge weighting. All four gates are required before an evaluation can start — the judge acknowledgment is the fourth gate, in the right rail.

Target adapter · gate 1 of 4 · target

Fraud

Transaction-fraud classifier adapter

fraud_adapter@7c1d · validated 2026-06-19 · owner: risk-eng

Code Agent

Autonomous code-generation agent

code_agent@3f9a · validated 2026-06-11 · owner: dev-tools

1 adapter disabled · Research Agent

Sealed specification · YAML · gate 2 of 4 · spec

Paste your sealed YAML spec

The spec lists the obligations each oracle must verify. It is sealed at submit time so the adversarial agent can't read it.
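
One way to picture sealing at submit time is as a content digest taken over the pasted text, so that any later edit is detectable. The sketch below covers only that tamper-evidence aspect, not how the spec is kept out of the adversarial agent's view. The hash function is not named on this page; SHA-256 truncated to 8 hex characters is an assumption, and `seal` and `verify` are hypothetical helper names.

```python
import hashlib


def seal(spec_text: str) -> dict:
    """Record a short digest of the spec text at submit time.

    Assumption: SHA-256 truncated to 8 hex characters; the real scheme
    is not documented here.
    """
    digest = hashlib.sha256(spec_text.encode("utf-8")).hexdigest()
    return {
        "sealed": True,
        "sha": digest[:8],
        "lines": len(spec_text.splitlines()),
    }


def verify(spec_text: str, sealed: dict) -> bool:
    """Return True when the text still matches the recorded digest."""
    return seal(spec_text)["sha"] == sealed["sha"]


spec_text = "spec:\n  target: fraud_adapter\n  sealed: true\n"
sealed = seal(spec_text)
```

Calling `verify(spec_text, sealed)` returns True for the untouched text, and an edited copy is expected to produce a different digest.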

Evaluation budget · gate 3 of 4 · budget

Rounds

integer · capped at workspace round-budget

Dollar ceiling

USD · capped at workspace ceiling $25.00

            workspace ceiling
            $25.00

RUN SUMMARY

Target: not picked

Spec: not pasted

Budget: not set

Not yet measured

No estimate yet. Pick a target, paste a spec, and set a budget — the estimate will appear here.

1 / 5 Scoring rule: the LLM judge carries one of five votes and is adversarially gameable.

I acknowledge the LLM judge carries one of five votes and is adversarially gameable.

required before start · not yet acknowledged

4 gates remaining: target, spec, budget, judge ack
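
The four-gate rule can be sketched as a check that lists whichever gates are still open. The `LauncherState` fields and the function names are hypothetical; the $25.00 workspace ceiling is the value shown on this page, and treating an over-ceiling amount as an open budget gate is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

WORKSPACE_CEILING = 25.00  # USD, as shown in the budget panel


@dataclass
class LauncherState:
    """Hypothetical snapshot of the launcher form."""
    target: Optional[str] = None
    spec_sealed: bool = False
    rounds: Optional[int] = None
    dollar_ceiling: Optional[float] = None
    judge_acknowledged: bool = False


def remaining_gates(state: LauncherState) -> list:
    """Return the names of the gates that still block a start."""
    gates = []
    if state.target is None:
        gates.append("target")
    if not state.spec_sealed:
        gates.append("spec")
    budget_ok = (
        state.rounds is not None
        and state.rounds > 0
        and state.dollar_ceiling is not None
        and 0 < state.dollar_ceiling <= WORKSPACE_CEILING
    )
    if not budget_ok:
        gates.append("budget")
    if not state.judge_acknowledged:
        gates.append("judge ack")
    return gates


def can_start(state: LauncherState) -> bool:
    """A run may start only when no gate remains."""
    return not remaining_gates(state)


empty = LauncherState()
ready = LauncherState("fraud_adapter", True, 48, 25.00, True)
```

Here `remaining_gates(empty)` lists all four gates in the order shown above, while `can_start(ready)` is True.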

REVIEW ONLY · not shown in product — state 3 of 3 · validation error

C Crucible acme-fraud ·

/ · run launcher

            EST / RUN
            —
            as of 14:07:52Z
          
SESSION · 7 runs · $61.40 / $250.00 mo · as of 14:07:52Z

          Live updates · connected
          
        

✕ PARSE ERROR Spec was not sealed — validation stopped at line 6, column 5. Your entries are preserved.

attempt id: a3f9c2… · operator: j.okafor · validator: spec-validator-v1.4 · audit entry:  · parent of: next attempt

Target adapter ✓ preserved

Fraud ✓

Transaction-fraud classifier adapter

fraud_adapter@7c1d · validated 2026-06-19 · owner: risk-eng

Code Agent

Autonomous code-generation agent

code_agent@3f9a · validated 2026-06-11 · owner: dev-tools

1 adapter disabled · Research Agent

Sealed specification · YAML ✕ INVALID

              spec.fraud.yaml · parse failed
            

spec:
target: fraud_adapter
obligations:
  - held_out_tests.pass_rate >= 0.95
  - metamorphic.invariance == true
  sealed true ◄ expected ':'
oracles: [held_out, metamorphic, ...]

YAMLParseError: mapping value expected

at line 6, column 5: the key 'sealed' is missing its ':' separator before 'true'.

Without a valid map the spec can't be sealed, so no oracle obligations were registered.
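
To show how a missing separator like the one above can be located, here is a deliberately naive line check. It is not a YAML parser and is not the spec-validator named on this page: it accepts only `key:` / `key: value` lines, list items, comments, and blank lines, and it reports a line number but no column. The name `first_bad_line` is hypothetical.

```python
import re

# Accepts "key:" or "key: value", optionally as a list item.
# This covers only a tiny subset of YAML, for illustration.
_MAPPING = re.compile(r"^\s*(- )?[\w.]+:(\s.*)?$")
_LIST_ITEM = re.compile(r"^\s*- \S.*$")


def first_bad_line(text: str):
    """Return the 1-based number of the first unrecognized line, or None."""
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are always fine
        if _MAPPING.match(line) or _LIST_ITEM.match(line):
            continue
        return number
    return None


broken = """spec:
target: fraud_adapter
obligations:
  - held_out_tests.pass_rate >= 0.95
  - metamorphic.invariance == true
  sealed true
oracles: [held_out, metamorphic, ...]
"""
fixed = broken.replace("sealed true", "sealed: true")

bad_line = first_bad_line(broken)
```

For the broken text the check stops at line 6, the `sealed true` line; after the colon is added, `first_bad_line(fixed)` returns None.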

last validated · 14:07:52Z

creates new attempt id linked to a3f9c2 · previous attempt remains immutable in audit log
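
The linkage described above, where a retry gets a new attempt id pointing at its parent while the earlier attempt stays unchanged, can be sketched with frozen records in an append-only list. The `Attempt` and `AuditLog` classes, their fields, and the 6-character id format are all hypothetical.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Attempt:
    """One validation attempt; frozen so it cannot be edited afterwards."""
    attempt_id: str
    parent_id: Optional[str]
    spec_text: str
    status: str


class AuditLog:
    """Append-only log: entries can be added but never changed in place."""

    def __init__(self) -> None:
        self._entries = []

    def record(self, spec_text: str, status: str,
               parent: Optional[Attempt] = None) -> Attempt:
        attempt = Attempt(
            attempt_id=uuid.uuid4().hex[:6],  # hypothetical id format
            parent_id=parent.attempt_id if parent else None,
            spec_text=spec_text,
            status=status,
        )
        self._entries.append(attempt)
        return attempt

    def entries(self) -> tuple:
        """Expose a read-only snapshot of the log."""
        return tuple(self._entries)


log = AuditLog()
first = log.record("sealed true", "parse_error")
second = log.record("sealed: true", "sealed", parent=first)
```

The second attempt carries the first attempt's id as `parent_id`, and assigning to any field of `first` raises an error because the dataclass is frozen.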

Evaluation budget · preserved

Rounds

Dollar ceiling

RUN SUMMARY

Target: Fraud

Spec: invalid

Rounds: 48

✕ Start is blocked until the spec parses and seals. Path back to a runnable state:

1. Fix line 6: add the missing ':'
2. Re-validate the spec
3. Re-seal the spec
4. Start evaluation

blocked · 1 parse error