Workspace Structure
Here's how your files will be organized:
<skill>-workspace/
└── iteration-1/
    ├── .lock.json
    ├── eval-1/
    │   ├── with_skill/
    │   │   ├── outputs/
    │   │   ├── timing.json
    │   │   └── grading.json
    │   └── baseline/
    │       ├── outputs/
    │       ├── timing.json
    │       └── grading.json
    └── benchmark.json
Running with --models? Each eval gets a model-key subdirectory: eval-1/pi-claude-sonnet/with_skill/...
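To make the layout concrete, here is a minimal sketch of a helper that builds the directory for one run. The function name `eval_run_dir` and its parameters are hypothetical and not part of the tool; only the directory names come from the tree above.

```python
from pathlib import Path
from typing import Optional


def eval_run_dir(
    workspace: Path,
    iteration: int,
    eval_id: int,
    variant: str,
    model_key: Optional[str] = None,
) -> Path:
    """Return the directory for one eval run (hypothetical helper).

    variant is "with_skill" or "baseline". model_key is only set when the
    run used --models, in which case it becomes an extra path segment.
    """
    if variant not in ("with_skill", "baseline"):
        raise ValueError(f"unknown variant: {variant!r}")
    eval_dir = workspace / f"iteration-{iteration}" / f"eval-{eval_id}"
    if model_key is not None:
        eval_dir = eval_dir / model_key
    return eval_dir / variant
```

Each returned directory is where outputs/, timing.json and grading.json would sit for that run.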
The .lock.json file
Every iteration writes a .lock.json file that tracks which evals have finished. It's what makes --resume work! You don't need to edit it by hand, but it's useful to know it exists. If a run is interrupted, the lock stays in "running" status; finish the run with skill-eval run --resume.
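If you want to check from a script whether an iteration was interrupted, something like the sketch below could work. The lock file's schema isn't documented here, so the top-level "status" key is an assumption, and `needs_resume` is a hypothetical helper name. Inspect a real .lock.json before relying on it.

```python
import json
from pathlib import Path


def needs_resume(iteration_dir: Path) -> bool:
    """Guess whether an iteration was interrupted (hypothetical helper).

    ASSUMPTION: .lock.json is a JSON object with a top-level "status" key
    that holds "running" while a run is unfinished.
    """
    lock_path = iteration_dir / ".lock.json"
    if not lock_path.exists():
        return False  # no lock file, so nothing to resume
    lock = json.loads(lock_path.read_text(encoding="utf-8"))
    return isinstance(lock, dict) and lock.get("status") == "running"
```

When this returns True, skill-eval run --resume is the command to finish the run.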
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Yay! Everything completed successfully. |
| 1 | Oops! Something went wrong (agent crash, config error, etc.). |
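When calling the CLI from a script, the exit code is the thing to check. Here is a minimal sketch using Python's standard subprocess module. The wrapper name `run_and_report` is hypothetical, and treating every nonzero code as a failure is an assumption, since the table only documents 0 and 1.

```python
import subprocess
import sys
from typing import Sequence


def run_and_report(command: Sequence[str]) -> bool:
    """Run a command and report success based on its exit code."""
    result = subprocess.run(command)
    if result.returncode == 0:
        print("Run completed successfully.")
        return True
    # Only code 1 is documented; any other nonzero code is treated the same.
    print(f"Run failed with exit code {result.returncode}.", file=sys.stderr)
    return False


# Example (not executed here): run_and_report(["skill-eval", "run", "--resume"])
```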