Skill Eval

Changelog

1.0.0 (2026-06-25)

Features

  • add --verbose flag and structured logging (e6e41b9)
  • add cross-iteration delta to benchmark.json (e85d0d7)
  • add deterministic file/text assertion matchers (5011275)
  • add iteration lockfile and --resume (b39f004)
  • validate config against JSON schema (cf9b3f4)

0.6.0 (2026-06-25)

Features

  • --models flag for cross-model benchmarking (9fd5b45)
  • pre-push-guard pi extension (500b005)
  • pre-push-guard shows live status widget while checks run (ff5e658)

Bug Fixes

  • use <video> not <img> for auto-fixing demo mp4 (6d9c5e3)
  • use <video> not <img> for cross-model demo mp4 (8cdceb8)
  • use agent:model pairs in all cross-model examples, regenerate VHS (fd1a283)
  • use proper <video></video> closing tags, agent:model pairs throughout cross-model guide (8d3cbf5)

0.5.0 (2026-06-25)

Features

  • --fix flag for auto-refinement loop (evaluator-optimizer pattern) (fa85108)

Bug Fixes

  • a11y: modern-web-guidance improvements (3111adf)
  • ignore err from CombinedOutput in fixEval (grading handles failures) (e4fff21)

0.4.0 (2026-06-25)

Features

  • add view transitions for smooth page navigation (bcfc382)
  • docs: reorder docs menu, rename README to Home, add GitHub link (7d92c20)
  • docs: wire full SEO stack (7f82d89)

Bug Fixes

  • docs: compile images at build time to avoid 404 _image endpoint (b8da6b2)
  • docs: put Home at the top of the docs menu (8d42852)
  • docs: remove stale wrangler deploy config and add pages deploy script (ea75a79)
  • menu layout, guide ordering, and larger GIFs (c11bb0e)
  • VHS tapes use harness scripts, type real commands not comments (7690633)

0.3.0 (2026-06-25)

Features

  • docs: add CSP header and optimize font loading (b02a168)
  • docs: add mobile hamburger navigation (4305673)
  • migrate documentation to Astro site and modular markdown (614e051)

Bug Fixes

  • docs: add custom 404 error page (a6f2118)

0.2.0 (2026-06-24)

Bug Fixes

  • resolve golangci-lint errors (ac9abc3)