Library · MIT

context-steward

Lazy skill loading for agent systems. The npm package we extracted from Bouletteproof OS and open-sourced.

The problem

Claude Skills are one of the best ergonomic wins in the Anthropic stack. Drop a SKILL.md next to a folder, describe when it should trigger, and the model figures out when to load it. In theory, perfect.

In practice, you end up with twenty skills in a production agent and every conversation pays the token cost for all twenty descriptors — even if the user just asks "what time is it?" A single Word-document skill costs around 600 tokens of description. Twenty skills is a 12,000-token tax on every turn, every user, forever.

The obvious fix is lazy loading: only include a skill's description if the conversation plausibly needs it. The less-obvious problem is that "plausibly needs it" is a retrieval problem dressed up as a configuration problem — and getting it wrong breaks the magic.

What context-steward does

context-steward runs the loading policy between your agent runtime and Claude. It keeps a lightweight index of skill descriptions, runs a fast semantic lookup against the current turn, and only injects the relevant descriptors into the prompt. Skills the user is unlikely to need stay out of context.
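The selection step can be sketched as follows. This is an illustrative sketch, not the actual context-steward API: the names (`SkillDescriptor`, `selectSkills`) are made up, and the token-overlap scorer stands in for whatever fast semantic lookup the package actually uses.

```typescript
// Hypothetical sketch: index skill descriptors, score them against the
// current turn, keep only the ones above a relevance threshold.
interface SkillDescriptor {
  name: string;
  description: string; // the trigger description from SKILL.md
}

// Stand-in relevance score: fraction of description words (4+ chars)
// that also appear in the turn. A real implementation would use
// embeddings or another semantic lookup instead.
function score(turn: string, descriptor: SkillDescriptor): number {
  const turnWords = new Set(turn.toLowerCase().split(/\W+/));
  const descWords = descriptor.description.toLowerCase().split(/\W+/);
  const hits = descWords.filter((w) => w.length > 3 && turnWords.has(w));
  return hits.length / Math.max(descWords.length, 1);
}

function selectSkills(
  turn: string,
  skills: SkillDescriptor[],
  threshold = 0.05,
): SkillDescriptor[] {
  return skills
    .map((s) => ({ s, relevance: score(turn, s) }))
    .filter(({ relevance }) => relevance >= threshold)
    .sort((a, b) => b.relevance - a.relevance)
    .map(({ s }) => s);
}

const skills: SkillDescriptor[] = [
  { name: "docx", description: "Create and edit Word documents for the user" },
  { name: "charts", description: "Render charts and plots from tabular data" },
];

// Only the chart skill's descriptor would reach the prompt for this turn.
const selected = selectSkills("plot this tabular data as a chart", skills);
```

The point of the sketch is the shape, not the scorer: descriptors stay out of the prompt by default and are injected only when the current turn makes them relevant.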

Three design choices matter:

  • Graceful degradation. If the lookup returns nothing, the agent still works — it just won't know about the skills it didn't retrieve. Your fallback behaviour is whatever Claude would do without the skill system, not a broken call.
  • Transparent injection. The injected descriptors use the exact same format Anthropic specifies for native Skills. The model doesn't know it's being retrieved-and-injected — which means the Skills magic keeps working.
  • Observability. Every retrieval is logged with the query, the candidates considered, and the match scores. When the agent does something weird, you can see exactly which skills it had access to at the time.
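To make the second and third points concrete, here is one possible shape for a per-retrieval log record. The field names are assumptions for illustration, not the package's actual log schema.

```typescript
// Hypothetical log record: the query, every candidate considered with
// its score, and the names that actually made it into the prompt.
interface RetrievalLog {
  query: string;
  candidates: { name: string; score: number }[];
  injected: string[];
}

function logRetrieval(
  query: string,
  scored: { name: string; score: number }[],
  threshold: number,
): RetrievalLog {
  return {
    query,
    candidates: scored,
    injected: scored.filter((c) => c.score >= threshold).map((c) => c.name),
  };
}

const entry = logRetrieval(
  "what time is it?",
  [
    { name: "docx", score: 0.01 },
    { name: "charts", score: 0.0 },
  ],
  0.05,
);
// No skill clears the threshold, so nothing is injected. That is the
// graceful-degradation case: Claude answers as it would without the
// skill system, and the log shows exactly why no skill was in context.
```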

Where it came from

context-steward was extracted from Bouletteproof OS — our supervised multi-agent delivery platform — after the same problem kept surfacing across every agent we built. We ran the pattern in production for several months before deciding the core mechanic was too generic to keep internal.

The npm package is the boring core: retrieval logic, descriptor injection, loader interface. It doesn't include BPOS-specific orchestration. That's a feature — the whole point is that any agent runtime can adopt it without buying into our entire stack.
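A loader interface in this spirit might look like the sketch below. Again, this is a hypothetical shape, not the published API: the key property is that listing descriptors is cheap while loading a full skill body is on-demand, so any runtime can back it with its own storage.

```typescript
// Hypothetical loader contract: cheap descriptor listing, on-demand
// loading of full skill bodies. Names are illustrative assumptions.
interface SkillLoader {
  list(): Promise<{ name: string; description: string }[]>;
  load(name: string): Promise<string>; // full SKILL.md body
}

// A minimal in-memory loader, the kind you might use in tests
// or a small single-process agent.
class InMemoryLoader implements SkillLoader {
  constructor(
    private skills: Map<string, { description: string; body: string }>,
  ) {}

  async list() {
    return [...this.skills].map(([name, s]) => ({
      name,
      description: s.description,
    }));
  }

  async load(name: string) {
    const skill = this.skills.get(name);
    if (!skill) throw new Error(`unknown skill: ${name}`);
    return skill.body;
  }
}

const loader = new InMemoryLoader(
  new Map([
    ["docx", { description: "Edit Word documents", body: "# docx skill instructions" }],
  ]),
);
```

Because the contract is this small, swapping the in-memory loader for one backed by a filesystem of SKILL.md folders, or a database, doesn't touch the retrieval logic.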

Install

npm install context-steward

Docs and quickstart are on GitHub. Contributions welcome.
