Same work.
60% fewer tokens.
Terminal AI coding agent. Every prompt trimmed. Every reply terse. Every output compressed.
Terminal AI coding agent. Every prompt trimmed. Every reply terse. Every output compressed.
This data is based on 1 prompt to build this site.
BJIR wraps your existing CLI coding agent and applies aggressive token optimization at every layer: input, context, output, and tool I/O.
It runs locally. No proxy, no cloud, no data leaves your machine. Every compression technique is compiled in and runs in-process.
The result: same quality work, dramatically lower cost, faster responses, and sessions that last 2–3x longer before hitting limits.
Each optimization targets a different part of the token pipeline. Together they compound.
Drops stale conversation history each turn. Keeps what matters, discards what doesn't. The single biggest token saver, 249K tokens in one session.
Compresses web fetch results before they enter context. Strips HTML noise, collapses whitespace, extracts only relevant content from URLs.
Crushes JSON payloads to minimal form. Compresses model output: abbreviates prose, strips filler, enforces terse responses. 114 operations in a single session.
Intercepts bash command output and wraps shell processes (git, cargo, npm, docker). Keeps only actionable lines: errors, warnings, changed files. Drops repetitive noise.
Trims prompts before they reach the model. Strips redundant whitespace, collapses repeated context, compresses system instructions.
Caches file reads and skips re-sending content already seen this session. Small per-read savings that compound across large codebases.
One global command. Works on macOS, Linux, and WSL.
Run bjir in any project directory. Connects to your configured provider.
Same agent, same capabilities. Just fewer tokens per request.
Run bjir gain to see token reduction metrics for your session.
Every command runs locally. No telemetry, no cloud calls.