Same work.
60% fewer tokens.

Terminal AI coding agent. Every prompt trimmed. Every reply terse. Every output compressed.

GitHub

MIT Licensev1.1.3

Real numbers

This data is based on 1 prompt to build this site.

517.5K

tokens saved

39,710

context used

425

optimizations applied

What it does

Same coding agent.
Leaner every step.

BJIR wraps your existing CLI coding agent and applies aggressive token optimization at every layer: input, context, output, and tool I/O.

It runs locally. No proxy, no cloud, no data leaves your machine. Every compression technique is compiled in and runs in-process.

The result: same quality work, dramatically lower cost, faster responses, and sessions that last 2–3x longer before hitting limits.

Optimizations

Six layers of token compression.

Each optimization targets a different part of the token pipeline. Together they compound.

context management

33 ops249.0K saved

Context Prune

Drops stale conversation history each turn. Keeps what matters, discards what doesn't. The single biggest token saver, 249K tokens in one session.

web fetching

105 ops124.1K saved

I/O: Webfetch

Compresses web fetch results before they enter context. Strips HTML noise, collapses whitespace, extracts only relevant content from URLs.

output compression

114 ops15.1K saved

SmartCrush + Response Control

Crushes JSON payloads to minimal form. Compresses model output: abbreviates prose, strips filler, enforces terse responses. 114 operations in a single session.

shell & process wrapping

99 ops7.9K saved

I/O: Bash + RTK

Intercepts bash command output and wraps shell processes (git, cargo, npm, docker). Keeps only actionable lines: errors, warnings, changed files. Drops repetitive noise.

input trimming

39 ops10 saved

Prompt Refine

Trims prompts before they reach the model. Strips redundant whitespace, collapses repeated context, compresses system instructions.

read caching

35 ops18 saved

I/O: Read Dedup

Caches file reads and skips re-sending content already seen this session. Small per-read savings that compound across large codebases.

Works with

ClaudeOpenAIGeminiLlamaMistralGroqOllamaTogetherDeepSeekOpenRouterLM StudioClaudeOpenAIGeminiLlamaMistralGroqOllamaTogetherDeepSeekOpenRouterLM StudioClaudeOpenAIGeminiLlamaMistralGroqOllamaTogetherDeepSeekOpenRouterLM StudioClaudeOpenAIGeminiLlamaMistralGroqOllamaTogetherDeepSeekOpenRouterLM Studio

Get started

Install in 10 seconds.

Install the package

One global command. Works on macOS, Linux, and WSL.

Launch BJIR

Run bjir in any project directory. Connects to your configured provider.

Start coding

Same agent, same capabilities. Just fewer tokens per request.

Check your savings

Run bjir gain to see token reduction metrics for your session.

Install from source

$ git clone https://github.com/gogetrekt/bjir.git

$ cd bjir

$ cargo install --path .

CLI Reference

Commands.

Every command runs locally. No telemetry, no cloud calls.

Command	Description
bjir	Launch the interactive agent
bjir run	Run agent in single-shot mode
bjir gain	Show token savings analytics
bjir gain --history	Show command history with savings
bjir discover	Analyze history for missed optimization opportunities
bjir proxy <cmd>	Execute raw command without filtering (debugging)
bjir --version	Print version number
bjir --help	Show help and usage info

Same work.60% fewer tokens.

Same coding agent.Leaner every step.