One URL change. Zero agent modification.
Works with any LLM β local or cloud.
Open source Β· Free forever Β· Apache 2.0
Most agent frameworks rebuild the entire context on every request β system prompts, tool schemas, conversation history β regardless of what you asked.
Some agents truncate aggressively, losing important information. Others send everything, drowning the model in irrelevant context. Neither approach scales.
Sieve replaces both with intelligent retrieval β sending only what matters, without losing what's important.
Sieve sits transparently between your agent and your LLM. Instead of truncating or bloating, it retrieves only the relevant context from a structured memory store β delivering a lean, precise payload every time.
Validated across hundreds of queries over 30+ simulated days with cross-family grading
Baseline models degrade as conversation grows. Sieve improves.
Don't trust our numbers? Run sieve benchmark on your own machine.
Excellent projects solving different facets of the same problem. Here's where Sieve fits.
| Approach | What it does well | Integration | Sieve adds |
|---|---|---|---|
| Raw agent context | Simple, no setup needed | N/A | Reduces bloat without changing the agent |
| Agent + compaction | Keeps context manageable | Built-in | More precise retrieval vs crude truncation |
| RAG systems | Document retrieval at scale | Requires SDK integration | Transparent proxy β no code changes |
| Virtual context managers | Sophisticated memory management | Requires SDK changes | Drop-in proxy, works alongside |
| Sieve | Token reduction + structured memory | Transparent proxy | β |
Sieve is complementary. It works alongside any of these approaches β reducing what gets sent to the model regardless of how the context was assembled.
Sieve's context reduction improves with every conversation
Fewer tokens per request = lower API costs. Plus memory and anti-hallucination that cloud APIs don't provide.
Sieve is released under the Apache 2.0 licence. No hidden costs, no usage limits, no telemetry, no data collection. Your memory store stays on your machine β encrypted, private, and entirely under your control.