TL;DR
Semble is a code search tool for AI agents that claims to cut token usage by about 98% compared with grep+read while maintaining retrieval accuracy. It indexes repositories quickly and answers queries locally, without external services. If the claims hold, it could improve agent performance and cost-efficiency.
Semble, a new code search library tailored for AI agents, claims to reduce token consumption by approximately 98% compared to traditional grep+read methods, enabling faster and more efficient code retrieval without external dependencies.
Developed to serve agents such as Claude, Codex, and Cursor, Semble indexes an average repository in around 250 milliseconds and answers queries in approximately 1.5 milliseconds, all on CPU and without API keys, GPUs, or external services. According to its developers, it does so while keeping retrieval quality comparable to code-specialized transformer models, with a benchmark NDCG@10 score of 0.854.
Semble can be run as an MCP server or from the command line, and it accepts local paths or remote git repositories. Its features include semantic code search, related code discovery, and automatic re-indexing on file changes. Because it returns only the relevant code chunks rather than whole files, the agent reads far fewer tokens per search.
Why It Matters
This development matters because it addresses key challenges in AI code understanding: lower token usage cuts cost and latency, which makes code search within AI agents more scalable and responsive. Because it needs neither external services nor a GPU, Semble can be deployed locally, which broadens its potential use cases.
By maintaining high accuracy with minimal token consumption, Semble could enhance the performance of AI-powered development tools, improve developer workflows, and reduce operational costs in environments where token limits and latency are critical.
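To put the 98% figure in concrete terms, the following back-of-the-envelope sketch applies the claimed reduction to a hypothetical baseline. The 50,000-token baseline and the price per million tokens are made-up illustrative values, not Semble measurements, and the function names are invented for this example.

```python
def tokens_after_reduction(baseline_tokens: int, reduction: float = 0.98) -> int:
    """Tokens left after applying a fractional reduction to a baseline."""
    return round(baseline_tokens * (1.0 - reduction))


def cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of a token count at a given (hypothetical) per-million price."""
    return tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical numbers, for illustration only.
baseline = 50_000   # tokens an agent might spend on grep+read for one task
price = 3.0         # assumed USD per million input tokens

reduced = tokens_after_reduction(baseline)
print(reduced)  # 1000
print(cost_usd(baseline, price), cost_usd(reduced, price))
```

Under these assumptions, a task that would consume 50,000 tokens with grep+read would need about 1,000 tokens, and the cost falls by the same factor of fifty.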

Background
Traditional code search methods like grep are simple but can be inefficient and token-heavy when integrated into AI workflows. Recent advances have focused on transformer-based models for code retrieval, but these often require significant computational resources. Semble emerges as a lightweight alternative, leveraging efficient indexing and retrieval techniques. Its announcement follows ongoing efforts to optimize AI tooling for code comprehension and search, especially for agent-based applications.
“Semble indexes an average repo in ~250 ms and answers queries in ~1.5 ms, all on CPU, with 98% fewer tokens than grep+read.”
— Semble development team
“Semble achieves a NDCG@10 score of 0.854, on par with code-specialized transformer models.”
— Benchmark source (unspecified)
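For readers unfamiliar with the metric, NDCG@10 scores a ranked result list against relevance judgments, discounting relevant hits that appear lower in the top ten. The sketch below uses the standard linear-gain formulation. The function names and the toy relevance labels are illustrative, and nothing here reflects how the Semble benchmark was actually run.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k relevance scores."""
    # Rank 1 gets discount log2(2) = 1, rank 2 gets log2(3), and so on.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances, judged_relevances, k=10):
    """NDCG@k: DCG of the returned ranking divided by the best achievable DCG.

    ranked_relevances: relevance of each returned result, in ranked order.
    judged_relevances: relevance of every judged item for the query (any order).
    """
    ideal = dcg_at_k(sorted(judged_relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant items exist, so the score is defined as 0
    return dcg_at_k(ranked_relevances, k) / ideal


# Toy query with two relevant chunks (1) and one irrelevant chunk (0).
judged = [1, 1, 0]
print(ndcg_at_k([1, 1, 0], judged))  # perfect ordering
print(ndcg_at_k([1, 0, 1], judged))  # a relevant chunk pushed down to rank 3
```

A score of 1.0 means every relevant item is ranked as high as possible. A benchmark figure such as 0.854 would typically be this value averaged over many queries.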

What Remains Unclear
It is not yet clear how Semble performs across diverse codebases or in real-world agent deployments, and whether there are limitations in certain programming languages or project sizes.
What’s Next
Next steps include broader testing of Semble in various environments, integration with more agents, and potential open-source adoption. Further benchmarks and user feedback will clarify its effectiveness and limitations.

Key Questions
How does Semble achieve such a high reduction in token usage?
Semble returns only the code chunks relevant to a query instead of whole files. Combined with efficient indexing and semantic retrieval, this reportedly uses approximately 98% fewer tokens than traditional grep+read methods.
Can Semble be used with any codebase or programming language?
Semble supports local directories and git repositories, but its performance across different languages and project types remains to be fully tested. It is designed to be language-agnostic, relying on indexing and semantic search.
Does Semble require external services or GPUs?
No. Semble runs entirely on CPU, with no API keys, GPUs, or external dependencies needed, making it accessible for local deployment.
How does Semble compare to transformer-based code search models?
According to benchmarks, Semble offers similar retrieval quality (NDCG@10 of 0.854) but at a fraction of the size and computational cost, with much faster indexing and querying times.
What are the next steps for developers interested in using Semble?
Developers can install Semble via pip or uv, integrate it into their agents or scripts, and monitor token savings and performance improvements as they adopt the tool.