ragpilot · Orilyon

// THE PROBLEM

Agents drown in context they don't need

When an LLM agent works on a codebase, the naive approach is to read whole files into the prompt. That burns tokens, blows past context limits on large repos, and buries the relevant lines in noise. ragpilot flips this: it builds a semantic and structural index of your code and serves precise, ranked snippets on demand, so the agent sees the relevant code instead of whole files, and your source never leaves the machine.

// CAPABILITIES

What ragpilot does

Semantic code search

Embedding-based retrieval over a local vector store finds code by meaning, not just keywords — so "where do we validate auth tokens" returns the right function even if the wording differs.
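
A minimal sketch of the ranking idea behind embedding-based search, not ragpilot's actual code. The three-dimensional vectors are hand-written toys and the chunk names are hypothetical; in practice the vectors would come from the local embedding model and the lookup would go through the vector store rather than a linear scan.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank(query_vec, chunks, top_k=2):
    """Return the top_k (score, chunk_id) pairs, best match first."""
    scored = [(cosine(query_vec, vec), chunk_id) for chunk_id, vec in chunks.items()]
    return sorted(scored, reverse=True)[:top_k]

# Toy "embeddings" (assumption); real ones come from the local model.
chunks = {
    "auth/verify_token": [0.9, 0.1, 0.0],
    "ui/render_button": [0.0, 0.2, 0.9],
    "auth/login": [0.7, 0.6, 0.1],
}
query = [1.0, 0.2, 0.0]  # stands in for "where do we validate auth tokens"
results = rank(query, chunks)
```

The query shares no keywords with the chunk names, yet the nearest vector is the token-validation chunk, which is the behavior the paragraph above describes.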

Model Context Protocol server

Speaks MCP natively, so Claude Code and other MCP-aware agents can call its search, navigation, and impact tools directly — no glue code, no custom adapters.
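
For a sense of what an agent sends, here is a sketch of an MCP tool call. MCP frames tool invocations as JSON-RPC 2.0 requests with the method `tools/call`; the tool name `search` and the argument names `query` and `limit` are assumptions made for illustration, not ragpilot's documented schema.

```python
import json

def tool_call(request_id, tool, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

# "search", "query" and "limit" are hypothetical names for illustration.
request = tool_call(1, "search", {"query": "where do we validate auth tokens", "limit": 5})
wire = json.dumps(request)  # what would travel over the MCP transport
```

An MCP-aware agent builds this message itself once the server advertises its tools, which is why no adapter code is needed on the agent side.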

Tree-sitter symbol graph

Parses code structurally to resolve symbols, build call graphs, and return whole definitions — giving agents precise navigation instead of fuzzy line ranges.
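
The sketch below shows the kind of call graph a structural parse makes possible. It uses Python's built-in `ast` module as a stand-in for Tree-sitter, so it only handles Python source and only direct calls by name; the functions in `SOURCE` are hypothetical.

```python
import ast

SOURCE = '''
def validate(token):
    return decode(token) is not None

def decode(token):
    return token or None

def handler(request):
    return validate(request)
'''

def call_graph(source):
    """Map each top-level function name to the set of names it calls."""
    graph = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            graph[node.name] = {
                call.func.id
                for call in ast.walk(node)
                if isinstance(call, ast.Call) and isinstance(call.func, ast.Name)
            }
    return graph

graph = call_graph(SOURCE)
```

Because the graph is keyed by symbol rather than by line range, an agent can ask "who calls `decode`" without reading any file in full.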

Pre-refactor impact analysis

Before a change ships, ask ragpilot what it touches. It traces dependents and flags likely breaking changes, so agents plan refactors instead of discovering damage after the fact.
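
One way to picture dependent tracing is a breadth-first walk over the reversed call graph. This is a sketch under that assumption, not ragpilot's algorithm, and the symbol names are hypothetical.

```python
from collections import deque

def dependents(calls, changed):
    """Return every symbol that directly or transitively calls `changed`."""
    # Invert caller -> callees into callee -> callers.
    callers = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    seen, queue = set(), deque([changed])
    while queue:
        for caller in callers.get(queue.popleft(), ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen

# Hypothetical call graph: caller -> set of callees.
calls = {
    "handler": {"validate"},
    "validate": {"decode"},
    "decode": set(),
    "render": set(),
}
impacted = dependents(calls, "decode")
```

Changing `decode` reaches `validate` directly and `handler` transitively, while `render` stays out of the blast radius.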

Incremental, real-time indexing

A filesystem watcher and Git hooks re-index only what changed. The index stays current as you work, with no full re-embed on every edit.
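
A sketch of how "re-index only what changed" can be decided. Comparing content digests is an assumption chosen for illustration; the page only states that a watcher and Git hooks drive the updates. File names and contents are hypothetical.

```python
import hashlib

def digest(text):
    """Stable fingerprint of a file's contents."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def changed_files(indexed, current):
    """Compare stored digests with current contents.

    indexed: path -> digest recorded at last indexing
    current: path -> file text as it is now
    Returns (paths to re-embed, paths to drop from the index).
    """
    to_reindex = sorted(p for p, text in current.items() if indexed.get(p) != digest(text))
    to_remove = sorted(p for p in indexed if p not in current)
    return to_reindex, to_remove

indexed = {
    "a.rs": digest("fn a() {}"),
    "b.rs": digest("fn b() {}"),
    "old.rs": digest("fn old() {}"),
}
current = {
    "a.rs": "fn a() {}",           # unchanged
    "b.rs": "fn b() { todo!() }",  # edited
    "c.rs": "fn c() {}",           # new
}
to_reindex, to_remove = changed_files(indexed, current)
```

Only the edited and new files are re-embedded; the unchanged file costs one hash comparison.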

Zero cloud data leakage

Embeddings run with a local model and the index is stored on disk. Your proprietary code is never sent to a third-party API — safe for regulated and air-gapped environments.

// HOW IT WORKS

From repository to retrieval in four steps

Parse & chunk

ragpilot walks the repo, parses each file with Tree-sitter, and splits it into structural chunks (functions, classes, blocks) instead of arbitrary line windows — keeping semantic units intact.
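
To make the difference from line windows concrete, here is a sketch of structural chunking. Python's `ast` module stands in for a Tree-sitter grammar (an assumption that limits this sketch to Python input), and it only chunks top-level functions and classes.

```python
import ast

SOURCE = '''import os

def load(path):
    with open(path) as fh:
        return fh.read()

class Index:
    def add(self, chunk):
        self.chunks.append(chunk)
'''

def structural_chunks(source):
    """Split source into one chunk per top-level function or class."""
    lines = source.splitlines()
    chunks = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "symbol": node.name,
                "start": node.lineno,    # 1-based, inclusive
                "end": node.end_lineno,  # 1-based, inclusive
                "text": "\n".join(lines[node.lineno - 1:node.end_lineno]),
            })
    return chunks

chunks = structural_chunks(SOURCE)
```

Each chunk begins and ends on a definition boundary, so a function body is never split across two embeddings the way a fixed-size window could split it.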

Embed & store

Each chunk is embedded with a local model and written to a vector store, alongside a SQLite metadata index that tracks symbols, ranges, and file state for fast incremental updates.
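
A sketch of what the SQLite side of such an index could look like. The table and column names are hypothetical, not ragpilot's schema; the point is that symbols, line ranges, and file state sit next to a pointer into the vector store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE files (
    path         TEXT PRIMARY KEY,
    content_hash TEXT NOT NULL,
    dirty        INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE symbols (
    name       TEXT NOT NULL,
    kind       TEXT NOT NULL,
    path       TEXT NOT NULL REFERENCES files(path),
    start_line INTEGER NOT NULL,
    end_line   INTEGER NOT NULL,
    vector_id  TEXT NOT NULL  -- id of the matching point in the vector store
);
""")

# Illustrative rows; the path, hash and vector id are made up.
conn.execute("INSERT INTO files VALUES (?, ?, 0)", ("src/auth.rs", "abc123"))
conn.execute(
    "INSERT INTO symbols VALUES (?, ?, ?, ?, ?, ?)",
    ("verify_token", "function", "src/auth.rs", 10, 42, "vec-001"),
)

row = conn.execute(
    "SELECT path, start_line, end_line FROM symbols WHERE name = ?",
    ("verify_token",),
).fetchone()
```

Keeping a per-file hash and dirty flag in the same database is what lets an incremental update find stale symbols without touching the embeddings of unchanged files.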

Serve over MCP

The server exposes tools (search, symbol resolve, call graph, impact analyze, context bundle) that an agent invokes through the Model Context Protocol, with each response held to a strict token budget.
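
A token budget can be enforced with a simple greedy pass over ranked results. The sketch below is an assumption about how such a budget might work, and it counts whitespace-separated words where a real server would use the model's tokenizer.

```python
def pack(snippets, budget, count_tokens=lambda text: len(text.split())):
    """Greedily keep ranked snippets while the running total fits the budget."""
    bundle, used = [], 0
    for snippet in snippets:  # assumed sorted best-first
        cost = count_tokens(snippet)
        if used + cost > budget:
            continue  # skip what does not fit; a smaller one may still fit
        bundle.append(snippet)
        used += cost
    return bundle, used

# Hypothetical ranked search results, best match first.
ranked = [
    "fn verify_token(t: &str) -> bool { decode(t).is_some() }",
    "fn decode(t: &str) -> Option<Claims> { parse(t).ok() }",
    "fn login() {}",
]
bundle, used = pack(ranked, budget=14)
```

With a budget of 14, the top result is kept, the second is skipped because it would overflow, and the short third result still fits.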

Stay in sync

A watcher marks files dirty on change and Git hooks trigger re-indexing, so the next query always reflects the latest code without a manual rebuild.
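
The dirty-marking flow can be sketched as a small state machine. Everything here is illustrative: the class, its method names, and the choice to flush pending changes before answering a query are assumptions, and substring matching stands in for real retrieval.

```python
class Index:
    """Toy index: watcher events mark paths dirty, flush() re-indexes them."""

    def __init__(self, read_file):
        self.read_file = read_file  # callable: path -> current text
        self.contents = {}          # path -> text as of last indexing
        self.dirty = set()
        self.reindexed = []         # log of re-index work, for illustration

    def mark_dirty(self, path):
        self.dirty.add(path)

    def flush(self):
        """Re-index only the dirty paths, then clear the dirty set."""
        for path in sorted(self.dirty):
            self.contents[path] = self.read_file(path)
            self.reindexed.append(path)
        self.dirty.clear()

    def query(self, needle):
        self.flush()  # assumption: queries flush pending changes first
        return sorted(p for p, text in self.contents.items() if needle in text)

disk = {"a.rs": "fn alpha() {}", "b.rs": "fn beta() {}"}
index = Index(disk.__getitem__)
for path in disk:
    index.mark_dirty(path)
index.flush()                  # initial indexing

disk["b.rs"] = "fn gamma() {}"  # an edit on disk...
index.mark_dirty("b.rs")        # ...reported by the watcher
hits = index.query("gamma")
```

The edit to `b.rs` is visible to the very next query, and `a.rs` is never re-read after the initial pass.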

// AT A GLANCE

Technical profile

Language: Rust (single static binary, no runtime dependencies)

Vector store: Qdrant for embeddings; SQLite for symbol and file metadata

Parsing: Tree-sitter grammars for multi-language structural chunking

Interface: Model Context Protocol (MCP) server and CLI

Privacy: Local embedding model, on-disk index, zero external calls

License: MIT (auditable and free to self-host)

Rust · Qdrant · SQLite · Tree-sitter · MCP

// GET STARTED

Run ragpilot on your own code

Clone it, index your repo, and point your agent at the MCP endpoint. It's MIT-licensed and runs entirely offline.

github.com/alikaya/ragpilot