{"id":5924,"date":"2026-06-22T15:49:52","date_gmt":"2026-06-22T08:49:52","guid":{"rendered":"https:\/\/daiilynews.cu.ma\/?p=5924"},"modified":"2026-06-22T15:49:52","modified_gmt":"2026-06-22T08:49:52","slug":"xfloukiex-lab-magpie-search-federated-local-first-search-for-an-ai-one-query-across-transcripts-files-knowledge-graph-vector-store-and-the-web-fused-by-trust-weighted-rrf-apache-2-0","status":"publish","type":"post","link":"https:\/\/daiilynews.cu.ma\/?p=5924","title":{"rendered":"xfloukiex-lab\/magpie-search: Federated, local-first search for an AI \u2014 one query across transcripts, files, knowledge graph, vector store, and the web, fused by trust-weighted RRF. Apache-2.0. \u00b7 GitHub"},"content":{"rendered":"<p> <br \/>\n<\/p>\n<p>A federated search engine \u2014 the search engine an AI agent or LLM reaches for when it needs to find something true to reason over.<\/p>\n<p>Ever had your computer reboot on you, or a power outage hit mid-session? Every<br \/>\nthread your agent was holding \u2014 gone. Now you have the tool to get it back.<br \/>\nNever forget what your agent lost again. Magpie indexes everything your AI<br \/>\nhas ever worked through, locally, so a crash is a hiccup instead of amnesia.<\/p>\n<p>A normal search engine looks in one place. Magpie takes one question and fans it<br \/>\nacross everything that matters at once \u2014 the AI&#8217;s entire conversation history,<br \/>\nthe files on the machine, a structured knowledge graph, a vector store, and the live<br \/>\nweb \u2014 and pulls the answer back from wherever it actually lives.<br \/>\nFive sources, one call.<br \/>\nAnd it searches each one the right way. It can grep for an exact string or regex<br \/>\nwhen you know the precise token \u2014 a file path, an error, a line of code. It can<br \/>\nsearch by keyword. It can search by meaning, so it finds the thing even when the<br \/>\nwords don&#8217;t match. 
It can do all of that at once.<br \/>\nThen it does the part that makes it trustworthy: it fuses everything into a<br \/>\nsingle ranked answer, and every result carries a trust tier \u2014 fact &gt; reference &gt; lead &gt; stale. The solid sources rise, the loose ones are marked as<br \/>\nleads to verify, duplicates collapse, and it&#8217;s all trimmed to fit so it never<br \/>\nfloods the AI&#8217;s context. Ask it to go deep and it expands one question into many,<br \/>\nreads the pages, and tells you how many independent sources agree \u2014 a full<br \/>\nresearch sweep without an army of agents.<br \/>\nIt runs entirely on the machine. No server, no account, and no telemetry<br \/>\nunless you turn it on. The AI&#8217;s transcripts and files never leave. It plugs into<br \/>\nwhatever AI is running over MCP, so the agent can reach all five sources the<br \/>\ninstant it needs them.<br \/>\nIt is a tool for an AI \u2014 an agent or an LLM.<\/p>\n<p>At its core is a local index of the AI&#8217;s transcripts: a SQLite database with two<br \/>\nstructures built side by side \u2014<\/p>\n<p>an FTS5 full-text index (BM25 keyword ranking), and<br \/>\na vector index (sqlite-vec) of 384-dim embeddings produced locally by a<br \/>\nsmall all-MiniLM-L6-v2 model.<\/p>\n<p>Everything is redacted at ingest \u2014 a scrubber strips ~30 classes of secrets<br \/>\n(keys, tokens, private keys, connection strings) before a single byte hits the<br \/>\nindex.<br \/>\nOn top of that index sit the five search modes:<\/p>\n<table>\n<thead>\n<tr><th>Mode<\/th><th>What it does<\/th><\/tr>\n<\/thead>\n<tbody>\n<tr><td>grep<\/td><td>literal \/ regex match (exact tokens: paths, errors, code)<\/td><\/tr>\n<tr><td>lexical<\/td><td>FTS5 \/ BM25 keyword<\/td><\/tr>\n<tr><td>semantic<\/td><td>embedding K-NN, cosine distance in the vector index<\/td><\/tr>\n<tr><td>hybrid<\/td><td>lexical + semantic fused by RRF<\/td><\/tr>\n<tr><td>rerank<\/td><td>hybrid, then a cross-encoder (jina-reranker) re-scores each candidate<\/td><\/tr>\n<\/tbody>\n<\/table>\n<p>Around that sits the federation layer \u2014 the part that makes it 
federated:<\/p>\n<p>A provider plugin system. Five backends (transcripts, files, knowledge<br \/>\ngraph, vector, web), each returning Hit objects tagged with a trust<br \/>\ntier.<br \/>\nA fan-out: one query goes to all providers concurrently (\u22648 workers), each<br \/>\nwith a 5-second timeout that fails open \u2014 a slow source contributes nothing<br \/>\nrather than blocking the call.<br \/>\nTrust-weighted RRF fusion \u2014 Reciprocal Rank Fusion where each source&#8217;s rank<br \/>\ncontribution is multiplied by its trust weight (fact \u00d73, reference \u00d72, lead \u00d71, stale \u00d70.3), damping constant 60. This is the math that merges five heterogeneous<br \/>\nsources into one honest ranking.<br \/>\nCross-source dedup by content hash \u2014 the same fact found in three places<br \/>\ncollapses to one hit, tagged with where else it appeared (corroboration).<br \/>\nA token-budget trim, so the merged set never overflows the calling AI&#8217;s<br \/>\ncontext.<\/p>\n<p>And it exposes all of this to an AI over an MCP server \u2014 the tools it hands<br \/>\nan agent are exactly: search, recent, session, list_sessions, stats,<br \/>\nreindex. Note what&#8217;s not in that list: nothing that writes an answer.<\/p>\n<p>RAG = Retrieval-Augmented Generation. It&#8217;s a two-stage pipeline, and the<br \/>\ndefining stage is the second one: a retriever finds chunks \u2192 they&#8217;re stuffed into<br \/>\na prompt \u2192 a language model generates the prose answer. The &#8220;G&#8221; is the whole<br \/>\npoint of the name; without a generator writing the answer, it isn&#8217;t RAG.<br \/>\nMagpie has no G:<\/p>\n<p>There is no generator anywhere in the search path. Nothing in Magpie<br \/>\ncomposes a natural-language answer. The closest thing to a model \u2014 the<br \/>\ncross-encoder reranker \u2014 outputs a relevance number per result and reorders<br \/>\nthe list. 
It scores; it never writes a sentence.<br \/>\nIt stops at &#8220;here are the ranked hits.&#8221; A RAG owns the prompt assembly and<br \/>\nthe model call. Magpie returns the fused, trust-ranked results and hands them<br \/>\nback through MCP. What the AI does next \u2014 whether it even generates anything \u2014<br \/>\nis the AI&#8217;s job, outside Magpie.<br \/>\nIts retriever is more than a RAG&#8217;s retriever, not less. A textbook RAG<br \/>\nretriever is one vector store: embed the query, top-k by cosine, done.<br \/>\nMagpie&#8217;s retrieval is five sources, five modes, trust-weighted fusion, and<br \/>\ncross-source dedup. It&#8217;s a far more capable &#8220;R&#8221; \u2014 but it&#8217;s still only the R.<\/p>\n<p>Plug Magpie into an AI and the pair can form a RAG \u2014 Magpie is the R, the AI you<br \/>\nbring is the G. But Magpie by itself ships only the R, and a stronger R than<br \/>\nusual. It finds and ranks the truth; it never generates the answer.<br \/>\nDeep web search \u2014 research breadth without the token bill<br \/>\nThe expensive part of &#8220;deep research&#8221; is reasoning, and the multi-agent<br \/>\napproach pays for it N times over \u2014 one full LLM context per agent, often<br \/>\nmillions of tokens for a single question. But reasoning doesn&#8217;t need to fan<br \/>\nout; one capable model already in context can synthesize. Only the searching<br \/>\nneeds breadth \u2014 and searching the web is pure retrieval, zero LLM tokens.<br \/>\nmagpie-search deepweb is built on that asymmetry. 
It fires several sub-queries<br \/>\nat the web in parallel, fuses them by trust-weighted RRF + dedup-by-URL into one<br \/>\ncompact, token-budget-trimmed source set, optionally reads the top pages&#8217; text<br \/>\n(still token-free), and reports how many independent domains corroborate the<br \/>\nresult \u2014 an agent-free version of the verification a research swarm pays agents<br \/>\nto do.<br \/>\nSo you get the breadth, page-reading, and corroboration of a multi-agent deep<br \/>\nsearch, but your model only pays for a single synthesis pass over a trimmed<br \/>\nresult set.<br \/>\nToken cost, measured \u2014 one deep question:<\/p>\n<table>\n<thead>\n<tr><th>Approach<\/th><th>Tokens the model pays<\/th><\/tr>\n<\/thead>\n<tbody>\n<tr><td>Multi-agent deep-research swarm (N agents each read pages into their own context)<\/td><td>~2,000,000<\/td><\/tr>\n<tr><td>magpie-search deepweb --thorough (6 angles \u2192 12 sources, 12 full pages read)<\/td><td>~1,050<\/td><\/tr>\n<\/tbody>\n<\/table>\n<p>That&#8217;s ~2,000\u00d7 fewer tokens \u2014 about 1\/2000th the cost \u2014 because the searching<br \/>\nand page-reading are pure retrieval (zero model tokens); your model only does<br \/>\nthe final synthesis pass over the trimmed, corroborated set.<br \/>\n# one question, several angles, read the top pages \u2014 all token-free retrieval<br \/>\nmagpie-search deepweb &quot;the question&quot; --q &quot;another angle&quot; --q &quot;a third angle&quot; --thorough<br \/>\nThe model in your loop then does one synthesis pass over the merged, corroborated<br \/>\nset. That&#8217;s the whole saving: the breadth is free, you pay only for the answer.<\/p>\n<p>pip install magpie-search<br \/>\nOr install the latest straight from source (pulls all dependencies):<br \/>\npip install &quot;git+https:\/\/github.com\/xfloukiex-lab\/magpie-search.git&quot;<br \/>\nOptional \u2014 add the local-LLM features (the cross-encoder reranker runs on the<br \/>\nbase install; the session summarizer needs Ollama):<br \/>\n# 1. 
Install Ollama (free, runs entirely locally) \u2014 https:\/\/ollama.com\/download<br \/>\n# 2. Pull the model magpie-search uses<br \/>\nollama pull phi3.5<br \/>\nPython 3.10+ on Windows, macOS, and Linux.<\/p>\n<p>magpie-search index                               # build the index (incremental)<br \/>\nmagpie-search search &quot;that retry backoff thing&quot;   # keyword search<br \/>\nmagpie-search search --mode hybrid &quot;&#8230;&quot;          # keyword + semantic, fused<br \/>\nmagpie-search search --mode rerank &quot;&#8230;&quot;          # + cross-encoder rerank<br \/>\nmagpie-search stats                               # sanity-check the index<br \/>\nConnect it to your AI (MCP)<br \/>\nMagpie speaks the Model Context Protocol, so any MCP-capable agent can call it.<br \/>\nPoint your client at the bundled server:<\/p>\n<p>The agent then has search, recent, session, list_sessions, stats, and<br \/>\nreindex available \u2014 federated, trust-ranked, context-budgeted.<\/p>\n<table>\n<thead>\n<tr><th>Command<\/th><th>What it does<\/th><\/tr>\n<\/thead>\n<tbody>\n<tr><td>magpie-search index<\/td><td>Incremental indexing pass over ~\/.claude\/projects\/<\/td><\/tr>\n<tr><td>magpie-search search &quot;q&quot;<\/td><td>Search (--mode grep|lexical|semantic|hybrid|rerank)<\/td><\/tr>\n<tr><td>magpie-search recent --n 30<\/td><td>Latest 30 messages of the newest session<\/td><\/tr>\n<tr><td>magpie-search session SESSION-ID<\/td><td>Full transcript of one session<\/td><\/tr>\n<tr><td>magpie-search list<\/td><td>Recent sessions<\/td><\/tr>\n<tr><td>magpie-search stats<\/td><td>Index size, last-indexed time, row counts<\/td><\/tr>\n<tr><td>magpie-search backup<\/td><td>Back up ~\/.claude\/projects\/ to a configurable destination<\/td><\/tr>\n<\/tbody>\n<\/table>\n<p>Add --help to any command for full options.<\/p>\n<p>import magpie_search<\/p>\n<p>results = magpie_search.search(&quot;retry backoff&quot;, mode=&quot;hybrid&quot;, k=5)<br \/>\nfor h in results[&quot;hits&quot;]:<br \/>\n    print(h[&quot;trust&quot;], h[&quot;source&quot;], h[&quot;snippet&quot;])<\/p>\n<p># LLM features 
(needs Ollama + phi3.5)<br \/>\nimport magpie_search.llm<br \/>\nranked  = magpie_search.llm.search_rerank(query=&quot;retry backoff&quot;, k=3, pool=10)<br \/>\nsummary = magpie_search.llm.summarize(session_id=&quot;abc-123&quot;, n_messages=80)<\/p>\n<p>magpie-search backup copies your transcript tree to a destination of your<br \/>\nchoice \u2014 a local folder (default, zero config), a remote SSH target (NAS \/ home<br \/>\nserver), or a remote SSH target with VM boot\/suspend. Configure it in<br \/>\n~\/.magpie-search\/backup.env:<br \/>\nMAGPIE_SEARCH_BACKUP_SSH_HOST=user@nas.local<br \/>\nMAGPIE_SEARCH_BACKUP_SSH_DEST=~\/claude-transcripts\/<br \/>\nUseful flags: --dry-run, --no-suspend, --show-config. Backup copies; it<br \/>\nnever deletes originals.<\/p>\n<p>Everything is environment-variable driven with sensible defaults.<\/p>\n<table>\n<thead>\n<tr><th>Variable<\/th><th>Default<\/th><th>Purpose<\/th><\/tr>\n<\/thead>\n<tbody>\n<tr><td>MAGPIE_SEARCH_HOME<\/td><td>~\/.magpie-search<\/td><td>Data directory (DB, models, logs)<\/td><\/tr>\n<tr><td>MAGPIE_SEARCH_MODELS_DIR<\/td><td>$MAGPIE_SEARCH_HOME\/models<\/td><td>fastembed model cache<\/td><\/tr>\n<tr><td>MAGPIE_SEARCH_OLLAMA_HOST<\/td><td>http:\/\/localhost:11434<\/td><td>Ollama server URL<\/td><\/tr>\n<tr><td>MAGPIE_SEARCH_TOKENIZER<\/td><td>heuristic<\/td><td>Set to tiktoken for precise budget counting<\/td><\/tr>\n<tr><td>MAGPIE_SEARCH_AUDIT_LOG<\/td><td>$MAGPIE_SEARCH_HOME\/llm-audit.jsonl<\/td><td>Per-call audit log<\/td><\/tr>\n<\/tbody>\n<\/table>\n<p>The summarizer&#8217;s output passes through a six-probe guardrail stack (length,<br \/>\nproper-noun-safety, identifier-safety, refusal-drift, semantic-grounding,<br \/>\nself-verify); all six must pass for trust: clean. Any failure suppresses the<br \/>\nsummary and returns trust: degraded \u2014 quiet over wrong. Raw messages stay<br \/>\naccessible via magpie-search session SESSION-ID.<\/p>\n<p>Magpie Search is a local tool. No server, no account, no auto-update, no crash<br \/>\nreporter, and no telemetry unless you explicitly opt in (see below). 
Your<br \/>\ntranscripts, the index, the audit log, the model cache, and the backups all live<br \/>\non your machine.<br \/>\nOpt-in telemetry. Telemetry is off by default \u2014 magpie sends nothing<br \/>\nuntil you run magpie-search telemetry enable (or set<br \/>\nMAGPIE_SEARCH_TELEMETRY=1). When on, it sends only anonymous usage: which<br \/>\ncommand ran, search mode, result\/hit counts, latency, error class, and your<br \/>\nmagpie\/python\/OS versions, tagged with a random install id. It never sends<br \/>\nyour queries, file paths, results, transcript content, username, or IP \u2014 a<br \/>\nhard content firewall in telemetry.py drops anything that isn&#8217;t a number or a<br \/>\nshort enum token. Disable anytime with magpie-search telemetry disable; check<br \/>\nstate with magpie-search telemetry status. The only<br \/>\nnetwork calls it ever makes are: your local Ollama server (LLM features), your<br \/>\nown backup target (only when you run backup), and a one-time model download<br \/>\nfrom Hugging Face on first run. Verify it yourself with tcpdump, Wireshark, or<br \/>\na network-blocked sandbox.<\/p>\n<p>Run magpie-search index (and optionally backup) on a schedule. Ready-made<br \/>\nunits live in installers\/ for systemd (Linux), launchd (macOS),<br \/>\nand Task Scheduler (Windows).<\/p>\n<p>&#8220;rsync not on PATH&#8221; \u2014 falls back to scp -r. On Windows, install<br \/>\nGit for Windows, which ships rsync.<br \/>\nSearch returns nothing \u2014 run magpie-search stats; if last_indexed_at is<br \/>\nnull, run magpie-search index.<br \/>\nSummarizer always degraded \u2014 that&#8217;s the false-positive guard working as<br \/>\ndesigned. Raw transcripts remain available via session SESSION-ID.<\/p>\n<p>Magpie Search is built by VektorGeist LLC.<br \/>\nWe build local-first tools for people who run their own AI. 
Magpie is the search<br \/>\ncore; our agent platform is at vektorgeist.com.<\/p>\n<p>Licensed under the Apache License 2.0 \u2014 see LICENSE.<br \/>\nCopyright \u00a9 2026 VektorGeist LLC.<br \/>\n&#8220;Magpie Search&#8221; and the magpie mark are trademarks of VektorGeist LLC. The code<br \/>\nis open under Apache-2.0; the brand and name are reserved.<br \/>\n<br \/><br \/>\n<br \/><a href=\"https:\/\/github.com\/xfloukiex-lab\/magpie-search\">Source link <\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>A federated search engine \u2014 the search engine an AI agent or LLM reaches for when it needs to find something true to reason over. Ever had your computer reboot on you, or a power outage hit mid-session? Every thread your agent was holding \u2014 gone. Now you have the tool to get it back. [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":5925,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[676],"tags":[],"class_list":["post-5924","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tech-ai"],"_links":{"self":[{"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/posts\/5924","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5924"}],"version-history":[{"count":0,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/posts\/5924\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/media\/5925"}],"wp:attachment":[{"href":"https:\/\/daiilyne
ws.cu.ma\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5924"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5924"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5924"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}