{"id":6259,"date":"2026-06-28T20:58:39","date_gmt":"2026-06-28T13:58:39","guid":{"rendered":"https:\/\/daiilynews.cu.ma\/?p=6259"},"modified":"2026-06-28T20:58:39","modified_gmt":"2026-06-28T13:58:39","slug":"thefullnacho-hestia-local-first-self-hosted-home-records-assistant-one-llm-brain-scoped-tools-durable-memory-agpl-3-0-%c2%b7-github","status":"publish","type":"post","link":"https:\/\/daiilynews.cu.ma\/?p=6259","title":{"rendered":"thefullnacho\/hestia: Local-first, self-hosted home &#038; records assistant: one LLM brain, scoped tools, durable memory. AGPL-3.0. \u00b7 GitHub"},"content":{"rendered":"<p> <br \/>\n<br \/>\nA local-first, self-hosted assistant for your home. One stateful &#8220;brain&#8221; runs a local LLM on<br \/>\nhardware you own, and every window into it \u2014 your phone, a terminal, the kitchen mic, Home<br \/>\nAssistant \u2014 talks to that same brain. Nothing runs in the cloud, nothing is exposed to the<br \/>\ninternet, and your data never leaves the house.<br \/>\nThe idea it&#8217;s built on. Most &#8220;AI for the home&#8221; points the model at the things it&#8217;s worst<br \/>\nat: remembering a schedule, watching a threshold, firing a reminder at the right minute. Hestia<br \/>\ndoes the opposite. Anything deterministic \u2014 a chore is due, the soil is dry, the trash goes out<br \/>\nTuesday \u2014 is handed to something dumb and reliable: a timer, a record, a row in a database. The<br \/>\nLLM is left to do the one thing it&#8217;s genuinely good at, which is judgment and conversation. The<br \/>\ngoal was never a smarter brain. It&#8217;s a more reliable one. (ARCHITECTURE.md is<br \/>\nthe long version; MEMORY-DESIGN.md covers the memory plan.)<br \/>\nWhat it actually is.<\/p>\n<p>A brain (brain\/) \u2014 an OpenAI-compatible endpoint (POST \/v1\/chat\/completions) wrapping a<br \/>\nlocal LLM (Ollama, qwen3:14b) with an agent loop. 
Every client speaks one dialect.<br \/>\nEight scoped tools \u2014 home (control Home Assistant), media (Plex + *arr), memory,<br \/>\nrecords, reminder, search, status, weather. There is deliberately no shell tool:<br \/>\nthe brain can act in your house but cannot run arbitrary commands.<br \/>\nMemory that grows \u2014 markdown soft-facts plus a SQLite record of the things in your life<br \/>\n(pets, garden, wildlife, chores), and a background note-taker that proposes durable facts for<br \/>\nyou to approve rather than writing them silently.<br \/>\nA media appliance \u2014 Plex + the *arr stack + Bazarr subtitles + qBittorrent behind a<br \/>\nfail-closed VPN kill-switch.<br \/>\nVoice \u2014 talk to it through Home Assistant&#8217;s Assist pipeline or the browser.<\/p>\n<p>What it isn&#8217;t. A cloud service, a wrapper around someone else&#8217;s API, or anything you should put<br \/>\non the public internet. It runs rootless on your own box and never phones home.<\/p>\n<p>\u26a0\ufe0f Read SECURITY.md before running it. The brain has no built-in<br \/>\nauthentication and can control your devices, so it must stay on a private network (Tailscale or<br \/>\nLAN). That&#8217;s a deliberate trade-off, not an oversight \u2014 the doc explains the trust model.<\/p>\n<p>Hestia is part of the Forager \/ Homesteader Labs constellation, alongside forager_ml,<br \/>\nforager-field-station, and the Homesteader Labs site.<\/p>\n<p>Phase 0 \u2014 Reach + brain \u2705 \u2014 talk to your home model from your phone (details below).<br \/>\nPhase 1 \u2014 Media appliance \u2705 \u2014 Plex + qBittorrent + gluetun VPN kill-switch (verified) + the<br \/>\n*arr automation layer (Prowlarr\/Sonarr\/Radarr + FlareSolverr + Bazarr subtitles). 
Full loop:<br \/>\nsearch \u2192 download (via VPN) \u2192 hardlink \u2192 Plex.<br \/>\nPhase 2 \u2014 House (Home Assistant) \u2705 \u2014 HA running; lights and devices reachable via the home tool.<br \/>\nPhase 3 \u2014 Voice \u2705 \u2014 speak to Hestia through HA&#8217;s Assist pipeline and a browser voice loop.<br \/>\nPhase 4 \u2014 The seam (memory + tools) \u2705 core in place, still growing \u2014 the brain is a<br \/>\ntool-calling agent with the eight tools above plus deterministic skill injection, and HA&#8217;s<br \/>\nconversation agent points at Hestia, so Assist and voice route through the brain (which can<br \/>\ncontrol HA back). It also gets smarter over time via the note-taker (see Memory &#038; learning).<br \/>\nNext: vision (Eyes).<\/p>\n<p>Phase 0 \u2014 Reach + brain<\/p>\n<p>Win: talk to your home model from your phone.<\/p>\n<p>The brain (brain\/) is a thin OpenAI-compatible proxy onto Ollama. Every client \u2014<br \/>\nterminal, phone, kitchen mic \u2014 speaks one dialect (POST \/v1\/chat\/completions).<br \/>\nIn Phase 0 it forces the chosen model, injects Hestia&#8217;s system prompt (persona +<br \/>\nthe hardened safety rules from the benchmark A\/B), and streams the reply back.<br \/>\nMemory and tools land in Phase 4 behind this same URL.<\/p>\n<table>\n<tr><th>Service<\/th><th>What<\/th><th>Bind<\/th><th>GPU<\/th><\/tr>\n<tr><td>hestia-ollama<\/td><td>Ollama inference engine<\/td><td>127.0.0.1:11434 (localhost only)<\/td><td>RTX 5080 only<\/td><\/tr>\n<tr><td>hestia-brain<\/td><td>Hestia \/v1 proxy<\/td><td>0.0.0.0:8730 (reachable over Tailscale)<\/td><td>\u2014<\/td><\/tr>\n<\/table>\n<p>Both are user systemd services (no root), defined in deploy\/systemd\/ and<br \/>\ninstalled into ~\/.config\/systemd\/user\/. Linger is enabled, so they survive<br \/>\nlogout\/reboot.
Ollama is pinned to the 5080 (CUDA_VISIBLE_DEVICES), leaving the<br \/>\n4060 Ti free for Phase 3 (Whisper\/Piper) per the benchmark verdict.<br \/>\nModel: qwen3:14b (resident, thinking off) \u2014 the current pick after the model eval<br \/>\n(brain\/eval_models.py; qwen2.5:14b kept on disk as a fallback). See MODEL_EVAL.md.<\/p>\n<p>Day to day, use deploy\/hestiactl (symlinked into ~\/.local\/bin) \u2014 one command<br \/>\nfor the whole estate, run from the GPU box:<br \/>\nhestiactl status              # brain health + local units + every container on hl-relay<br \/>\nhestiactl health              # raw \/health JSON<br \/>\nhestiactl up|down|restart X   # X: brain ollama | arr services | plex qbit ha adguard &#8230; | all<br \/>\nhestiactl logs X (-f)         # journalctl (local) or docker logs (remote)<br \/>\nhestiactl vpn                 # verify the qBittorrent kill-switch<br \/>\nall covers only the Hestia-managed pieces (local units + arr stack); core<br \/>\ncontainers (AdGuard = house DNS, gluetun, HA) are controlled one at a time and<br \/>\nask for confirmation before stopping.<br \/>\nThe underlying commands, for when you need them directly:<br \/>\n# status \/ logs<br \/>\nsystemctl --user status hestia-ollama hestia-brain<br \/>\njournalctl --user -u hestia-brain -f<\/p>\n<p># restart after editing brain code or a service file<br \/>\nsystemctl --user daemon-reload          # only if you edited a .service<br \/>\nsystemctl --user restart hestia-brain<\/p>\n<p># health (Ollama up + model present?)
\u2014 brain binds the Tailscale IP, not localhost<br \/>\ncurl -s 127.0.0.1:8730\/health | jq<\/p>\n<p># talk to it<br \/>\ncurl -s 127.0.0.1:8730\/v1\/chat\/completions -H &#039;content-type: application\/json&#039; \\<br \/>\n  -d &#039;{&quot;messages&quot;:[{&quot;role&quot;:&quot;user&quot;,&quot;content&quot;:&quot;hello Hestia&quot;}]}&#039; | jq -r &#039;.choices[0].message.content&#039;<br \/>\nIf you edit a deploy\/systemd\/*.service file, re-copy it into<br \/>\n~\/.config\/systemd\/user\/ before daemon-reload.<br \/>\nReach it from the phone (Tailscale)<br \/>\nTailscale is the one piece that needs root, so it isn&#8217;t auto-installed. On the GPU<br \/>\nbox:<br \/>\ncurl -fsSL https:\/\/tailscale.com\/install.sh | sh<br \/>\nsudo tailscale up<br \/>\nThen on the phone: install the Tailscale app, sign in to the same tailnet. The<br \/>\nbrain is then reachable at http:\/\/:8730\/v1 from any app<br \/>\nthat speaks OpenAI (set that as the base URL; any API key string works \u2014 Ollama<br \/>\nignores it). Nothing is exposed to the public internet.<\/p>\n<p>brain\/<br \/>\n  hestia.py       # the agent loop: \/v1\/chat\/completions + \/health, tools, memory, note-taker hook<br \/>\n  config.py       # single source of paths + secret loading; makes the brain relocatable<br \/>\n  prompt.py       # SYSTEM_PROMPT \u2014 persona + hardened safety rules<br \/>\n  records_store.py \/ memory_store.py   # SQLite entities+events \/ markdown soft facts<br \/>\n  note_taker.py   # background &#8220;gets smarter over time&#8221; extractor<br \/>\n  review_notes.py # CLI to review + promote the note-taker&#8217;s proposals<br \/>\n  tools\/          # home, media, memory, records, reminder, search, status, weather (+ skill router)<br \/>\n  tests\/          # pytest: stores, dispatch, note-taker (run: uv run --project brain pytest)<br \/>\n  pyproject.toml  # deps + dev (pytest) + pytest config (uv-managed, isolated venv)<\/p>\n<p>Relocatable.
Every path derives from config.py&#8217;s own location, so moving or<br \/>\nrestoring the repo to a new path needs no edits; HESTIA_ROOT overrides if needed. All<br \/>\nservice URLs, tokens, and thresholds stay env-overridable next to the tools that use them.<\/p>\n<p>Phase 1 \u2014 Media appliance (Dell Micro = hl-relay)<\/p>\n<p>Win: the media stack runs, independent of the brain.<\/p>\n<p>Most of this already existed on the Micro before Hestia: Plex (hl-plex),<br \/>\nqBittorrent behind gluetun (Surfshark, OpenVPN, NL) with a fail-closed VPN<br \/>\nkill-switch, plus AdGuard, MQTT, and Home Assistant. The kill-switch is verified:<br \/>\nqBittorrent&#8217;s traffic egresses via the VPN datacenter IP, not the host&#8217;s. Don&#8217;t<br \/>\ndocker compose up the existing \/opt\/home\/compose.yml blindly \u2014 its volume paths<br \/>\nare literal \/path\/to\/&#8230; host dirs that the running containers depend on.<br \/>\nHestia added the missing automation layer as a separate, isolated stack<br \/>\n(deploy\/media\/compose.yml, deployed to \/opt\/home\/arr\/): Prowlarr (:9696,<br \/>\nindexer manager), Sonarr (:8989, TV), Radarr (:7878, movies). All reachable<br \/>\nover Tailscale.<br \/>\nAlso added FlareSolverr (:8191) so Prowlarr can reach Cloudflare-protected<br \/>\nindexers, wired as a Prowlarr indexer-proxy (tag flaresolverr).<br \/>\nWired via API: root folders point at the existing Plex library<br \/>\n(\/data\/TV Shows, \/data\/Movies); a remote-path mapping (\/downloads \u2192<br \/>\n\/data\/downloads) lets Sonarr\/Radarr hardlink from qBittorrent&#8217;s downloads into<br \/>\nthe library (instant, no copy \u2014 both are one filesystem under \/mnt\/media); Prowlarr<br \/>\nis connected to Sonarr + Radarr (fullSync). Five reputable public indexers added<br \/>\n(The Pirate Bay, Knaben, LimeTorrents, plus 1337x + EZTV via FlareSolverr) and synced<br \/>\ndown to the apps. 
YTS deliberately excluded (history of feeding user data to copyright trolls).<br \/>\nqBittorrent is wired as the download client in both Sonarr (category tv-sonarr)<br \/>\nand Radarr (radarr), tested OK. The full loop works: search \u2192 download through the<br \/>\nVPN \u2192 hardlink into the Plex library. Both apps report no health warnings.<br \/>\n\u26a0\ufe0f Media currently lives on the Micro&#8217;s 98 GB root disk (~66 GB free). Fine to start;<br \/>\nplan a dedicated disk or NAS before the library grows.<\/p>\n<p>cd \/opt\/home\/arr<br \/>\ndocker compose ps<br \/>\ndocker compose pull &#038;&#038; docker compose up -d   # update *arr<\/p>\n<p>Phase 4 \u2014 the seam: HA conversation agent \u2192 Hestia<br \/>\ndeploy\/ha\/custom_components\/hestia\/ is a thin custom HA integration: it registers a<br \/>\nconversation agent (conversation.hestia) that forwards each utterance to Hestia&#8217;s<br \/>\n\/v1 and speaks the reply. Hestia owns the loop (memory + tools, incl. controlling<br \/>\nHA back); HA is just input + a tool. 
This is the architecture&#8217;s keystone made real.<br \/>\nWiring on hl-relay (not in this repo \u2014 lives in HA&#8217;s config):<\/p>\n<p>Integration files installed to \/opt\/home\/ha_config\/custom_components\/hestia\/.<br \/>\nA config entry points it at http:\/\/127.0.0.1:8730\/v1\/chat\/completions (Hestia<br \/>\nover Tailscale; the HA container can reach it).<br \/>\nThe preferred Assist pipeline&#8217;s conversation_engine is set to conversation.hestia,<br \/>\nso the Assist chat and voice satellites route through the brain.<\/p>\n<p>Verified: via HA&#8217;s conversation API, &#8220;turn on the TV light&#8221; drove the real light and<br \/>\n&#8220;what coffee should I buy?&#8221; recalled a memory \u2014 HA \u2192 Hestia \u2192 HA round trip.<br \/>\nMemory &#038; learning \u2014 it gets smarter over time<br \/>\nTwo stores back the brain: memory_store (markdown soft facts\/preferences, git-auditable)<br \/>\nand records_store (SQLite entities + a uniform event log: pets\/lineage, wildlife, chores,<br \/>\nservice reminders, the garden). Both are injected into the system prompt per request, scoped<br \/>\nto what the request implies.<br \/>\nThe brain also learns passively. After each exchange \u2014 once the answer is already on the<br \/>\nwire \u2014 a background note-taker (note_taker.py) reads the turn and proposes durable<br \/>\nfacts it heard (&#8220;trash pickup is Tuesday mornings&#8221;). True to &#8220;propose, don&#8217;t dispose&#8221;, those<br \/>\nland in a review inbox (memory\/inbox\/), not straight into live memory:<br \/>\nuv run --project brain python brain\/review_notes.py list<br \/>\nuv run --project brain python brain\/review_notes.py promote &lt;id&gt; | --all<br \/>\nuv run --project brain python brain\/review_notes.py discard &lt;id&gt; | --all<br \/>\nIt reuses the resident model by default and never blocks or breaks a request.
Tuning knobs:<br \/>\nHESTIA_NOTETAKER=0 disables it; HESTIA_NOTETAKER_AUTOWRITE=1 skips the review queue and<br \/>\nwrites durable memories directly; HESTIA_NOTETAKER_MODEL points it at a cheaper model (e.g.<br \/>\na second Ollama on the free 4060 Ti) to take the load off the brain.<\/p>\n<p>Hestia is licensed under the GNU Affero General Public License v3.0 \u2014 see LICENSE.<br \/>\nThe AGPL is deliberate: Hestia is built to be self-hosted, so the copyleft keeps it open even for<br \/>\nanyone who runs a modified version as a network service, while imposing nothing on you for running<br \/>\nit at home.<br \/>\nBefore running it, read SECURITY.md: the brain has no built-in authentication<br \/>\nand can control your Home Assistant devices, so it must stay on a private network (Tailscale\/LAN)<br \/>\nand must never be exposed to the public internet. It deliberately has no shell tool.<br \/>\n\u00a9 2026 TheFullNacho and contributors.<br \/>\n<br \/><br \/>\n<br \/><a href=\"https:\/\/github.com\/thefullnacho\/hestia\">Source link <\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>A local-first, self-hosted assistant for your home. One stateful &#8220;brain&#8221; runs a local LLM on hardware you own, and every window into it \u2014 your phone, a terminal, the kitchen mic, Home Assistant \u2014 talks to that same brain. 
Nothing runs in the cloud, nothing is exposed to the internet, and your data never leaves [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":6260,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[676],"tags":[],"class_list":["post-6259","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tech-ai"],"_links":{"self":[{"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/posts\/6259","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=6259"}],"version-history":[{"count":0,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/posts\/6259\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/media\/6260"}],"wp:attachment":[{"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=6259"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=6259"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=6259"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}