The Phantom | Eric Becker

What It Is

Local AI language models are extraordinary reasoning engines. They cannot search the web, remember last Tuesday, fetch a box score, or save a file. They are a genius locked in a room with no windows and no doors — answering questions slipped under the door on pieces of paper.

The Phantom opens the windows. It is a Python wrapper around Ollama that intercepts every conversation turn, performs whatever research or data retrieval is needed before the model sees the prompt, injects the results as context, and handles all post-processing after the model responds.

"The model does one thing: it reasons. The Phantom does everything else."

What It Does — Before the Model Sees Your Prompt

The Phantom performs all of the following before passing a prompt to the local model:

Web Intelligence

Scrapes full article content (not snippets) via Trafilatura with BeautifulSoup fallback. Attempts paywall bypass via 12ft.io cascade. DuckDuckGo search returning full article text.

Live Data APIs

MLB Stats API integration (no API key required) for live scores and full season statistics with correct at-bat vs plate appearance distinction. RSS feed reading from configured sources. YouTube transcript fetching without API key.

Semantic Memory

ChromaDB vector database running locally. Every conversation turn stored as a mathematical vector. Before each turn, the five most semantically similar past turns retrieved — not by keyword, but by meaning. How human memory actually works.

The Result

The model reads web content, live data, and relevant past conversations as if it simply knew these things. It did not simply know them. The Phantom knew where to look.

Architecture

Tier 1 — Core Agent Engine

Connects to any local Ollama model at localhost:11434. Maintains conversation history in JSON. Multi-project isolation with separate history, library, and workspace per project. Identity file system for personality and context definition.

Tier 2 — Web Intelligence

Two-stage detection: keyword triggers plus LLM self-query ("do I need the web for this?"). Catches queries that keyword matching would miss — "How has Miguel Rojas performed this season?" has no trigger keyword but the model correctly identifies the need.

Tier 3 — Background Scout

phantom_scout.py runs independently on a schedule, fetches fresh data, asks a local model to write content, renders self-refreshing HTML. Powers phantom.fluidfortune.com — live, AI-written content with no cloud dependency and no ongoing cost.

Tier 4 — Vector Memory

ChromaDB entirely local. Retrieves by semantic similarity, not keyword match. Auto-distillation keeps 20 most recent turns verbatim, summarizes older turns into memory snapshots. Context window stays sharp indefinitely.

The Phantom as Ecosystem Core

The Phantom was originally conceived as a standalone AI assistant. It became Tier 0 of the Pisces Moon ecosystem — the data preparation layer the whole platform was always waiting for. Wardrive CSVs from the T-Deck Plus feed directly into The Phantom's analysis pipeline. The baseball app on the Linux tablet queries The Phantom's MLB endpoint instead of cloud AI. The data formats are the same by design.

Technical Stack

PythonOllamaChromaDBFastAPITrafilaturaBeautifulSoupDuckDuckGo SearchTailscaleDeepSeekGemmaLlamaQwen