17 September 2025

Search Is Three Different Things

Try searching for a fact in your notes app. Not a keyword — a fact. Something like "what did Sarah say she prefers for onboarding new clients?" or "what was the pricing decision we made in March?"

A good keyword search will find every document where "Sarah" and "onboarding" appear on the same page. That's often four or five documents. You still have to read them all. The search worked, technically. It just didn't find what you were looking for.

This isn't a failure of search implementation. It's a category error.

Three primitives, usually sold as one

Full-text search finds where words appear. Type "SQLite" and it surfaces every document containing that string. Fast, deterministic, and brutally literal. If you wrote "database" instead, it finds nothing.
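That literalness is easy to see in practice. Here is a minimal sketch using SQLite's FTS5 extension (the table and documents are made up for illustration):

```python
import sqlite3

# Build a tiny full-text index. FTS5 ships with most SQLite builds.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE notes USING fts5(title, body)")
conn.executemany(
    "INSERT INTO notes VALUES (?, ?)",
    [
        ("Storage notes", "We decided to keep everything in SQLite."),
        ("Meeting recap", "Discussed moving to Postgres for analytics."),
    ],
)

# The exact string matches...
hits = conn.execute(
    "SELECT title FROM notes WHERE notes MATCH ?", ("SQLite",)
).fetchall()

# ...but a synonym matches nothing. FTS compares strings, not meaning.
misses = conn.execute(
    "SELECT title FROM notes WHERE notes MATCH ?", ("database",)
).fetchall()
```

The query for "SQLite" returns the first note; the query for "database" returns an empty list, even though a human would call both notes about databases.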

Semantic search — the vector kind — finds meaning. It converts your query and your documents into high-dimensional vectors, then measures distance between them. "Database" and "SQLite" land close to each other in that space. So do "meeting" and "discussion," "decision" and "conclusion." It finds things you'd recognize as relevant even when the exact words don't match. It's slower and fuzzier, but the fuzziness is the point.
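The "distance" here is usually cosine similarity between embedding vectors. Real systems use learned embeddings with hundreds of dimensions; the three-dimensional vectors below are invented purely to show the mechanic:

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors, hand-picked for illustration — not real embeddings.
vectors = {
    "database": [0.9, 0.1, 0.2],
    "SQLite":   [0.8, 0.2, 0.3],
    "picnic":   [0.1, 0.9, 0.1],
}

sim_close = cosine_similarity(vectors["database"], vectors["SQLite"])
sim_far = cosine_similarity(vectors["database"], vectors["picnic"])
```

Related terms land close together (`sim_close` is near 1.0), unrelated ones far apart — which is exactly why "database" can retrieve a note that only ever says "SQLite."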

Structured queries find facts. Not meaning — specific facts stored as typed fields. "People where name = Sarah AND preference.topic = onboarding." This doesn't search documents at all. It queries a database.
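In SQL terms, that example query looks something like the sketch below. The schema is hypothetical — the point is that the answer comes from typed rows, not from scanning text:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Illustrative schema: people and their typed preferences.
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE preferences (person_id INTEGER, topic TEXT, value TEXT);
    INSERT INTO people VALUES (1, 'Sarah');
    INSERT INTO preferences VALUES (1, 'onboarding', 'a 30-minute intro call');
""")

# "People where name = Sarah AND preference.topic = onboarding"
row = conn.execute("""
    SELECT p.name, pr.value
    FROM people p
    JOIN preferences pr ON pr.person_id = p.id
    WHERE p.name = 'Sarah' AND pr.topic = 'onboarding'
""").fetchone()
```

The result is the fact itself, not a list of documents that might contain it.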

Most tools give you one of these. A few give you two. Almost none give you all three — and almost none explain which one you're using when you press the search button.

The failure mode of each is specific and predictable

Vannevar Bush described the core problem in 1945, in an essay called "As We May Think." The memex he imagined — a microfilm desk you could annotate and cross-reference — was designed around associative retrieval, not indexing. He watched scientists drowning in literature they couldn't connect. "The human mind," he wrote, "does not work that way. It operates by association."

He was describing semantic search eighty years before the technology existed.

But association doesn't help when you need precision. If you want the exact section number of a contract clause, associated meanings are noise. You want a string match. And if you want every task assigned to a specific project, you don't want either — you want a structured query over a properly relational table.

Research on vector search shows the degradation is real: accuracy drops sharply as the number of entities in a query increases. Ask about a single concept and it works well. Ask about relationships between five specific things and you start getting plausible-sounding hallucinations. Full-text search is precise but brittle — change one word, miss the document entirely. Structured search is perfect for what it covers, but it only covers what you thought to model explicitly.

BM25, the algorithm underlying most full-text search today, dates to work done in the 1970s and was formally described by Stephen Robertson and Karen Spärck Jones in the mid-1990s. It predates the web. Vector embeddings, by contrast, only became tractable for semantic retrieval after the transformer architecture appeared in 2017. These aren't just different versions of the same idea — they were developed for entirely different problems, by different communities, decades apart.
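For the curious, the BM25 formula itself is compact. This is a bare-bones rendering of the standard scoring function with the usual free parameters k1 and b at their common defaults — a sketch, not a production ranker:

```python
import math
from collections import Counter

def bm25_score(query_terms, doc, corpus, k1=1.5, b=0.75):
    """Score one tokenized doc against query terms, per standard BM25."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    tf = Counter(doc)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)  # docs containing the term
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)
        f = tf[term]  # term frequency in this doc
        score += idf * f * (k1 + 1) / (
            f + k1 * (1 - b + b * len(doc) / avgdl)
        )
    return score

docs = [
    "sqlite stores the notes".split(),
    "the meeting covered pricing".split(),
]
hit = bm25_score(["sqlite"], docs[0], docs)
miss = bm25_score(["sqlite"], docs[1], docs)
```

Note the brittleness the article describes is visible in the math: a term that never appears contributes exactly zero, no matter how related it is.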

What changes when you have all three

When you search a personal knowledge base, you're never actually doing just one of these things. "What did we decide about pricing for enterprise customers?" needs semantic search to find the relevant documents, full-text search to surface the specific phrasing you used, and probably a structured query to surface the linked task or project.

The tools that call their search bar "AI-powered" have usually picked one approach and pointed it at your documents. The interesting systems combine all three at query time, fuse the results, and surface ranked excerpts with their origins visible.

Harbor's retrieval layer works this way: SQLite FTS for exact matches, vector embeddings for semantic proximity, SQL queries over structured entities. When the AI calls knowledge.search, it isn't just running a keyword query. The results from three different retrieval paths get merged and ranked together. Ask what someone prefers and the answer comes from a structured query. Ask what you were thinking before a decision and the answer comes from semantic retrieval. Search for a specific phrase and FTS gets you there in milliseconds.
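One standard way to merge ranked lists from different retrievers is reciprocal rank fusion (RRF). Whether Harbor uses RRF specifically is my assumption, not something the architecture above states — but it illustrates how three result lists can become one ranking:

```python
from collections import defaultdict

def rrf(ranked_lists, k=60):
    """Reciprocal rank fusion: docs ranked high in any list float up."""
    scores = defaultdict(float)
    for results in ranked_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results from the three retrieval paths.
fts_results = ["doc3", "doc1"]
semantic_results = ["doc1", "doc2"]
structured_results = ["doc1"]

fused = rrf([fts_results, semantic_results, structured_results])
```

A document that appears near the top of several lists — here `doc1` — wins, even if no single retriever ranked it first everywhere.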

The complexity is in the plumbing, not in the experience. Or it should be. What's visible is whether you get the right answer when you ask a question whose answer you know is somewhere in your notes.

Most tools make you think about which layer you're searching. The better ones make it irrelevant.


Asgeir Albretsen is the founder of Harbor.