
How NOUZ Works

NOUZ reads YAML frontmatter, builds a directed acyclic graph (DAG), classifies content against etalons (the reference texts that define each domain), and proposes links between branches. You define the structure and make the decisions; the AI helps compute, compare, and notice weak spots in the graph.

Entity Formula

Every node can be shown as a compact formula:

(children)[node]{parents}

( ) — children. Signs are aggregated; a number prefix appears only when count is greater than one.

[ ] — the node itself: its sign or artifact_sign.

{ } — parents. The formula stays compact even as plain text.

(2E)[E]{S}      — two E children, self E, parent S
(σE)[σE]{E}     — quant with artifact_sign σ and domain sign E
[β]             — artifact without parents or children

format_entity_compact returns this formula for any note. It is not a separate classification mechanism; it is a visual coordinate: what sits below the node, what sign the node has, and what structure it belongs to above.
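The aggregation rule can be sketched in a few lines of Python. This is not the source of format_entity_compact: the name compact_formula, its arguments, and the comma placed between distinct signs are assumptions made only for illustration.

```python
from collections import Counter


def compact_formula(node_sign, child_signs=(), parent_signs=()):
    """Build a (children)[node]{parents} string (hypothetical helper)."""
    # Count identical child signs; show the number only when it exceeds one.
    counts = Counter(child_signs)
    children = ",".join(
        f"{n}{sign}" if n > 1 else sign for sign, n in counts.items()
    )
    # Assumption: parents are listed as they are, without counting.
    parents = ",".join(parent_signs)

    formula = f"[{node_sign}]"
    if children:
        formula = f"({children})" + formula
    if parents:
        formula += "{" + parents + "}"
    return formula


print(compact_formula("E", ["E", "E"], ["S"]))  # (2E)[E]{S}
print(compact_formula("σE", ["σE"], ["E"]))     # (σE)[σE]{E}
print(compact_formula("β"))                     # [β]
```

Empty groups are omitted entirely, which matches the [β] example above.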

Graph Context

NOUZ builds the graph top-down: from domains to artifacts. The graph defines explicit structure, while the semantic layer adds computed signals: domain, bridges, core_mix, and drift.

Through MCP, an agent can explicitly request a note's place in the graph:

  • parents and children;
  • level and sign;
  • compact formula (children)[node]{parents};
  • core_mix, when PRIZMA or SLOI is enabled.

That lets the agent work with a note as a node in the knowledge graph. Semantic calculations are separate: text is compared with etalons, bridges are found through embeddings, and core_mix shows how quant content changes the picture bottom-up.

Level 0 (meta_root)

Anchor node for the whole base. Set in config.yaml as meta_root: "My Knowledge Base". L1 cores can reference it, while the node itself is excluded from semantic calculations. In graph visualization, it is the center that holds domains in one system without affecting their content classification.

artifact_sign

A sign determined by content heuristics from the kind of material (for example logs, chats, or configs). At L5 it separates material type from topic. At L4 it becomes part of a composite sign (artifact + core).

Sign and artifact_sign

NOUZ has two sign layers:

Level       | How It Is Determined
L1-L3       | Domain sign from etalons, unless manually set
L4 Quant    | Composite sign: artifact_sign from linked artifacts + content domain sign
L5 Artifact | artifact_sign from content-structure heuristic, no domain sign

Manual markup has priority. If sign is already set in YAML, the server does not overwrite it as truth, but it can still compute sign_auto for comparison.
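A minimal sketch of that priority rule, assuming the result is kept as the sign, sign_auto, and sign_source fields. The function name resolve_sign and the values "manual" and "auto" are hypothetical; the actual values stored by the server are not documented here.

```python
def resolve_sign(frontmatter_sign, computed_sign):
    """Pick the effective sign; manual markup always wins (illustrative)."""
    if frontmatter_sign:
        # The YAML value is kept as-is; the computed one is only for comparison.
        return {
            "sign": frontmatter_sign,
            "sign_auto": computed_sign,
            "sign_source": "manual",  # assumed label
        }
    return {
        "sign": computed_sign,
        "sign_auto": computed_sign,
        "sign_source": "auto",  # assumed label
    }


manual = resolve_sign("E", "S")
print(manual["sign"], manual["sign_auto"])  # E S
```

A mismatch between sign and sign_auto is then something to review, not something the server resolves on its own.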

artifact_sign describes the material type: note, concept, reference, log, news, hypothesis, specification. For L4 it can be stored in YAML as part of a composite sign; for L5 it is stored in the database and displayed as the artifact sign.

Etalon Classification

Domains are defined in config.yaml as an etalons list. calibrate_cores turns those texts into reference vectors and stores them in SQLite.

During classification the server:

  1. Takes the content embedding after stripping HTML formulas.
  2. Compares it with mean-centered etalons.
  3. Computes spread: max_score - min_score.
  4. If spread < sign_spread (0.05), the difference between domains is too weak, so the server does not choose a domain.
  5. Otherwise converts scores to percentages:
       adjusted = score - min_score
       percent = adjusted / sum(adjusted) * 100

Every domain with percentage ≥ pattern_second_sign_threshold (30.0) enters the composite sign. The dominant domain is confident when its percentage is ≥ confident_spread (60.0).
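The scoring steps can be sketched as follows. This is an illustration under stated assumptions, not the server's code: the function name classify, the argument names, and the shape of the returned dictionary are hypothetical, and the input is assumed to be one similarity score per domain, already computed against mean-centered etalons. Only the thresholds and the arithmetic come from the description above.

```python
def classify(scores, sign_spread=0.05, second_sign_threshold=30.0,
             confident_spread=60.0):
    """scores: mapping of domain -> similarity score (hypothetical input)."""
    min_score = min(scores.values())
    max_score = max(scores.values())
    if max_score - min_score < sign_spread:
        return None  # domains are too close to each other: no domain chosen

    # Shift scores so the weakest domain is zero, then normalise to percent.
    adjusted = {d: s - min_score for d, s in scores.items()}
    total = sum(adjusted.values())
    percents = {d: a / total * 100 for d, a in adjusted.items()}

    ranked = sorted(percents, key=percents.get, reverse=True)
    composite = [d for d in ranked if percents[d] >= second_sign_threshold]
    return {
        "percents": percents,
        "sign": composite,
        "confident": percents[ranked[0]] >= confident_spread,
    }


result = classify({"E": 0.30, "S": 0.20, "A": 0.10})
print(result["sign"], result["confident"])
print(classify({"E": 0.31, "S": 0.30, "A": 0.29}))
```

In the first call the spread is 0.20, so E receives about 66.7% and S about 33.3%: both enter the composite sign and E counts as confident. In the second call the spread is only 0.02, so no domain is chosen.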

Mean-Centering vs Anisotropy

Transformer embeddings have an awkward property: many texts look "somewhat similar" to each other even when their domains are different. Raw cosine can therefore be misleading.

The server handles this through _mean_center: before comparison, it subtracts the shared mean vector from the etalons. After that, NOUZ does not trust a single raw cosine. It looks at the gap between domains: how strongly one etalon wins over the others. That is why spread, percentages, and thresholds matter more than one similarity number.
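A toy illustration of the effect, using plain Python lists in place of real embeddings. The vectors are invented, the helper names are hypothetical, and the sketch assumes the same mean is also subtracted from the query vector, which the description above does not state explicitly.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_center(vectors):
    """Subtract the component-wise mean of all vectors from each vector."""
    dim = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return [[x - m for x, m in zip(v, mean)] for v in vectors], mean


# The large shared last component plays the role of anisotropy.
etalons = {
    "E": [1.0, 0.0, 0.0, 3.0],
    "S": [0.0, 1.0, 0.0, 3.0],
    "A": [0.0, 0.0, 1.0, 3.0],
}
query = [0.8, 0.2, 0.1, 3.0]

raw = {d: cosine(query, v) for d, v in etalons.items()}

centered_vectors, mean = mean_center(list(etalons.values()))
query_centered = [x - m for x, m in zip(query, mean)]  # assumption, see above
centered = {
    d: cosine(query_centered, v) for d, v in zip(etalons, centered_vectors)
}

raw_spread = max(raw.values()) - min(raw.values())
centered_spread = max(centered.values()) - min(centered.values())
print(f"raw spread: {raw_spread:.3f}, centered spread: {centered_spread:.3f}")
```

Before centering, all three cosines sit above 0.9 and differ only slightly. After centering, the shared component disappears and the gap between the winning etalon and the others becomes much wider.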

core_mix and Drift

sign is intent: how a node is marked or classified. core_mix is the actual domain profile aggregated from lower-level content.

↓ sign: set manually or computed for one node
↑ core_mix: L4 → L3 → L2, aggregated bottom-up

If a module's sign says "Engineering" while core_mix increasingly points to "Systems Analysis", that is a drift signal: the module's content has moved away from its original frame.
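A sketch of how such a signal could be computed. Both rules here are assumptions chosen for illustration: the parent profile is taken as a plain average of child profiles, and drift is flagged when the strongest domain in core_mix differs from the sign. The function names are hypothetical, and the actual aggregation and drift criteria in NOUZ may differ.

```python
def aggregate_core_mix(child_mixes):
    """Average child profiles (domain -> percent) into a parent profile."""
    if not child_mixes:
        return {}
    domains = {d for mix in child_mixes for d in mix}
    n = len(child_mixes)
    return {d: sum(mix.get(d, 0.0) for mix in child_mixes) / n for d in domains}


def has_drifted(sign, core_mix):
    """True when the dominant domain in core_mix is not the node's sign."""
    if not core_mix:
        return False
    dominant = max(core_mix, key=core_mix.get)
    return dominant != sign


# Three L4 quants under one module that is marked "Engineering".
quants = [
    {"Engineering": 30.0, "Systems Analysis": 70.0},
    {"Engineering": 40.0, "Systems Analysis": 60.0},
    {"Engineering": 60.0, "Systems Analysis": 40.0},
]
module_mix = aggregate_core_mix(quants)
print(has_drifted("Engineering", module_mix))
```

Here the averaged profile leans toward "Systems Analysis", so the module marked "Engineering" is reported as drifted.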

Type      | Who Creates It | Meaning
hierarchy | User           | Main structural link
temporary | User or AI     | Temporary link for material not yet settled in the graph
semantic  | AI proposes    | Texts from different domains share meaning
tag       | AI proposes    | Similarity between tags or short concepts
analogy   | AI proposes    | Similar graph role across different domains
error     | Server         | Strict hierarchy violation in SLOI

The formula displays only hierarchy, semantic, and temporary links to stay readable. Other link types remain available in MCP data and in the index.

Bridges

Semantic bridges compare a whole note against notes from other domains. Default threshold: semantic_bridge_threshold = 0.55.

Tag bridges compare tags and short concepts. They can reveal shared concepts even when full texts are different.

Analogy bridges look for structural similarity: similar core_mix, level, degree, and tag overlap. Default threshold: structural_bridge_threshold = 0.55.

All bridges are returned as proposals (proposed: true). The server shows candidates; you decide what becomes a link.
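As an illustration, semantic bridge proposals could be produced along these lines. The sketch assumes the threshold is applied to cosine similarity between whole-note embeddings; the function name, the note dictionary keys, and every proposal field except proposed are hypothetical.

```python
import math
from itertools import combinations


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def propose_semantic_bridges(notes, threshold=0.55):
    """notes: list of dicts with 'id', 'domain', and 'embedding' keys."""
    proposals = []
    for a, b in combinations(notes, 2):
        if a["domain"] == b["domain"]:
            continue  # bridges connect notes from different domains only
        score = cosine(a["embedding"], b["embedding"])
        if score >= threshold:
            proposals.append({
                "source": a["id"],
                "target": b["id"],
                "type": "semantic",
                "score": score,
                "proposed": True,  # a candidate, never an automatic link
            })
    return sorted(proposals, key=lambda p: p["score"], reverse=True)


notes = [
    {"id": "n1", "domain": "E", "embedding": [1.0, 0.0]},
    {"id": "n2", "domain": "S", "embedding": [0.8, 0.6]},
    {"id": "n3", "domain": "S", "embedding": [0.0, 1.0]},
]
for proposal in propose_semantic_bridges(notes):
    print(proposal["source"], proposal["target"], round(proposal["score"], 2))
```

Only the n1 and n2 pair clears the threshold. n2 and n3 share a domain and are skipped, and n1 and n3 are orthogonal.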

suggest_parents, process_orphans, and add_entity can propose a place for a note without parents.

The server compares the note text with possible parents. Candidates below parent_link_threshold (0.55) are discarded. If several candidates are close, the parent from the same domain gets priority.
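The selection rule can be sketched like this. The function name suggest_parent, the tuple layout, and the close_margin value that decides when candidates count as "close" are assumptions; only the 0.55 threshold and the same-domain priority come from the text above.

```python
def suggest_parent(candidates, note_domain, threshold=0.55, close_margin=0.03):
    """candidates: list of (parent_id, parent_domain, similarity) tuples."""
    kept = [c for c in candidates if c[2] >= threshold]
    if not kept:
        return None  # nothing similar enough: the note stays an orphan

    best_score = max(c[2] for c in kept)
    # Candidates within close_margin of the best score are treated as close.
    close = [c for c in kept if best_score - c[2] <= close_margin]
    same_domain = [c for c in close if c[1] == note_domain]
    pool = same_domain or close
    return max(pool, key=lambda c: c[2])[0]


candidates = [
    ("module-a", "S", 0.71),
    ("module-b", "E", 0.69),
    ("module-c", "E", 0.40),
]
print(suggest_parent(candidates, note_domain="E"))  # module-b
print(suggest_parent(candidates, note_domain="A"))  # module-a
```

For a note in domain E, module-b wins despite its slightly lower score because it is close to the best candidate and shares the domain. module-c never competes because it is below the threshold.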

Pipeline

Note

Markdown with YAML: type, level, sign, parents, tags. Files without YAML are indexed too, but need markup.

Index

The server stores metadata, content, links, and embeddings in SQLite.

Classification

Content is compared with etalons; L5 receives artifact_sign, L4 can receive a composite sign.

Aggregation

L4 gets a profile from text classification; L3 and L2 aggregate child nodes bottom-up.

Proposals

Bridges, parents, tags, and hierarchy errors are returned as candidates for your decision.

Database

NOUZ stores its index in SQLite (obsidian_kb.db in the vault root):

  • file metadata;
  • graph links;
  • embeddings;
  • etalon reference vectors;
  • core_mix, sign_auto, sign_source, artifact_sign.

Notes and the database stay local. If you use a cloud embedding provider, only texts requested for embeddings leave your machine.

SQLite does not require a separate database server: Python includes sqlite3, and NOUZ uses aiosqlite as an async wrapper. The local index file is created when the vault is indexed.
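To show why no extra infrastructure is needed, here is a small sketch that stores and restores one reference vector with the standard sqlite3 module. The table name, column names, and float32 BLOB encoding are assumptions for illustration and do not describe the actual NOUZ schema; the example also uses an in-memory database and the synchronous API instead of aiosqlite.

```python
import sqlite3
import struct


def pack_vector(vec):
    """Encode a list of floats as little-endian float32 bytes."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack_vector(blob):
    """Decode bytes produced by pack_vector back into a list of floats."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


conn = sqlite3.connect(":memory:")  # a real index would live in a file
conn.execute(
    "CREATE TABLE etalons (domain TEXT PRIMARY KEY, vector BLOB NOT NULL)"
)
conn.execute(
    "INSERT INTO etalons (domain, vector) VALUES (?, ?)",
    ("E", pack_vector([0.5, 0.25, -1.0])),
)
conn.commit()

row = conn.execute(
    "SELECT vector FROM etalons WHERE domain = ?", ("E",)
).fetchone()
restored = unpack_vector(row[0])
conn.close()
print(restored)  # [0.5, 0.25, -1.0]
```

The values in this example are exactly representable in float32, so they round-trip unchanged; arbitrary embedding values would lose some precision with this encoding.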

{Semiotronika}