How NOUZ Works
NOUZ reads YAML frontmatter, builds a DAG, classifies content through etalons, and proposes links between branches. You define the structure and make the decisions; AI helps compute, compare, and notice weak spots in the graph.
Entity Formula
Every node can be shown as a compact formula:
(children)[node]{parents}
- ( ): children. Signs are aggregated; a number prefix appears only when the count is greater than one.
- [ ]: the node itself, shown as its sign or artifact_sign.
- { }: parents.

The formula stays compact even as plain text.
- (2E)[E]{S}: two E children, self E, parent S
- (σE)[σE]{E}: quant with artifact_sign σ and domain sign E
- [β]: artifact without parents or children

format_entity_compact returns this formula for any note. It is not a separate classification mechanism; it is a visual coordinate: what sits below the node, what sign the node has, and what structure it belongs to above.
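As an illustration of the notation only, here is a minimal Python sketch that reproduces the three examples above. It is not the server's format_entity_compact: the function name compact_formula, its arguments, the first-seen ordering of signs, and the choice to apply the same count prefix to parents are all assumptions.

```python
from collections import Counter


def compact_formula(node_sign, child_signs=(), parent_signs=()):
    """Hypothetical helper: render (children)[node]{parents} as plain text."""

    def group(signs):
        # Aggregate equal signs; prefix a count only when it is greater than one.
        # Counter keeps first-seen order (an assumption about ordering).
        counts = Counter(signs)
        return "".join(f"{n if n > 1 else ''}{sign}" for sign, n in counts.items())

    children = group(child_signs)
    parents = group(parent_signs)

    formula = ""
    if children:                      # empty groups are omitted, as in [β]
        formula += f"({children})"
    formula += f"[{node_sign}]"
    if parents:
        formula += f"{{{parents}}}"
    return formula


print(compact_formula("E", ["E", "E"], ["S"]))
print(compact_formula("σE", ["σE"], ["E"]))
print(compact_formula("β"))
```

The three calls correspond to the examples listed above: two E children under parent S, a quant with a composite sign, and an artifact with no links.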
Graph Context
NOUZ builds the graph top-down: from domains to artifacts. The graph defines explicit structure, while the semantic layer adds computed signals: domain, bridges, core_mix, and drift.
Through MCP, an agent can explicitly request a note's place in the graph:
- parents and children;
- level and sign;
- compact formula (children)[node]{parents};
- core_mix, when PRIZMA or SLOI is enabled.
That lets the agent work with a note as a node in the knowledge graph. Semantic calculations are separate: text is compared with etalons, bridges are found through embeddings, and core_mix shows how quant content changes the picture bottom-up.
Level 0 (meta_root)
Anchor node for the whole base. Set in config.yaml as meta_root: "My Knowledge Base". L1 cores can reference it, while the node itself is excluded from semantic calculations. In graph visualization, it is the center that holds domains in one system without affecting their content classification.
artifact_sign
Sign determined by content heuristics (logs, chats, configs). Used for L5 to separate material type from topic. For L4 it becomes part of a composite sign (artifact + core).
Sign and artifact_sign
NOUZ has two sign layers:
| Level | How It Is Determined |
|---|---|
| L1-L3 | Domain sign from etalons, unless manually set |
| L4 Quant | Composite sign: artifact_sign from linked artifacts + content domain sign |
| L5 Artifact | artifact_sign from content-structure heuristic, no domain sign |
Manual markup has priority. If sign is already set in YAML, the server keeps it and does not overwrite it; it can still compute sign_auto for comparison.
artifact_sign describes the material type: note, concept, reference, log, news, hypothesis, specification. For L4 it can be stored in YAML as part of a composite sign; for L5 it is stored in the database and displayed as the artifact sign.
Etalon Classification
Domains are defined in config.yaml as an etalons list. calibrate_cores turns those texts into reference vectors and stores them in SQLite.
During classification the server:
- Takes the content embedding after stripping HTML formulas.
- Compares it with mean-centered etalons.
- Computes spread: max_score - min_score.
- If spread < sign_spread (0.05), the difference between domains is too weak, so the server does not choose a domain.
- Otherwise converts scores to percentages:

    adjusted = score - min_score
    percent = adjusted / sum(adjusted) * 100

Every domain with percentage ≥ pattern_second_sign_threshold (30.0) enters the composite sign. The dominant domain is confident when its percentage is ≥ confident_spread (60.0).
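The arithmetic above can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the server's code: the function name classify_scores, the input shape (a dict of domain to similarity score), and the returned dictionary are hypothetical; only the thresholds and formulas come from the text.

```python
def classify_scores(scores, sign_spread=0.05,
                    pattern_second_sign_threshold=30.0,
                    confident_spread=60.0):
    """Hypothetical helper: turn per-domain scores into a composite sign."""
    min_score = min(scores.values())
    max_score = max(scores.values())
    spread = max_score - min_score

    # Too little separation between domains: do not choose a domain.
    if spread < sign_spread or spread == 0:
        return None

    adjusted = {domain: s - min_score for domain, s in scores.items()}
    total = sum(adjusted.values())
    percent = {domain: a / total * 100 for domain, a in adjusted.items()}

    # Domains at or above the second-sign threshold, strongest first.
    ranked = sorted(percent, key=percent.get, reverse=True)
    signs = [d for d in ranked if percent[d] >= pattern_second_sign_threshold]

    dominant = ranked[0]
    return {
        "percent": percent,
        "signs": signs,
        "dominant": dominant,
        "confident": percent[dominant] >= confident_spread,
    }


result = classify_scores({"E": 0.30, "S": 0.20, "H": 0.10})
print(result["signs"], result["confident"])
```

With scores 0.30, 0.20, and 0.10 the adjusted values are 0.20, 0.10, and 0, so E holds about 67% and S about 33%: both enter the composite sign, and E is confident.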
Mean-Centering vs Anisotropy
Transformer embeddings have an awkward property: many texts look "somewhat similar" to each other even when their domains are different. Raw cosine can therefore be misleading.
The server handles this through _mean_center: before comparison, it subtracts the shared mean vector from the etalons. After that, NOUZ does not trust a single raw cosine. It looks at the gap between domains: how strongly one etalon wins over the others. That is why spread, percentages, and thresholds matter more than one similarity number.
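To see why centering helps, consider a toy Python sketch with two-dimensional vectors. It does not reproduce _mean_center itself: the helper names, the toy vectors, and the choice to subtract the same mean from the content vector as well as from the etalons are assumptions made for illustration.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mean_center(vectors):
    """Hypothetical helper: subtract the shared mean from every vector."""
    dim = len(next(iter(vectors.values())))
    mean = [sum(v[i] for v in vectors.values()) / len(vectors) for i in range(dim)]
    centered = {k: [x - m for x, m in zip(v, mean)] for k, v in vectors.items()}
    return centered, mean


# Toy etalons that point in almost the same direction (anisotropy).
etalons = {"E": [1.0, 0.9], "S": [0.9, 1.0]}
content = [1.0, 0.85]

raw = {d: cosine(content, v) for d, v in etalons.items()}
raw_gap = raw["E"] - raw["S"]

centered, mean = mean_center(etalons)
content_centered = [x - m for x, m in zip(content, mean)]  # assumption: same shift
scores = {d: cosine(content_centered, v) for d, v in centered.items()}
centered_gap = scores["E"] - scores["S"]

print(f"raw gap: {raw_gap:.4f}, centered gap: {centered_gap:.4f}")
```

In this toy case both raw cosines are above 0.99 and differ by less than 0.01, while after centering the two scores land on opposite sides of zero, which is the kind of gap the spread and percentage thresholds rely on.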
core_mix and Drift
sign is intent: how a node is marked or classified. core_mix is the actual domain profile aggregated from lower-level content.
↑ core_mix: L4 → L3 → L2, aggregated bottom-up
If a module's sign says "Engineering" while core_mix increasingly points to "Systems Analysis", that is a drift signal: the module's content has moved away from its original frame.
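A drift check of this kind can be sketched as follows. The text does not specify the aggregation rule, so the equal-weight average of child profiles, the function names, and the rule "drift when the dominant core_mix domain differs from the sign" are assumptions for illustration.

```python
def aggregate_core_mix(child_mixes):
    """Hypothetical helper: average child domain profiles into percentages."""
    totals = {}
    for mix in child_mixes:
        for domain, share in mix.items():
            totals[domain] = totals.get(domain, 0.0) + share
    grand_total = sum(totals.values())
    if not grand_total:
        return {}
    return {domain: value / grand_total * 100 for domain, value in totals.items()}


def has_drift(sign, core_mix):
    """Assumed rule: drift when the dominant core_mix domain differs from sign."""
    if not core_mix:
        return False
    dominant = max(core_mix, key=core_mix.get)
    return dominant != sign


# Three quants under one module that is marked "Engineering".
quants = [
    {"Engineering": 80, "Systems Analysis": 20},
    {"Engineering": 20, "Systems Analysis": 80},
    {"Systems Analysis": 100},
]
module_mix = aggregate_core_mix(quants)
print(module_mix, has_drift("Engineering", module_mix))
```

Here the module's children add up to roughly one third Engineering and two thirds Systems Analysis, so the sketch flags the module as drifting.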
Link Types
| Type | Who Creates It | Meaning |
|---|---|---|
| hierarchy | User | Main structural link |
| temporary | User or AI | Temporary link for material not yet settled in the graph |
| semantic | AI proposes | Texts from different domains share meaning |
| tag | AI proposes | Similarity between tags or short concepts |
| analogy | AI proposes | Similar graph role across different domains |
| error | Server | Strict hierarchy violation in SLOI |
The formula displays only hierarchy, semantic, and temporary links to stay readable. Other link types remain available in MCP data and in the index.
Bridges
Semantic bridges compare a whole note against notes from other domains. Default threshold: semantic_bridge_threshold = 0.55.
Tag bridges compare tags and short concepts. They can reveal shared concepts even when full texts are different.
Analogy bridges look for structural similarity: similar core_mix, level, degree, and tag overlap. Default threshold: structural_bridge_threshold = 0.55.
All bridges are returned as proposals (proposed: true). The server shows candidates; you decide what becomes a link.
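The proposal step for semantic bridges might look like the sketch below. Only the 0.55 default and the proposed flag come from the text; the function name, the input shapes, the other field names, and treating the threshold as inclusive are assumptions.

```python
def propose_semantic_bridges(similarities, domains, semantic_bridge_threshold=0.55):
    """Hypothetical helper: turn pairwise similarities into bridge proposals.

    similarities: {(note_a, note_b): score}, domains: {note: domain}.
    """
    proposals = []
    for (a, b), score in similarities.items():
        if domains[a] == domains[b]:
            continue  # bridges connect notes from different domains
        if score >= semantic_bridge_threshold:
            proposals.append({
                "source": a,
                "target": b,
                "type": "semantic",
                "score": score,
                "proposed": True,  # a candidate only; the user decides
            })
    return sorted(proposals, key=lambda p: p["score"], reverse=True)


domains = {"queues.md": "E", "triage.md": "M", "retries.md": "E"}
similarities = {
    ("queues.md", "triage.md"): 0.71,   # cross-domain, above threshold
    ("queues.md", "retries.md"): 0.90,  # same domain, skipped
    ("retries.md", "triage.md"): 0.40,  # cross-domain, below threshold
}
bridges = propose_semantic_bridges(similarities, domains)
print(bridges)
```

Nothing in the sketch writes a link: it returns candidates, which mirrors the rule that the server shows proposals and the user decides.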
Automatic Parent Search
suggest_parents, process_orphans, and add_entity can propose a place for a note without parents.
The server compares the note text with possible parents. Candidates below parent_link_threshold (0.55) are discarded. If several candidates are close, the parent from the same domain gets priority.
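A minimal sketch of that ranking, assuming candidates arrive as dictionaries with a score and a domain. The text does not say how "close" is measured, so the tie_margin parameter and its value are hypothetical, as are the function and field names.

```python
def rank_parent_candidates(note_domain, candidates,
                           parent_link_threshold=0.55, tie_margin=0.02):
    """Hypothetical helper: filter and order parent candidates."""
    # Discard candidates below the threshold.
    kept = [c for c in candidates if c["score"] >= parent_link_threshold]
    if not kept:
        return []

    best = max(c["score"] for c in kept)

    def sort_key(candidate):
        close_to_best = best - candidate["score"] <= tie_margin
        same_domain = candidate["domain"] == note_domain
        # Same-domain candidates that are close to the best score go first;
        # everything else is ordered by score, highest first.
        return (not (close_to_best and same_domain), -candidate["score"])

    return sorted(kept, key=sort_key)


candidates = [
    {"id": "systems-module", "score": 0.70, "domain": "S"},
    {"id": "engineering-module", "score": 0.69, "domain": "E"},
    {"id": "archive", "score": 0.40, "domain": "E"},
]
ranked = rank_parent_candidates("E", candidates)
print([c["id"] for c in ranked])
```

In the example the archive candidate is dropped for scoring below 0.55, and the Engineering module is listed first because it is within the margin of the top score and shares the note's domain.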
Pipeline
- Note: Markdown with YAML fields type, level, sign, parents, tags. Files without YAML are indexed too, but need markup.
- Index: The server stores metadata, content, links, and embeddings in SQLite.
- Classification: Content is compared with etalons; L5 receives artifact_sign, L4 can receive a composite sign.
- Aggregation: L4 gets a profile from text classification; L3 and L2 aggregate child nodes bottom-up.
- Proposals: Bridges, parents, tags, and hierarchy errors are returned as candidates for your decision.
Database
NOUZ stores its index in SQLite (obsidian_kb.db in the vault root):
- file metadata;
- graph links;
- embeddings;
- etalon reference vectors;
- core_mix, sign_auto, sign_source, artifact_sign.
Notes and the database stay local. If you use a cloud embedding provider, only texts requested for embeddings leave your machine.
SQLite does not require a separate database server: Python includes sqlite3, and NOUZ uses aiosqlite as an async wrapper. The local index file is created when the vault is indexed.
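To illustrate the "no separate server" point, the sketch below opens an in-memory SQLite database with Python's standard sqlite3 module. It uses sqlite3 rather than aiosqlite so that it runs without extra packages, and the notes table and its columns are hypothetical, not the actual NOUZ schema.

```python
import sqlite3

# An in-memory database; a file path would create a local index file instead.
conn = sqlite3.connect(":memory:")

# Hypothetical table for illustration only.
conn.execute(
    "CREATE TABLE notes (path TEXT PRIMARY KEY, level INTEGER, sign TEXT, sign_auto TEXT)"
)
conn.executemany(
    "INSERT INTO notes VALUES (?, ?, ?, ?)",
    [
        ("modules/api.md", 3, "E", "S"),
        ("modules/queues.md", 3, "E", "E"),
    ],
)

# Notes whose manual sign disagrees with the computed one.
rows = conn.execute(
    "SELECT path, sign, sign_auto FROM notes WHERE sign != sign_auto"
).fetchall()
conn.close()

print(rows)
```

The whole database lives inside the Python process, which is why no database service has to be installed or started.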