Semantic Laboratory
geometry in language models
English version in progress
Coming next
The English lab is being prepared with its own data.
The current laboratory works with real embeddings and Russian vocabularies. A proper version needs its own word lists, recalculated vectors, checked nearest neighbors, and the same careful projection notes.
Real vectors: English datasets will be generated from embedding models.
Own vocabularies: Word sets, neighbors, and clusters need to fit English language structure.
Same care: The 3D view will remain a projection, with full-space cosine neighbors shown separately.
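Why the two views are kept apart can be shown with a small sketch. Everything in it is an assumption made for illustration: the five-dimensional vectors are invented toy values, not real embeddings, and dropping all but the first three coordinates merely stands in for whatever projection the lab actually uses. The helper names `cosine` and `nearest` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest(word, vectors, k=1):
    """Return the k words most similar to `word`, as (word, similarity) pairs."""
    query = vectors[word]
    scored = [
        (other, cosine(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Invented toy vectors; a real lab would load them from an embedding model.
full_space = {
    "cat": [1.0, 0.0, 0.0, 1.0, 0.0],
    "dog": [0.9, 0.3, 0.0, 0.9, 0.1],
    "car": [1.0, 0.0, 0.0, -1.0, 0.0],
    "tree": [0.0, 1.0, 0.0, 0.0, 1.0],
}

# Stand-in projection: keep only the first three coordinates.
projected = {word: vec[:3] for word, vec in full_space.items()}

print("full space:", nearest("cat", full_space))  # top neighbor is "dog"
print("3D view:   ", nearest("cat", projected))   # top neighbor is "car"
```

In this toy data the fourth coordinate separates "cat" from "car", so the two sit on top of each other once it is dropped. Any projection to three dimensions can create false neighbors of this kind, which is why neighbor lists are best computed in the full space.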