Hypernym makes world-model precision a substrate-engineering problem, not a model-engineering problem.
The industry is competing on model quality. Hypernym is competing on a different axis. Hypercore + Modulum + Magic together convert what was previously a stochastic property of model weights — recall, grounding, calibration, procedural execution, paraphrase consistency — into an architectural property of the inference system. The substrate is engineered; the model becomes the cheap, replaceable component.
11 rounds of compound-research ideation across 5 model panels (Codex · Claude · Gemini · Gemma · Grok) produced two unanimous architectural commits, six convergent primitive groups, seven unanimous primitive clusters, and a quantified hyperscaler economic thesis. R11 reverses R10's "half-correct" verdict on world-model precision: 5 of 8 failure modes are structurally fixed by the full system, 2 fixed for the configured-procedure subset, 1 with substrate-perimeter widening.
The industry is competing on model quality. Hypernym is competing on a different axis. The full system makes world-model precision a substrate-engineering problem rather than a model-engineering problem. That decouples "model quality" from "parameter count" and turns inference cost reduction into a $20–50B/year-per-customer story at major hyperscalers.
Composable. Independent. Powerful together.
Hypernym ships two platforms with distinct buyer surfaces. Hypercore is the comprehension layer — domain-grounded retrieval, agentic research, structured memory, runtime compression, provenance. Modulum is the inference + memory layer — drop-in inference optimization, effective infinite context, persistent expertise across sessions. Each is independently deployable; together they form the persistent-memory platform the industry has been missing.
- Hypercore Engine · Full deployment configured to your corpus. 6-layer architecture: Intake · Workflows · Agent · Confidence · Consistency · Stream.
- Magic · Runtime compression API + plugins for Claude Code / Codex / Devin. 30–60% raw speedup proven on SWE-Bench Verified · 87% context compression · 0/5 → 4/5 Opus 4.6 lift.
- Omnifact API · 60 stochastic trials, frequency-ranked semantic facts. Compresses long context into ranked, citable facts.
- HyperRemember API · Embeddings + fact-based reranking. Long-running memory that doesn't drift.
- Drops in · Any transformer: Llama / Qwen / Mistral / Phi / Gemma / MiniMax. Cross-architecture portable.
- Effective infinite context · The cache recycles intelligently; the model never runs out of room across context lengths and sessions.
- Persistent expertise · Load a domain once; keep traced and ranked total recall, forever. Process restarts don't wipe state.
- No domain hallucination · Vocabulary output restriction eliminates out-of-domain hallucinations entirely.
- Better unit economics · Decode speedup that scales with context. Bigger models become cheaper at inference.
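The vocabulary-output-restriction claim above has a simple mechanical core: tokens outside the configured domain vocabulary are masked to negative infinity before the softmax, so they receive exactly zero probability and can never be sampled. The sketch below is a minimal illustration of that idea, not Modulum's implementation; the token names and scores are invented.

```python
import math

def masked_softmax(logits: dict[str, float], allowed: set[str]) -> dict[str, float]:
    """Hard-mask logits: tokens outside the allowed vocabulary get -inf,
    so they carry exactly zero probability mass after the softmax."""
    neg_inf = float("-inf")
    masked = {t: (v if t in allowed else neg_inf) for t, v in logits.items()}
    m = max(v for v in masked.values() if v != neg_inf)  # stabilize the exponent
    exps = {t: (math.exp(v - m) if v != neg_inf else 0.0) for t, v in masked.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

logits = {"aspirin": 2.1, "ibuprofen": 1.7, "unicorn": 3.5}
domain = {"aspirin", "ibuprofen"}
probs = masked_softmax(logits, domain)
assert probs["unicorn"] == 0.0  # out-of-domain token can never be sampled
assert abs(sum(probs.values()) - 1.0) < 1e-9
```

Because the mask is applied to logits rather than to sampled text, elimination of out-of-domain output is structural, not probabilistic.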
End-to-end persistent memory — the layer the industry has been missing.
Hypercore brings
- Structured memory with provenance
- Source-chain citations
- Confidence math per claim (source_type × grounding × corroboration)
- Audit-ready retrieval traces
Modulum brings
- Inference-time memory persistence
- Effective infinite context in fixed memory
- Domain expertise across session boundaries
- No retraining, no fine-tuning
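The per-claim confidence math above can be sketched as a product of three bounded factors. The weights and the corroboration curve below are assumptions for illustration, not Hypercore's calibrated values.

```python
# Assumed source-type weights (illustrative only).
SOURCE_TYPE = {"primary": 1.0, "secondary": 0.8, "tertiary": 0.6}

def claim_confidence(source_type: str, grounding: float, corroborations: int) -> float:
    """confidence = source_type × grounding × corroboration, each factor in [0, 1],
    yielding one auditable scalar per claim."""
    corroboration = 1.0 - 0.5 ** corroborations  # assumed curve: saturates toward 1.0
    return SOURCE_TYPE[source_type] * grounding * corroboration

# A primary-source claim, well grounded, corroborated by two independent sources:
score = claim_confidence("primary", grounding=0.9, corroborations=2)
assert round(score, 3) == 0.675
```

Because each factor is inspectable, the scalar is not a black-box score: an auditor can see exactly which factor depressed a low-confidence claim.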
The numbers behind the architecture.
Modulum is not a thesis; it is measured. 38 measurements across 3 corpora and 7 context lengths produced 38 improvements, 0 regressions, 0 speed cost. The 75%-noise observation is confirmed across 4 architectures from 4 different companies — "4 companies, 1 algebra."
11 rounds. Each compounds the last.
The R-family followed the shape: breadth → depth → breadth → reframe. Each round was a dispatch of 3–5 reasoning models running in parallel; each output ~14–80 KB; each round produced a synthesis MD and a deployable deck. Convergence detection across model panels is the architectural-commit signal.
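Convergence detection can be made mechanical: canonicalize each model's proposed mechanism name through an alias table, then count votes. The sketch below uses R8's five names for the attention-mask-conditioning mechanism as the alias example; the function and its shape are illustrative, not the actual synthesis tooling.

```python
from collections import Counter

# The five panel names that R8 judged to be one mechanism.
ALIASES = {
    "PCHR": "attention-mask-conditioning",
    "MaskGate": "attention-mask-conditioning",
    "Modulum-SparseGate": "attention-mask-conditioning",
    "SAS/DHTS": "attention-mask-conditioning",
    "Domain-Specific Head Pruning": "attention-mask-conditioning",
}

def convergence(panel: dict[str, str]) -> tuple[str, int, int]:
    """Canonicalize each model's proposal, then return
    (top mechanism, vote count, panel size)."""
    counts = Counter(ALIASES.get(name, name) for name in panel.values())
    mechanism, votes = counts.most_common(1)[0]
    return mechanism, votes, len(panel)

panel = {"Claude": "PCHR", "Codex": "MaskGate", "Gemini": "Modulum-SparseGate",
         "Gemma": "SAS/DHTS", "Grok": "Domain-Specific Head Pruning"}
mechanism, votes, size = convergence(panel)
assert (votes, size) == (5, 5)  # 5/5 unanimity = architectural-commit signal
```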
Master proposal · Hypernym → Forge integration shape
Initial scoping. Identified PDS as candidate primitive. Established the dispatch pattern (3–5-model panel + manual synthesis). Foundation for everything that follows.
Visual exploration · architectural variants
First visual deck. Refined the architectural language. Tested cross-model panel patterns at smaller scale.
Wider primitive surface
Extended the primitive inventory. Began exploring world-model and inference-side framings.
Compression-first framing
Compressed-Repo-Analyze + Hypercore primitives for runtime use. Foundation for Magic's eventual standalone shipping.
Reality Substrate · Grounded World Kernel · Verifiable Causal Engine · Bicameral · Hypernym World Model
5-model panel produced 5 different names for the same convergent architecture: PDS as the unit of product. Foundation for R8's mechanism commit.
Direct products + Forge synergies + compound carry-forward
5/5 unanimous on Hypernym Vault. SectorPack (5/5). GroundedNotes (5/5). 16-r7-5 synthesis: 28-item buildability matrix.
Crafter v1 substrate-mounting MVP
5/5 unanimous on Crafter as the world-model MVP. 21 days, ~$40K, publication-grade falsifier. SWE-Bench Verified as 5/5 outlier follow-on.
Train-a-Model · 4/5 vote for Continued Pretrain B
$550K central / $800K ceiling, 12 weeks. Modulum-7B-Native via continued pretrain of Qwen 3.5-7B / Llama 3.1-8B with attention-modification objectives.
Attention-Mask Conditioning · 5/5 unanimous
The strongest architectural convergence in any round. Five names for one mechanism: PCHR (Claude) · MaskGate (Codex) · Modulum-SparseGate (Gemini) · SAS / DHTS (Gemma) · Domain-Specific Head Pruning (Grok). M5 = the commit.
What Modulum Unlocks · 6 convergent primitive groups · 33 net-new primitives
5/5 unanimous on Cognitive Gearing as the universal hyperscaler primitive. Causal Trace + Replay + Attestation as the audit infrastructure. Verifiable Domain Sealing as compliance. Substrate Composition as multi-agent. Portable Expert ABI as marketplace. Programmable Substrate as the wildest ISA-level claim.
Softmax-Level Breakthroughs · 7 unanimous clusters · climate civilizational pick
3/5 panel (Codex declined on NDA-legal grounds; the local Gemma run failed). 7 clusters spanning horizontal verticals + vertical stack. World-model precision answer: "half-correct" — the framing R11 corrects. Climate modeling as 3/3 unanimous civilizational application.
Sum-of-All-Parts · R10's verdict reversed · hyperscaler $$$ quantified
4/4 panel (soft-ack worked; Codex now full participant with 114 KB output). 3 meaningful flips of R10's "not addressed" failure modes when the panel evaluates Hypercore + Modulum + Magic as a system. Hyperscaler economics quantified at $20–50B/year per major customer at midpoint efficiency. R12 candidates: Substrate-as-Asset (Claude, recommended), Ungrounded-Creativity (Gemini), Emergent Reasoning (Grok).
R10's verdict reversed. The full system fixes more than Modulum-alone could.
R10 evaluated Modulum in isolation and concluded "near-100% precision world models is half-correct — civilization-defining for one narrow class only." R11 evaluates the actual system (Hypercore + Modulum + Magic) and reverses the verdict on three failure modes. 5 of 8 modes structurally fixed; 2 fixed for configured-procedure subset; 1 with substrate-perimeter widening.
| Failure mode | R10 verdict | R11 corrected verdict | What addresses it (full system) |
|---|---|---|---|
| Hallucination of facts | Fixed | Confirmed fixed | Modulum vocabulary output restriction (hard mask on logits) + Hypercore Workflow DAG grounded prefix. |
| Lost-in-middle attention dilution | Fixed | Confirmed fixed | Effective infinite context, content-addressed not position-addressed. Catastrophic forgetting solved at inference time. |
| Inconsistency under paraphrase | Partial | FLIPPED → Fixed | Hypercore mechanical confidence math (source_type × grounding × corroboration) + Magic provenance-preserving compression. Paraphrases canonicalize to the same substrate fact under the same confidence triple. |
| Failure to update on new evidence | Fixed | Confirmed fixed | Substrate updates propagate without retraining; new corroboration shifts confidence scalar; new source_type re-ranks. |
| Errors in multi-step reasoning chains | Not fixed | FLIPPED → Fixed for DAG-anticipated | Pre-agent Workflow DAG executes A→B→C deterministically *before* the agent token-decodes. Stochastic-decode chains become deterministic-DAG-execution. Residual: novel chains outside DAG anticipation still fall to model reasoning. |
| OOD generalization beyond mounted facts | Not fixed | REFRAMED → Perimeter widened ~10× | Magic continuous canonicalization + HyperRemember semantic reranking + Omnifact 60-trial frequency ranking widen what counts as in-domain. Practical OOD failure rate drops by ~10× for configured domains. |
| Procedural / simulation knowledge | Not fixed | FLIPPED → Fixed for DAG-expressible | YAML domain config + Workflow DAG = procedural knowledge encoded as executable substrate. Osmium's 23-DB cross-reference graph with 21 parsers IS procedural — encodes how to compute biomedical answers, not just what. |
| Self-model / uncertainty about its own uncertainty | Not fixed | FLIPPED → Fixed at system level | Hypercore mechanical confidence per-claim is the system reporting its own uncertainty mechanically with math visible. Buyer queries the system, not the model. Model's internal self-model is irrelevant. |
For any domain expressible as a substrate plus a Workflow DAG plus a vocabulary mask, the full Hypernym system delivers world-model precision: five of eight failure axes structurally fixed, two fixed for the configured-procedure subset, and one with substrate-perimeter widening sufficient to cut practical OOD failure by ~10×. The civilization-defining claim is not narrow. It applies to every domain where the buyer can specify the domain.
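The deterministic-DAG-execution flip in the table deserves one concrete illustration: when a procedure is encoded as a dependency graph of deterministic steps, the chain A→B→C is executed by topological order, not by token sampling, so the same inputs always yield the same outputs. The three step names and functions below are hypothetical, not taken from an actual Workflow DAG config.

```python
from graphlib import TopologicalSorter

# Hypothetical 3-step procedure: each step is a deterministic function of
# prior results (results dict passed in as `r`).
steps = {
    "resolve_entity": ([], lambda r: "BRCA1"),
    "fetch_xrefs":    (["resolve_entity"], lambda r: [r["resolve_entity"] + ":P38398"]),
    "rank_evidence":  (["fetch_xrefs"], lambda r: sorted(r["fetch_xrefs"])),
}

def run_dag(steps):
    """Execute steps in dependency order; no sampling anywhere in the chain."""
    graph = {name: set(deps) for name, (deps, _) in steps.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        _, fn = steps[name]
        results[name] = fn(results)
    return results

out = run_dag(steps)
assert out["rank_evidence"] == ["BRCA1:P38398"]
assert run_dag(steps) == out  # deterministic: identical on every run
```

The residual noted in the table remains: chains the DAG did not anticipate still fall back to model reasoning.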
What Hypernym ships, in five tiers.
Pulling together: 4 Hypercore products (shipping today) + 7 Modulum components (patent-track) + 6 internal platform features (May-2 roadmap) + 6 public product candidates + 3 trust-track use cases. The full surface, in one place.
Three deployments. Three proof points.
Hypercore is not a thesis; it is shipping. Three live deployments today across biomedical research, legal opinion analysis, and pure-API agent integration. Osmium is the flagship reference deployment with the demo Hypernym leads with.
Osmium · biomedical research
- 23 public biomedical databases
- 312K entities resolved
- 842K cross-references
- 21 parsers
- 34/35 claims grounded in source
- 0.85 avg confidence (0.51 min · 0.98 max)
- 6 PMID citations validated against PubMed
Legal opinion analysis
- 35 opinion files
- 18,647 facts extracted
- ~2,200 curated
- 100% via Hypercore APIs
Pure-API agent integration
- External agent
- Pure infrastructure
- No frontend
The single largest economic primitive in the deck.
Public estimates put 2025 hyperscaler inference compute spend at $80–120B/year across major providers, growing to $200–400B by 2027. Cognitive Gearing + Confidence-Bound Speculative Decoding at the demonstrated 75% attention-noise reclamation translates to roughly 30–50% effective decode-cost reduction at fleet scale.
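A back-of-envelope check of that arithmetic, using midpoints of the ranges quoted above (the midpoints are illustrative assumptions, and the spend figure is fleet-wide rather than per-customer, so this confirms only the order of magnitude):

```python
# Midpoint of the $80–120B/yr 2025 hyperscaler inference spend estimate.
spend = 100_000_000_000
# Midpoint of the 30–50% effective decode-cost reduction.
reduction_pct = 40

savings = spend * reduction_pct // 100
assert savings == 40_000_000_000  # ~$40B/yr at midpoints
```

At the 2027 projection ($200–400B spend), the same midpoint arithmetic scales to roughly $60–150B/yr, which is why the deck treats this as the dominant economic primitive.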
Strategic implication beyond cost: if the 3.04× decode speedup holds at long context, the impact is increased served demand per GPU fleet — not just compute saved, but capex deferred. Hyperscalers serve more inference traffic on existing infrastructure rather than building more.
This is the largest economic primitive in the deck by approximately one order of magnitude over every other line item. Vault, SectorPack, GroundedNotes, vertical agents — every other product in the inventory is one to two orders of magnitude smaller. The hyperscaler track must dominate near-term capital allocation.
Flagship · wrapping · impact · IP · timeline · kill criterion.
Each buyer class gets a flagship primitive shipped with concrete product wrapping, quantified impact, IP-protection strategy, 30/60/90/180-day timeline, and a falsifiable kill criterion.
IP protection
Binary-blob runtime + black-box ABI + calibration-as-service + VPC-bounded + joint chip exclusivity. Detection algorithm stays at Hypernym; only the configured runtime ships.
Strategic impact
Beyond cost: increased served demand per GPU fleet, capex deferral. $80–140B/year industry-wide by 2028.
IP protection
Per-domain isolation per-tenant; per-domain port + auth + session DB. Customer's data never leaves VPC.
Vertical sequence
Medical (Osmium ✓) → Legal (TrustFoundry ✓) → Energy → Insurance → Pharmaceutical → Finance.
IP protection
Component-level partition. Signed runtime artifacts that work on legitimate substrates only. The configured runtime ships; the configurator doesn't.
Distribution channel
Llama / Qwen / Mistral / Phi / Gemma user bases. Academic citation flywheel. Joint papers as proof.
IP protection
Magic algorithm closed; plugin interfaces open; canonicalization recipe patented separately.
Pricing
$10–50/seat/mo developer subs at scale.
IP protection
Personal substrates stay on-device. Only signed route traces (no raw substrate) ever leave the device. Hypernym never sees user data.
Primary moat
Data-sovereignty UX. Substrate sovereignty as architectural property, not legal aspiration.
IP protection
Co-built artifacts split per agreement. Hypernym retains universal-stack rights; partner retains domain-specific assembly.
Why climate first
Forcing-decomposition matches softmax-level strengths. Audit requirement is real and unmet. AI weather models lack provenance. Geopolitical stakes higher than legal/biomedical.
Six elements, panel-unanimous.
All four R11 panel models converged on the same six-element IP-protection strategy for shipping Modulum to hyperscalers and OSS communities without losing the patent moat. The key insight: ship the configured runtime, not the configurator.
What is impossible for everyone else and possible for Hypernym.
SpaceX-style fundamental-leap thinking. Two empirical claims with civilizational implications.
If the 75%-noise universality is architectural (which it appears to be), then the entire industry is stuck competing on a saturated axis (parameter count). Hypernym's axis (substrate engineering) is structurally unsaturated. The competition isn't a faster model; it's a different category of system.
What R11 is still missing — three different altitudes.
R11 corrected R10 on the system axis (full system vs Modulum-isolated) and on the primitive axis (non-PDS-derivative primitives). The three R12 candidates from the panel sit at three different altitudes — economic-strategic, epistemological, cognitive.
Every R7-R11 primitive assumes a static substrate or human-engineering-pace updates. Real deployments accrete at machine pace (Osmium: 312K entities + 842K cross-references, growing daily). Over 24 months, the substrate becomes higher-dimensional than any model. The dependency inverts: the model becomes the cheap, replaceable component; the substrate becomes the irreplaceable, high-IP-value asset. Hypernym becomes a substrate company, not an inference company. The patent moat on Modulum components matters less than the data-moat-by-construction on the substrates accreted over 5+ years across regulated verticals. Scale jumps from a $5B inference company to a $50B+ substrate company.
R11 over-emphasizes reasoning over recorded knowledge. The Hypercore + Modulum stack is an unparalleled architecture for understanding, retrieving, and reasoning about what is already known. It is optimized for known-knowns and known-unknowns. The next great leap in AI may not be perfectly reasoning over the past, but generating truly novel un-groundable futures — new mathematical theorems, new artistic styles, new physical theories with no precedent in training data or substrate. The architecture optimized for grounding may, by its very nature, be structurally incapable of "ungrounded" leaps. The system's greatest strength — its refusal to hallucinate — might also be its greatest limitation.
Hypercore + Modulum + Magic address factual precision and inference persistence. They may not capture emergent reasoning or meta-cognitive adaptation — self-improving world models, models that develop new capabilities beyond what's mechanically engineered. The third axis beyond comprehension and memory.
R12 = Substrate-as-Asset (Axis A). Three reasons. One: only axis affecting current decisions — B and C are research questions; their answers don't change what gets built in the next 12 months. Two: closest to falsifiable in 90 days — map the substrate-accretion curves at Osmium / TrustFoundry / Amble; project 24-month volume; data already exists. Three: the only axis with $-scale implications already on the deck — the hyperscaler primitive becomes either "license fees" or "substrate-rent-extraction-rights across the regulated enterprise base." Different scale of company entirely. R13 picks up B once we know the company shape; R14 picks up C as the longest-horizon program.
Final framing for cofounders + R&D.
If R7 said "PDS is the unit of product," R8 said "M5 is the mechanism," R9 said "Modulum makes knowledge operational at a layer below code," R10 said "softmax-level operation is the architecture of audit-grade truth," then R11 says: the full system makes world-model precision a substrate-engineering problem rather than a model-engineering problem. Hypernym competes on a different axis than the entire industry is currently running on.
One — the reframe is the most important architectural finding in the entire R7-R11 family. Every prior round assumed Modulum was the protagonist. Full-system view shows Hypercore is the load-bearing component for 5 of 8 world-model failure modes; Modulum is the high-leverage but bounded inference layer; Magic is the runtime glue. The industry is competing on model quality; Hypernym is competing on a different axis entirely.
Two — hyperscaler economics is now quantified at $20–50B/year per major customer at midpoint efficiency. This is the single largest economic primitive in the deck. Every other line item — Vault, SectorPack, GroundedNotes, vertical agents — is one to two orders of magnitude smaller. The hyperscaler track must dominate near-term capital allocation.
Three — the substrate-as-asset axis (Claude's R12) reframes Hypernym at the company-scale. If the substrate-accretion thesis holds at Osmium-scale projection, Hypernym's 5-year shape is not "an inference vendor that licenses Modulum" — it is "a substrate company that owns the regulated-enterprise comprehension graph and rents access to it across model providers." Scale jumps from $5B to $50B+. Every contract written today either preserves or forecloses that future.
The R12 recommendation: substrate-as-asset at year-5 scale. Run the round in the next 14 days, before the hyperscaler pilots scale and the substrate-ownership architectural choices become harder to undo.