Q-PNA Research Specification v2.0
title: "Q-PNA: Quantum-Native p-Adic Neural Architecture — Research Specification v2.0"
authors: "Rowan Brad Quni-Gudzinas"
date: "2026-05-19"
doi: "10.5281/zenodo.20287742"
version: "v2.0"
abstract: >
Q-PNA is a neural network architecture that replaces continuous embedding spaces
with ultrametric geometry on Bruhat-Tits trees, providing glass-box AI decisions
with formal verifiability via syntactic token calculus and ultrametric attention.
keywords: ["Q-PNA", "p-adic neural network", "ultrametric geometry", "Bruhat-Tits tree", "glass-box AI", "token calculus", "quantum computing", "explainable AI"]
license: "CC-BY-4.0"
modified: 2026-05-19T23:08:07Z
Research Specification v2.0
Author: Rowan Brad Quni-Gudzinas
ORCID: 0009-0002-4317-5604
Repository: github.com/QNFO/Q-PNA
Date: 2026-05-19
Abstract: Q-PNA is a neural network architecture that replaces the continuous embedding spaces of standard deep learning with ultrametric geometry on Bruhat-Tits trees. The architecture consists of four integrated components: p-adic valuation encoding, ultrametric attention with zero learned parameters, tree-walk optimization (a discrete analog of backpropagation), and a syntactic token calculus providing formal verification of every decision. The architecture shares its mathematical foundation with the QWAV quantum computing framework, enabling a unified computing paradigm where the geometry that passively suppresses quantum errors also produces glass-box AI decisions with fully traceable audit trails. This research specification is grounded in prior work including a working proof-of-concept demonstration, a book-length token calculus treatment, and computationally verified cophenetic distance theory, but the full architecture has not yet been computationally validated.
1. Motivation: The Gauge Problem in Continuous AI
1.1 The Archimedean Assumption
Contemporary AI rests on an invisible mathematical assumption: that the space of representations is continuous, smooth, and Archimedean (the real numbers $\mathbb{R}^n$). Every gradient descent step, every backpropagated weight update, every attention score in a transformer, and every latent vector in a variational autoencoder operates within the familiar continuum [ARCHIVE: Ultrametric Intelligence].
This foundation has carried the field remarkably far. Yet it imposes structural limitations that are consequences of the geometry, not of insufficient scale or data:
1.2 Gauge Dependence
[ARCHIVE: Auditable Attention PoC]
Embeddings in $\mathbb{R}^d$ are gauge-dependent: if $\phi: \mathbb{R}^d \to \mathbb{R}^d$ is any diffeomorphism, the embedding $v \in \mathbb{R}^d$ for concept $C$ can be replaced by $\phi(v)$ without changing intrinsic meaning. Yet neural networks treat these as distinct inputs. Formally:
Definition (Gauge dependence): A representation scheme $R: \mathcal{C} \to \mathbb{R}^d$ mapping concepts to vectors is gauge-dependent if there exists a diffeomorphism $\phi$ such that training converges to different parameters when trained on $\{R(c)\}$ vs. $\{\phi(R(c))\}$, even though $\phi$ preserves all relational structure.
This leads to:
- Adversarial fragility: Infinitesimal perturbations exploit the continuous corridors in the decision landscape
- Opaque decision-making: The path from input to output is a composition of nonlinearities with no geometric interpretation
- Poor analogical reasoning: Similarity in $\mathbb{R}^d$ (cosine, Euclidean) does not correspond to hierarchical similarity
1.3 The Ultrametric Alternative
Ultrametric spaces replace the Archimedean triangle inequality:
with the strong triangle inequality:
In an ultrametric space, every triangle is isosceles with the two longest sides equal — a property called triadic rigidity [DOI: 10.5281/zenodo.20213043]. All balls are clopen (both open and closed). Points are organized in a strict nested hierarchy rather than a continuous manifold.
The natural geometry of ultrametric spaces is a rooted tree. The Bruhat–Tits tree $\mathcal{T}_p$ for prime $p$ is the universal ultrametric space: an infinite regular tree with branching factor $p+1$, where the ultrametric distance between two leaves is the $p$-adic valuation of their difference.
2. Mathematical Foundations
2.1 The Bruhat–Tits Tree $\mathcal{T}_p$
[QWAV-INTERNAL: Technical Deep-Dive §4]
For a prime $p$, the Bruhat–Tits tree $\mathcal{T}_p$ is the infinite regular tree where each vertex has exactly $p+1$ neighbors. The boundary of $\mathcal{T}_p$ is the projective line over the $p$-adic numbers $\mathbb{P}^1(\mathbb{Q}_p)$.
Key properties:
- Regularity: Every vertex has degree $p+1$
- Ultrametric: Distances satisfy the strong triangle inequality
- Group action: $\operatorname{PGL}(2, \mathbb{Q}_p)$ acts transitively on the vertices
- Langlands connection: The tree encodes representations of the $p$-adic group
For computation, we truncate $\mathcal{T}_p$ to finite depth $D$:
| Parameter | Symbol | Typical Value | Role |
|---|---|---|---|
| :---------- | :------- | :-------------- | :----- |
| Prime | $p$ | 3 | Branching factor $p+1 = 4$ (ternary tree) |
| Depth | $D$ | 5–10 | Number of levels |
| Leaf count | $L$ | $(p+1) \cdot p^{D-1}$ | For $p=3$, $D=5$: $4 \cdot 3^4 = 324$ leaves |
| Semantic primes | $k$ | 5–12 | Number of prime dimensions for encoding |
2.2 Cophenetic Distance
[DOI: 10.5281/zenodo.20213043]
For leaves $a, b$ in a rooted tree with height function $h$, the cophenetic distance is the height of their lowest common ancestor (LCA):
Theorem (Cophenetic ultrametric): Cophenetic distance satisfies the strong triangle inequality:
Theorem (Triadic rigidity): For any three leaves, the two largest pairwise cophenetic distances are equal. [DOI: 10.5281/zenodo.20213043]
The cophenetic distance IS the ultrametric distance on the Bruhat–Tits tree. The distinction between “Bruhat–Tits trees” and “cophenetic trees” is a framing choice — they are the same mathematical structure. The tree is Bruhat–Tits; the distance function is cophenetic. [LLM-INFERRED — see q-pna/0.1.md §4 for full argument]
2.3 $p$-Adic Valuation Encoding
[ARCHIVE: ultrametric-ai-poc]
The bridge from tokens to tree leaves is the $p$-adic valuation. Let $\Sigma = \{p_1, p_2, \ldots, p_k\}$ be a set of distinct primes called semantic primes. Each semantic prime represents a fundamental dimension of meaning.
Definition (Prime product): For token $t$ with semantic prime strengths $f_i(t) \in \mathbb{N}$:
Definition ($p$-adic valuation vector):
where $v_{p_i}(n)$ is the exponent of $p_i$ in the prime factorization of $n$.
Definition (Ultrametric distance on valuation vectors):
This is the Chebyshev distance on valuation vectors, which inherits ultrametricity from the $p$-adic valuation.
Demonstration (working code): The ultrametric-ai-poc uses $\Sigma = \{2, 3, 5, 7, 11\}$ mapped to semantic categories $\{\text{good}, \text{bad}, \text{not}, \text{very}, \text{but}\}$ via WordNet hypernym paths. Each English word is assigned semantic primes, the prime product is computed, and the valuation vector is used for ultrametric attention. Cocycle verification (strong triangle inequality) passes on all token triplets [ARCHIVE: ultrametric-ai-poc].
2.4 Distinction Calculus (Spencer-Brown)
[ARCHIVE: Auditable Attention PoC]
The syntactic token calculus is built on Spencer-Brown’s Laws of Form (1969) primitives:
| Primitive | Notation | Meaning |
|---|---|---|
| :---------- | :--------- | :-------- |
| Mark | # | The act of drawing a distinction |
| Enclosure | [A] | A boundary containing expression $A$ |
| Void | (blank) | The absence of any distinction — not a symbol |
Two axioms (rewrite rules):
- Calling (Idempotence): ## $\to$ # — adjacent marks condense into one. Repetition of the same distinction is idle. Corresponds to $A \land A = A$.
- Crossing (Involution):
[[A]]$\to$ A — an enclosure containing only another enclosure cancels. To cross a boundary twice is to uncross it. Corresponds to $\neg\neg A = A$.
Key insight (Q5 from Ultrametric Intelligence/0.1.md): The Bruhat–Tits tree $\mathcal{T}_p$ is isomorphic to the set of all finite bracket expressions under Spencer-Brown’s calculus. Every token path in the tree is a normal form of some distinction expression. The LCA of two leaves is the longest common prefix of their normal forms. This means the distinction calculus IS the formal verification substrate for operations on the Bruhat–Tits tree.
3. Architecture
3.1 Overview
┌─────────────────────────┐
Raw Input ──> │ p-adic Valuation │
(text, data) │ Encoding │
│ token -> P(t) -> v⃗(t) │
└──────────┬──────────────┘
│ valuation vectors
▼
┌─────────────────────────┐
│ Bruhat–Tits Tree T_p │
│ • ultrametric attention │
│ • tree-walk propagation │
│ • cocycle verification │
└──────────┬──────────────┘
│ decision path
▼
┌─────────────────────────┐
Output + │ Syntactic Token Calc │
Audit Trail ←── │ • token history DAG │
│ • type verification │
│ • confluence check │
└─────────────────────────┘
3.2 Token Encoding
Step 1 — Semantic prime assignment: Map each input token $t$ to a set of semantic primes with strengths. For text: WordNet hypernym paths determine which semantic categories apply (as in ultrametric-ai-poc). For structured data: feature hierarchy determines the encoding levels. For unstructured data (images, audio): hierarchical clustering on training data determines the prime assignment [LLM-INFERRED].
Step 2 — Prime product computation:
where $f_i(t) \in \mathbb{N}$ is the strength of semantic prime $p_i$. A larger exponent means stronger association with that semantic dimension.
Step 3 — Valuation vector extraction:
Each component $v_{p_i}(P(t)) = f_i(t)$ — the exponent is exactly the strength. This yields a vector in $\mathbb{N}^k$.
Step 4 — Leaf activation mapping: The valuation vector $\vec{v}(t)$ for a single token does NOT directly map to a single leaf. Instead, the encoding for an entire input sequence produces a distributed leaf activation vector $\mathbf{a} = (a_1, \ldots, a_L) \in [0,1]^L$ across all $L$ leaves. The activation $a_\ell$ at leaf $\ell$ is determined by the ultrametric attention mechanism (§3.3): tokens attend to each other based on tree distances, and the resulting attended valuations are distributed across leaves whose positions correspond to the semantic prime combinations present in the input.
Concrete mapping (from ultrametric-ai-poc): In the working PoC, each unique valuation vector $\vec{v}(t)$ corresponds to a leaf whose path from root is determined by the sequence of valuation component values. Tokens with identical valuation vectors map to the same leaf. The leaf activation $a_\ell$ is the aggregated attention weight from all tokens whose valuation vectors map to leaf $\ell$. This ensures that the leaf activation vector $\mathbf{a}$ inherits the ultrametric structure of the tree — similar tokens produce activations at nearby leaves.
Encoding properties:
- Ultrametric: Distance between any two encodings satisfies the strong triangle inequality
- Hierarchical: Similar tokens (sharing semantic primes) produce activations at nearby leaves
- Discrete: The encoding space is a finite set of $L$ leaves, not a continuous manifold
- Interpretable: Each dimension of $\vec{v}(t)$ is a named semantic category; each leaf corresponds to a specific combination of categories
3.3 Ultrametric Attention
[ARCHIVE: ultrametric-ai-poc]
Attention weights on the Bruhat–Tits tree are computed from ultrametric distances between token encodings. No learned parameters — the attention pattern emerges purely from tree geometry.
Algorithm:
Input: Sequence of tokens $\{t_1, \ldots, t_n\}$, temperature $T > 0$
Output: Attention matrix $A \in \mathbb{R}^{n \times n}$, attended representations $\vec{v}^{\text{out}}_i$
- Encode tokens: $\vec{v}_i = \vec{v}(t_i)$ for $i = 1, \ldots, n$
- Compute pairwise ultrametric distances:
- Compute attention weights (exponential decay):
- Compute attended output:
Properties:
| Property | Standard Attention (Dot-Product) | Ultrametric Attention |
|---|---|---|
| :--------- | :---------------------------------- | :---------------------- |
| Parameter count | $O(d^2)$ (query, key, value matrices) | $0$ (no learned parameters) |
| Similarity | $\langle q_i, k_j \rangle / \sqrt{d}$ | $\exp(-d_{\text{ultra}}(t_i, t_j) / T)$ |
| Interpretability | Opaque — learned projections | Transparent — distance = LCA depth |
| Gauge invariance | No — rotation changes attention | Yes — ultrametric distance is invariant under tree automorphisms |
| Auditability | No traceable explanation | Every weight traces to a specific LCA |
Cocycle condition: For any three tokens $t_i, t_j, t_k$, the ultrametric distances must satisfy:
This is computationally verified at inference time. Violations indicate inconsistency in the cognitive representation [ARCHIVE: ultrametric-ai-poc].
Temperature interpretation: The temperature $T$ controls attention sharpness:
- $T \to 0$: Hard attention — only tokens at identical leaves attend to each other
- $T \to \infty$: Uniform attention — all tokens attend equally
- $T \approx 1$: Soft attention — tokens at nearby leaves (deep LCA) attend strongly
3.4 Output Decoding — Decision Paths
A Q-PNA output is not a probability distribution over class labels. It is a decision path — a traceable sequence of nodes from root to leaf:
where $v_0$ is the root, each $v_i$ is a node at depth $i$, and $c_i \in \{0, 1, \ldots, p\}$ is the child index followed.
Decoding procedure:
- Start at root $v_0$ with the attended representation of the input
- At each internal node $v$, compute the activation of each child $c$ as the sum of attention-weighted valuations in that subtree
- Select the child with maximal activation (greedy) or sample proportionally (stochastic, with temperature)
- Recurse until a leaf is reached
- The leaf’s associated label/action is the output. The path from root to leaf IS the audit trail.
The decision path IS the explanation. No post-hoc interpretability method (LIME, SHAP) is needed. Every decision is a geometric path whose meaning is grounded in the tree structure — the shared LCA relationships with training examples, the semantic primes activated at each branch, and the token history that produced the leaf activation.
From attention to leaf activations: The attended valuation vectors $\vec{v}^{\text{out}}_i$ produced by ultrametric attention (§3.3) are aggregated into a leaf activation vector $\mathbf{a} \in [0,1]^L$ as follows: each valuation vector $\vec{v}^{\text{out}}_i$ is mapped to its corresponding leaf $\ell(i)$ in $\mathcal{T}_p$, and the leaf activation is the sum of attention weights directed to that leaf: $a_\ell = \sum_{i: \ell(i) = \ell} \sum_j A_{ji}$. This aggregated leaf activation vector is what the cophenetic loss (§4) compares against the target encoding $\mathbf{a}^*$.
4. Loss Function on Ultrametric Trees
4.1 Cophenetic Loss
[QWAV-INTERNAL: Q-PNA v0.1 specification]
For a training example with input encoding $\mathbf{a} = (a_1, \ldots, a_L)$ (leaf activations) and target encoding $\mathbf{a}^ = (a_1^, \ldots, a_L^*)$, the expected cophenetic distance is:
where $d_C(i, j) = h(\operatorname{lca}(\text{leaf}_i, \text{leaf}_j))$ is the cophenetic distance between leaves $i$ and $j$.
Cophenetic loss:
Intuition: If the predicted activation and target activation are concentrated on leaves that share a deep LCA (close in the tree), the loss is small. If they’re on distant branches (shallow LCA), the loss is large. This is naturally hierarchical — errors at coarser levels of the hierarchy are penalized more heavily.
4.2 Multi-Resolution Decomposition
The cophenetic loss can be decomposed by tree level:
where $w_\ell = 2^{-\ell}$ (emphasizing coarser levels) and $\mathcal{L}_\ell$ measures error at depth $\ell$:
with $\mathbf{a}_v = \sum_{i \in \text{leaves}(v)} a_i$ being the aggregated activation in the subtree rooted at $v$.
4.3 Properties
| Property | Cross-Entropy (Standard) | Cophenetic Loss (Q-PNA) |
|---|---|---|
| :--------- | :-------------------------- | :------------------------- |
| Space | Continuous $\mathbb{R}^n$ | Discrete tree $\mathcal{T}_p$ |
| Hierarchy awareness | None — all output dimensions are independent | Errors weighted by tree distance — hierarchical structure respected |
| Decomposability | Single scalar | Decomposable by tree level (multi-resolution) |
| Formal verifiability | Approximate (gradients) | Exact — the error at each node has geometric meaning |
| Triadic rigidity | Not applicable | Inherent: for any three examples, two loss pairings are equal |
[LLM-INFERRED] The cophenetic loss is defined but has not been computationally validated as a training objective. Convergence properties, relationship to classification accuracy, and comparison to cross-entropy on standard benchmarks are open questions. The definition is provided as the specification — validation is Phase 0 below.
5. Optimization: Tree-Walk Algorithm
5.1 Why Gradient Descent Doesn’t Apply
Gradient descent requires a differentiable manifold. The Bruhat–Tits tree is a discrete metric space — derivatives are not defined. Q-PNA replaces gradient-based optimization with tree-walk optimization: a discrete search algorithm that redistributes edge weights by decomposing error through the tree hierarchy.
[QWAV-INTERNAL: Q-PNA v0.1 specification]
5.2 Algorithm
Input: Training example $(x, y)$, current tree with edge weights $W = \{w_{v \to c}\}$ for all internal nodes $v$ and children $c$
Output: Updated edge weights $W’$, leaf activations $\mathbf{a}’$
Hyperparameters: Learning rate $\eta > 0$, convergence threshold $\varepsilon > 0$, temperature $T > 0$
Algorithm (one training step):
1. FORWARD PASS:
a. Encode input x $\to$ leaf activations a
b. For each internal node v (bottom-up):
For each child c of v:
Aggregate activation: A_c = sum(attended activations in subtree(c))
Normalize across children: A_c = A_c / sum(A_*)
c. Decode output: greedy path from root $\to$ leaf $\to$ output activations a^out
2. COMPUTE LOSS:
L = L_coph(a^out, a^*) # cophenetic distance to target
3. ERROR DECOMPOSITION (top-down):
For each internal node v at depth ℓ:
E_v = sum_{i in leaves(v)} |a^out_i - a^*_i|
# This is the total error originating in subtree v
4. EDGE WEIGHT UPDATE (discrete gradient analog):
For each internal node v:
E_bar_v = mean(E_c for each child c of v)
For each child c of v:
w_{v$\to$c} ← w_{v$\to$c} - η · (E_c - E_bar_v)
# Redistribute weight away from high-error subtrees
# Normalize: sum(w_{v$\to$*}) = 1
5. LEAF ACTIVATION REFINEMENT:
For each leaf i where |a^out_i - a^*_i| > τ:
a_i ← a_i + η · (a^*_i - a_i)
# Move leaf activation toward target for large errors
6. REPEAT steps 1–5 for T iterations or until L < ε
5.3 Relationship to Backpropagation
| Backpropagation | Tree-Walk Optimization |
|---|---|
| :---------------- | :---------------------- |
| Gradient $\partial \mathcal{L} / \partial w$ | Subtree error $E_v$ |
| Chain rule through layers | Tree-walk from root to leaves (top-down decomposition) |
| Continuous weight update | Discrete weight redistribution |
| Vanishing gradients in deep networks | Error naturally attenuates with tree depth (strong triangle inequality: errors at deep nodes are contained within their subtree) |
| Requires differentiable activations | Works on any activation — only distances and aggregations needed |
5.4 Convergence Conditions
[LLM-INFERRED] The following convergence conditions are hypothesized but not proven:
- Monotonicity: $\mathcal{L}_{\text{coph}}$ should decrease monotonically if $\eta$ is sufficiently small. The error propagation through the tree via $E_v$ is conservative (total error is preserved, only redistributed). This is the discrete analog of the gradient being a descent direction.
- Learning rate bound: $\eta$ must be bounded by the minimum edge weight divided by the maximum subtree error difference. Too large $\eta$ causes weight oscillation (weights flip between extreme values).
- Tree depth stability: Deeper trees ($D > 10$) may suffer from error dilution — the error signal at deep nodes becomes small relative to the total error, slowing convergence. This is analogous to vanishing gradients but structurally different: the error is diluted spatially (across many subtrees) rather than multiplicatively.
- Convexity analog: The cophenetic loss is not convex in the standard sense (the domain is discrete). However, the loss landscape on the tree may exhibit a property analogous to geodesic convexity in ultrametric spaces.
5.5 Computational Complexity
For a truncated tree $\mathcal{T}_p$ of depth $D$ with $L = (p+1) \cdot p^{D-1}$ leaves:
| Operation | Complexity | Notes |
|---|---|---|
| :---------- | :----------- | :------ |
| Encoding (per token) | $O(k \cdot \log M)$ | $k$ semantic primes, $M$ max product value |
| Ultrametric attention ($n$ tokens) | $O(n^2 \cdot k)$ | Pairwise distances, $k$ valuation dimensions |
| Forward pass (tree propagation) | $O(L)$ | Each leaf aggregated once, bottom-up |
| Error decomposition | $O(L)$ | Each node visited once, top-down |
| Edge weight update | $O(L)$ | Each edge updated once |
| Per training step | $O(n^2 k + L)$ | Quadratic in sequence length, linear in tree size |
| Standard transformer (comparison) | $O(n^2 d)$ | Quadratic in sequence length, $d$ = embedding dimension |
For near-term feasibility: $p=3$, $D=5$ gives 324 leaves — computationally tractable on a laptop. $D=10$ gives ~80,000 leaves — still feasible for batch processing. The dominant cost is not the tree but the $O(n^2 k)$ pairwise attention.
5.6 Quantum Walk Speedup
[LLM-INFERRED — aspirational, not validated]
When implemented on quantum hardware, the forward pass benefits from quantum walk speedup: $O(\sqrt{n})$ time to traverse the tree vs. $O(n)$ classically. This is the primary quantum advantage — not in the optimization step, but in the inference (forward pass) step. The quantum walk is a continuous-time quantum walk on the tree graph, which can reach target leaves quadratically faster than classical random walks.
Honest note: This depends on fault-tolerant quantum hardware not currently available. The classical simulation path (tree-walk on a CPU/GPU) is the near-term implementation strategy.
6. Syntactic Token Calculus
6.1 Purpose
[ARCHIVE: Syntactic Token Calculus v2/v3]
The Syntactic Token Calculus (STC) is a formal framework for reasoning about computations on the Bruhat–Tits tree. It provides:
- Deterministic confluence: Every computation has a unique, traceable path through the tree
- Formal verification: Decisions can be proven correct (or incorrect) by examining the token transformations along the path
- Elimination of singularities: By treating reality as discrete topological enclosures (tokens), the calculus avoids singularities that plague continuous models
- Gauge-invariant meaning: Cross-ratio invariants define meaning independent of tree automorphisms
[ARCHIVE: Syntactic Token Calculus v3.1]
6.2 Token Definition
A token $\tau$ is a discrete topological enclosure — a labeled node in the Bruhat–Tits tree with associated data:
| Component | Type | Description |
|---|---|---|
| :---------- | :----- | :------------ |
| $v$ | $\mathcal{T}_p$ node | Position in the tree (depth, path from root) |
| $\text{type}$ | $\mathcal{T}$ (finite type system) | The token’s type — determines valid operations |
| $\text{data}$ | Discrete payload | The token’s information content (integer, string, or nested tokens) |
| $\text{parent}$ | Token reference or $\varnothing$ | The parent token — $\varnothing$ for root tokens |
6.3 Token Operations
All computation in STC is expressed as token operations on the tree:
| Operation | Symbol | Description | Type Constraint |
|---|---|---|---|
| :---------- | :------- | :------------ | :---------------- |
| Spawn | $\tau \to \{\tau_1, \ldots, \tau_{p+1}\}$ | A token splits into child tokens at depth $\ell+1$ | Children inherit parent type |
| Merge | $\{\tau_1, \ldots, \tau_{p+1}\} \to \tau$ | Child tokens combine into a parent token at depth $\ell$ | All children must have compatible types |
| Transform | $\tau \xrightarrow{f} \tau’$ | Token data is transformed by function $f$ | Type and position unchanged |
| Move | $\tau \xrightarrow{v \to v’} \tau’$ | Token moves to adjacent node (parent, child, or sibling) | Must respect tree adjacency |
| Annihilate | $\tau \to \varnothing$ | Token is removed | Parent reference must be updated |
Type compatibility for Merge: Two types $T_1$ and $T_2$ are compatible if they share a common supertype in the type lattice. The merged token receives the least upper bound type: $\operatorname{lub}(T_1, T_2)$.
6.4 Computation as Token History
A computation is a token history — a directed acyclic graph (DAG) of token operations, where each edge represents an operation and each node represents a token state:
Properties:
- Causality: Every operation has well-defined inputs and outputs
- Traceability: Every output token can be traced back to its originating input tokens
- Deterministic confluence: For any two paths through $H$ that reach the same token, the resulting token state is identical
6.5 Verification Protocol
To verify that output $y$ is correct for input $x$:
- Reconstruct the token history $H$ from the Q-PNA’s decision trace
- Check type consistency: Every token operation respects the type system — spawn preserves types, merge requires compatible types, transform preserves type
- Check path validity: Every Move operation respects the tree structure — tokens may only move to parent, child, or sibling nodes
- Check confluence: For any two paths through $H$ that reach the same token, the resulting token state is identical (determinism)
- Check against specification: The final output token’s data matches the specification for input $x$
Result: If all checks pass, the computation is provably correct — the decision path IS the proof.
6.6 Relationship to Distinction Calculus
[ARCHIVE: Auditable Attention PoC]
The STC operations can be expressed as Spencer-Brown distinction calculus operations:
| STC Operation | Distinction Calculus | Meaning |
|---|---|---|
| :-------------- | :--------------------- | :-------- |
| Spawn | $\tau \to \{\tau_1, \ldots, \tau_{p+1}\}$ | Enclose: $\tau \to [\tau_1 \tau_2 \ldots \tau_{p+1}]$ |
| Merge | $\{\tau_1, \ldots, \tau_{p+1}\} \to \tau$ | Cross: $[[\tau_1 \ldots \tau_{p+1}]] \to \text{reduced form}$ |
| Transform | $\tau \xrightarrow{f} \tau’$ | Calling: repeated marks condense, or a function application |
| Annihilate | $\tau \to \varnothing$ | Void: the token returns to the unmarked state |
| Move | $\tau \xrightarrow{v \to v’} \tau’$ | Re-entry: the token appears at a different structural position |
This connection means that every STC computation has a purely syntactic interpretation — it can be expressed without real numbers, only marks and enclosures. The distinction calculus is the formal verification substrate.
7. Glass-Box Verification Protocol
7.1 Glass-Box Definition
A Q-PNA model is glass-box (as opposed to black-box) if it satisfies all five conditions:
- Path traceability: Every output includes the full decision path from root to leaf
- Operation auditability: Every token operation along the path is logged and inspectable
- Type consistency: All operations respect the STC type system
- Deterministic confluence: The same input always produces the same decision path (up to stochastic sampling, where the sampling distribution is logged)
- Cocycle satisfaction: The strong triangle inequality holds for all token triplets encountered
7.2 Audit Trail Format
Every Q-PNA decision produces an audit trail:
Decision: [output label]
Path: root $\to$ node[2] $\to$ node[2,1] $\to$ node[2,1,3] $\to$ leaf[47]
Token operations:
[spawn] root $\to$ {τ_1, τ_2, τ_3, τ_4}
[transform] τ_2 $\to$ f_embed(τ_2)
[move] τ_2 $\to$ child[2,1]
[merge] {τ_2, τ_aux} $\to$ τ_out
[transform] τ_out $\to$ classifier(τ_out) = [output label]
Tree context:
LCA(τ_2, τ_aux) = node[2] at depth 2
d_ultra(τ_2, τ_aux) = 2
Verification:
type-consistent ✓ | path-valid ✓ | confluent ✓ | cocycle ✓
This audit trail is human-readable (a regulator can understand the decision) AND machine-verifiable (automated checks pass/fail each condition).
7.3 Comparison: Black-Box vs. Glass-Box
| Property | Black-Box AI (Standard DNN) | Glass-Box AI (Q-PNA) |
|---|---|---|
| :--------- | :--------------------------- | :---------------------- |
| Decision path | Opaque composition of nonlinearities | Geometric path root $\to$ leaf |
| Audit trail | Approximate (LIME, SHAP, attention viz) | Exact — the path is the explanation |
| Formal verification | Impossible in general | Possible via token calculus type checking |
| Regulatory compliance | Requires external auditing | Self-documenting by construction |
| Error attribution | Statistical (which input features?) | Structural (which subtree?) |
| Bias detection | Post-hoc statistical analysis | Pre-computation: biased paths visible in tree structure |
| Adversarial robustness | Fragile to $\varepsilon$-perturbations | Gauge-invariant — no continuous corridors to exploit [ARCHIVE: Auditable Attention PoC v0.2.1] |
7.4 Cocycle Condition as Consistency Check
[ARCHIVE: ultrametric-ai-poc]
The cocycle condition is a runtime verification that the cognitive representation is internally consistent. For any three tokens $t_i, t_j, t_k$:
If this fails for any triple, the representation contains an inconsistency — the tokens cannot be simultaneously embedded in an ultrametric space. The violation is logged and flagged for investigation.
Interpretation: The cocycle condition is to cognitive consistency what conservation laws are to physics — a structural invariant that must hold. A violation indicates that the semantic prime assignment or the token encoding has produced a contradictory representation.
8. Relationship to Quantum Architecture
8.1 Shared Mathematical Foundation
[QWAV-INTERNAL: Technical Deep-Dive §4]
Both Q-PNA (AI) and UQC (Quantum Computing) operate on the same mathematical structure — the Bruhat–Tits tree $\mathcal{T}_p$ — but exploit different properties:
| Property | UQC (Quantum) | Q-PNA (AI) |
|---|---|---|
| :--------- | :-------------- | :----------- |
| Primary use of tree | Error confinement via strong triangle inequality | Hierarchical feature organization |
| Encoding | $q$-ary scatter across leaves | $p$-adic valuation vectors |
| Propagation | Holographic tensor network (perfect tensors) | Classical tree-walk (quantum walk optionally) |
| Key mechanism | Passive fault tolerance — no active QEC needed | Glass-box explainability by construction |
| Verification | Logical error rate (LER) simulation | Token calculus formal verification |
| Hardware | 40-atom neutral atom at 4 K (validated computationally) | Room temperature (classical simulation) |
8.2 The Common Thesis
> Continuous manifolds are the wrong mathematical foundation for computation. Ultrametric (tree-based) geometry provides structural properties — error confinement for quantum, hierarchical transparency for AI — that are unavailable in Archimedean spaces.
8.3 Complementary Roles
- Quantum side provides the hardware pathway: 40-atom neutral atom implementation at 4 K, passive fault tolerance, $q$-ary scatter amplification, with computational validation published (Tier 0 + Tier 1 papers, DOI: 10.5281/zenodo.20134944, 10.5281/zenodo.20208437)
[EXTERNAL-SOURCE] - AI side provides the software pathway: glass-box decision-making, formal verifiability, token calculus for regulatory compliance — specified here, not yet validated
- Together: A complete computing architecture — hardware AND software — built on a single mathematical correction to the Archimedean assumption
9. Computational Validation Plan
[LLM-INFERRED] The following phases are specified but not yet executed. This is a research roadmap, not a progress report.
9.1 Phase 0: Tree-Walk Optimization Simulation
Goal: Demonstrate that tree-walk optimization converges on a toy hierarchical classification task.
Setup:
- Tree: $p=3$, depth $D=5$ (324 leaves)
- Task: Synthetic hierarchical classification (e.g., 3-level taxonomy with 27 classes)
- Encoding: $p$-adic valuation vectors with $k=5$ semantic primes
- Metrics: Cophenetic loss over training epochs, classification accuracy, audit trail completeness
Success criterion: Loss decreases monotonically over 100 epochs on synthetic data. Classification accuracy exceeds random baseline by significant margin.
9.2 Phase 1: Token Calculus Implementation
Goal: Implement the STC type system and verification protocol in Python.
Setup:
- Define token types, operations, and verification rules
- Generate token histories from simulated Q-PNA decisions
- Verify type consistency, path validity, and confluence
Success criterion: 100% of valid token histories pass verification. 100% of intentionally corrupted histories (type mismatch, invalid move, non-confluent paths) fail verification.
9.3 Phase 2: Integration with Quantum Simulation
Goal: Demonstrate that the same tree topology used for quantum error confinement can encode hierarchical features for AI classification — proving the shared-geometry thesis.
Setup:
- Use the
ultrametric_v2codebase’s Bruhat–Tits tree implementation[QWAV-INTERNAL: ultrametric_v2 codebase] - Encode a hierarchical dataset on the SAME tree used for quantum error simulation ($p=3$, $D=7$)
- Run both quantum error simulation AND AI classification on the same tree
Success criterion: The tree simultaneously suppresses quantum errors AND organizes hierarchical features. (This is the strongest form of the common thesis.)
9.4 Phase 3: Comparative Benchmarking
Goal: Compare Q-PNA against standard architectures on glass-box metrics.
Metrics:
- Audit trail completeness (% of decisions with full trace)
- Verification success rate (% of outputs provably correct via STC)
- Decision path length vs. model depth
- Training convergence (cophenetic loss vs. cross-entropy)
- Adversarial robustness (accuracy under perturbation, compared to equivalent-capacity MLP)
10. Honest Limitations & Open Questions
10.1 What This Specification Does NOT Claim
- ❌ Q-PNA has not been implemented or computationally validated as a full architecture
- ❌ No empirical results exist for tree-walk optimization convergence
- ❌ Quantum walk speedup assumes fault-tolerant quantum hardware not yet available
- ❌ Token calculus has not been formalized in a proof assistant (Lean/Coq)
- ❌ The AI side is less developed than the quantum side — this specification is a starting point, not a finished product
- ❌ No comparison to transformers, GNNs, or other architectures on standard benchmarks
10.2 What IS Validated
- ✅ Ultrametric attention with $p$-adic valuation encoding: demonstrated in working Streamlit app (
ultrametric-ai-poc) - ✅ Cophenetic distance ultrametric inequality: proven and computationally verified (
Tree Distance Cophenetic.md) - ✅ Syntactic token calculus: book-length formal treatment exists (STC v2, v3)
- ✅ Distinction calculus primitives: formalized (Spencer-Brown 1969, STC 0.1.md)
- ✅ Cocycle verification: implemented and tested (
cocycle.py) - ✅ Quantum side: two published papers with computational validation of error confinement
10.3 Open Research Questions
- Convergence guarantees: Under what conditions does tree-walk optimization converge? Is there a discrete analog of the universal approximation theorem?
- Scalability: What is the computational complexity of tree-walk optimization as a function of tree depth $D$ and branching factor $p+1$? Can it scale to ImageNet-size problems?
- Quantum advantage threshold: At what tree size does quantum walk speedup become practically significant? What is the crossover point vs. classical simulation?
- Expressiveness: What function classes can Q-PNA represent? Is the tree structure a limitation or a feature?
- Learned hierarchies: For unstructured data (images, raw audio), how should the semantic prime assignment be learned? Is hierarchical clustering sufficient, or is end-to-end learning required?
- Token calculus completeness: Is the STC type system expressive enough for general computation, or is it restricted to a specific class of problems?
- Hybrid architectures: Can Q-PNA be combined with standard neural components? E.g., CNN for feature extraction $\to$ Q-PNA for classification with glass-box audit trail?
- Distinction calculus bridge: The five open questions from
Ultrametric Intelligence/0.1.mdremain open — the formal bridge between Spencer-Brown primitives and ultrametric attention has not been completed.
- Batch semantics: The STC assumes single-token operations. Batch computation semantics for training efficiency have not been defined.
- Optimal semantic prime selection: The current semantic prime set $\{2, 3, 5, 7, 11\}$ is heuristic. Is there a principled method for selecting semantic primes from data?
10.4 What Closes These Questions
- Computational validation (Phases 0–3 in §9)
- Formal analysis of the tree-walk optimization algorithm
- Comparative benchmarking against standard architectures
- Reader review and critique from the AI/ML community
- Formalization of the token calculus in Lean 4 or Coq
11. Comparison to Standard Architectures
11.1 Transformers
| Dimension | Transformer | Q-PNA |
|---|---|---|
| :---------- | :------------ | :------ |
| Representation space | $\mathbb{R}^d$ (continuous) | Bruhat–Tits tree $\mathcal{T}_p$ (discrete) |
| Attention | Learned $Q, K, V$ projections ($O(d^2)$ params) | Ultrametric distance decay ($0$ learned params) |
| Position encoding | Sinusoidal or learned | Inherent in tree position |
| Interpretability | Attention visualization (post-hoc) | Decision path (inherent) |
| Training | Gradient descent via backprop | Tree-walk optimization (discrete) |
| Adversarial robustness | Fragile | Gauge-invariant embedding |
| Scaling | Demonstrated at 100B+ params | Not yet demonstrated |
11.2 Graph Neural Networks (GNNs)
| Dimension | GNN | Q-PNA |
|---|---|---|
| :---------- | :---- | :------ |
| Graph structure | Input-dependent (arbitrary graph) | Fixed tree $\mathcal{T}_p$ |
| Message passing | Learned aggregations over neighbors | Ultrametric attention over tree paths |
| Hierarchy | Must be learned or provided | Inherent in tree structure |
| Explainability | GNNExplainer, post-hoc | Decision path IS the explanation |
| Formal verification | Not available | Token calculus verification protocol |
11.3 Decision Trees / Random Forests
| Dimension | Decision Tree | Q-PNA |
|---|---|---|
| :---------- | :-------------- | :------ |
| Tree structure | Learned from data (greedy splits) | Fixed Bruhat–Tits tree (mathematically structured) |
| Decision | Single path (hard) | Soft path (attention-weighted) |
| Interpretability | High (if shallow) — path is readable | High — path is readable + formally verifiable |
| Expressiveness | Limited (axis-aligned splits) | Potentially richer (ultrametric combinations) |
| Training | Greedy recursive partitioning | Tree-walk optimization (global error redistribution) |
11.4 Where Q-PNA Could Excel
Q-PNA is not proposed as a universal replacement for all neural architectures. Its advantages are specific:
- Regulated domains (healthcare, finance, law): Where decisions must be explainable and auditable. The glass-box audit trail meets regulatory requirements that black-box models cannot.
- Hierarchical data (taxonomies, ontologies, biological classification): Where the tree structure naturally matches the data structure. The cophenetic loss directly optimizes hierarchical accuracy.
- Adversarial environments (security, fraud detection): Where gauge invariance provides inherent robustness to perturbation-based attacks.
- Formal verification requirements (safety-critical systems): Where the token calculus enables machine-verifiable correctness proofs.
- Quantum-classical hybrid systems (future): Where the shared tree geometry with QWAV’s quantum architecture enables a unified computing stack.
12. Prior Work Attribution
This specification synthesizes the following prior work:
| Work | Location | Contribution |
|---|---|---|
| :--------------------------------------- | :----------------------------------------------- | :-------------------------------------------------------------------------------------------------------------------------------- |
| Ultrametric AI PoC | [ARCHIVE] ultrametric-ai-poc | Working demonstration: $p$-adic valuation encoding, ultrametric attention, cocycle verification, distinction calculus integration |
| Auditable Attention PoC | [ARCHIVE] Auditable Attention PoC | Technical narrative: gauge-invariance framing, STC v3.1 foundations, Spencer-Brown primitives |
| Ultrametric Intelligence | [ARCHIVE] Ultrametric Intelligence | Synthesis: non-Archimedean geometry + AI. Five open questions bridging distinction calculus and ultrametric attention |
| STC v2/v3 | [ARCHIVE] Syntactic Token Calculus v2/v3 | Book-length formal treatment of token calculus — type system, operations, verification |
| Tree Distance Cophenetic | [DOI: 10.5281/zenodo.20213043] | Mathematical foundation: cophenetic distance, triadic rigidity, ultrametric inequality proofs |
| Language as Information Architecture | [DOI: 10.5281/zenodo.20137616] | Empirical linguistics: token encoding theory, mutual exclusion |
| PANN | [ARCHIVE] PANN (github.com/rwnq8/PANN) | Prior art: $p$-adic attention in PyTorch, hierarchical loss functions, topological regularization |
| QWAV Quantum Papers | Zenodo | Quantum side validation: error confinement at $p=3$, $D=7$, zero logical errors |
| QWAV v0.1 Spec | [QWAV-INTERNAL] Q-PNA v0.1 | Original Q-PNA specification — starting point for this document |
Data Availability
All archival project files referenced in this specification are publicly available through Google Drive:
All code, specifications, and computational results for this project are available at the public repository: github.com/QNFO/Q-PNA
This archive includes: ultrametric-ai-poc (working Streamlit demonstration), Proof-of-Concept for Auditable Attention using Ultrametric Tree Distances, Ultrametric Intelligence, Syntactic Token Calculus v2 and v3, Formal Ontology of Distinction and Invariance, and related projects. PANN is available at github.com/rwnq8/PANN.
Published works with DOIs are available through Zenodo at the links provided. QWAV internal working documents (QWAV-INTERNAL) are available upon request.
References
- Quni-Gudzinas, R. B. (2026). The Tree Distance Cophenetic: A Unified Framework for Hierarchical Ontology. Zenodo. DOI: 10.5281/zenodo.20213043
- Quni-Gudzinas, R. B. (2026). Language as Information Architecture. Zenodo. DOI: 10.5281/zenodo.20137616
- Quni-Gudzinas, R. B. (2026). Computational Validation of Ultrametric Error Confinement in Bruhat–Tits Tree Quantum Circuits. Zenodo. DOI: 10.5281/zenodo.20134944
- Quni-Gudzinas, R. B. (2026). Symmetric Extension of Ultrametric Error Confinement. Zenodo. DOI: 10.5281/zenodo.20208437
- Spencer-Brown, G. (1969). Laws of Form. George Allen and Unwin.
- [ARCHIVE] ultrametric-ai-poc — Ultrametric AI Proof of Concept. Public archive: Google Drive
- [ARCHIVE] Proof-of-Concept for Auditable Attention using Ultrametric Tree Distances. Public archive.
- [ARCHIVE] Ultrametric Intelligence — Synthesis of Non-Archimedean Geometry and AI. Public archive.
- [ARCHIVE] Syntactic Token Calculus v2 and v3. Public archive.
- [ARCHIVE] PANN — Prime-Attentive Neural Networks. github.com/rwnq8/PANN
- [QWAV-INTERNAL] Q-PNA v0.1 Specification. Available upon request.
- [QWAV-INTERNAL] Technical Deep-Dive — Ultrametric Quantum Computing and AI. Available upon request.