Derivation of the Free Energy Principle from UHM
This document proves the connection between the categorical definition of φ and the variational principle, and derives Friston's free energy principle (FEP) as the classical limit of UHM. Status of results: Theorem 3.1: [Т] (primitivity of the linear part proven). Theorem 4.1: [Т] (classical limit). Theorem 4.2: [Т] (identification of generative model = definition of self-reference). Theorem 4.3: [Т] (complete reduction of the UHM functional to Friston's free energy). Theorem 5.1: [Т].
1. Problem Statement
1.1 Two definitions of φ
In UHM, the self-modeling operator φ has two representations:
Canonical definition (categorical):
φ is defined as the left adjoint to the canonical inclusion of subobjects.
Variational characterization: φ = argmin [S_spec + D_KL] (stated precisely in Theorem 3.1).
1.2 Questions this document answers
| Question | Status |
|---|---|
| Proof of equivalence of the two definitions | Theorem 3.1 [Т] (upgraded from [С]) |
| Classical limit of the variational principle | Theorem 4.1 [Т] |
| Connection to Friston's FEP | Theorem 4.2 [Т] (closed: generative model = definition of self-reference) |
| Complete reduction of the UHM functional to Friston's free energy | Theorem 4.3 [Т] |
| Justification of S_spec vs S_vN | Theorem 5.1 [Т] |
2. Categorical Foundations
2.1 ∞-topos structure
Let be the ∞-topos of sheaves over the category of density matrices with the Grothendieck topology .
Key elements:
- Ω: subobject classifier
- Characteristic morphism: χ_S : Γ → Ω for each subobject S ↪ Γ
2.2 Subobject category Sub(Γ)
Definition 2.1. Sub(Γ) is the category whose objects are monomorphisms S ↪ Γ and whose morphisms are commutative triangles over Γ.
In the context of UHM, Sub(Γ) is interpreted as the category of logically consistent states: those satisfying the internal logic of the topos.
Key property: In the ∞-topos, Sub(Γ) is a lattice with greatest element Γ (the identity subobject) and a least element (the initial subobject).
2.3 Operator φ as left adjoint
Definition 2.2 (Canonical definition of φ).
φ is defined as the left adjoint to the canonical inclusion i: φ ⊣ i.
Universal property: For all and :
Interpretation: φ(Γ) is the best (minimal) logically consistent approximation of Γ.
2.4 φ as co-reflector
From the theory of adjoint functors it follows that:
Lemma 2.1. φ is a co-reflector:
where the colimit is taken over the diagram of all subobjects .
3. Theorem on Variational Characterization
3.1 Preliminary definitions
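Definition 3.1 (Spectral entropy).
For a density matrix ρ with eigenvalues λ_i:

$$S_{\text{spec}}(\rho) = -\sum_i \lambda_i \ln \lambda_i$$

Remark: In this context S_spec = S_vN for density matrices. The distinction arises only for non-Hermitian operators (see §5).
Definition 3.2 (Quantum KL-divergence).
For density matrices ρ, σ with supp(ρ) ⊆ supp(σ):

$$D_{KL}(\rho \,\|\, \sigma) = \mathrm{Tr}\left[\rho\,(\ln \rho - \ln \sigma)\right]$$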
Definition 3.3 (Variational functional).
where φ is a CPTP (completely positive trace-preserving) channel.
3.2 Central theorem
Theorem 3.1. Let φ be defined categorically as the left adjoint to the inclusion i (Definition 2.2).
Then:
where is the invariant measure on the state space (uniqueness of is guaranteed by primitivity of the linear part [Т]).
3.3 Proof
Step 1: Connection of φ with the logical Liouvillian.
From the theorem on stationary distribution:

$$\varphi(\Gamma) = \lim_{t \to \infty} e^{t\mathcal{L}_\Omega}[\Gamma]$$

where ℒ_Ω is the logical Liouvillian.
Step 2: Dissipative structure of ℒ_Ω.
ℒ_Ω has Lindblad form:
where L_k are operators derived from the classifier atoms (see Axiom Ω⁷ §classifier-atoms).
Step 3: Connection of dissipation with entropy.
For Lindblad evolution ρ_t with stationary state ρ* the following holds (Spohn, 1978):

$$\frac{d}{dt} D_{KL}(\rho_t \,\|\, \rho^*) \le 0$$

with equality at the stationary state.
Step 4: Variational formulation of stationarity.
The stationary state is characterized by the condition:
This is equivalent to the minimum of entropy production:
where is the entropy production function.
Step 5: Explicit form of the functional.
For a CPTP channel , the entropy production function has the form (Lindblad, 1975):
With the choice Γ_ref = Γ (the initial state as reference):
The identification Γ_ref = Γ is a motivated definition (self-referential minimization), not a derivation from ℒ_Ω. The motivation is autopoiesis (A1): the system must minimize the difference between itself and its model, which corresponds to Γ_ref = Γ. Alternative choices (Γ_ref = I/7 or Γ_ref = ρ*) give different functionals; Γ_ref = Γ is the only choice for which the minimum of F coincides with the fixed point of φ (theorem).
Since does not depend on :
Step 6: Identification.
Taking into account S_spec = S_vN for density matrices (Theorem 5.1):
3.4 Remarks on the proof
Remarks:
- Existence and uniqueness of the invariant measure are guaranteed by primitivity of the linear part [Т] (Evans 1977, Spohn 1976)
- The equality S_spec = S_vN holds only for normal operators (Theorem 5.1 [Т])
Categorical correctness:
- Steps 1–2 follow from L-unification
- Steps 3–5 use standard open quantum systems theory
- The identification in Step 6 establishes the desired equivalence
4. Classical Limit: Complete Derivation of FEP
In this section we show rigorously that Friston's FEP is a special case of the UHM variational principle, arising in the classical limit. The derivation has three stages: (i) definition of the classical limit as γ_ij → 0 for i ≠ j (decoherence), (ii) reduction of the quantum functional to the classical one, (iii) identification of UHM elements with Friston's constructions. The section concludes with an analysis of what is lost in the classical limit.
4.1 Friston's FEP (original formulation)
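According to Friston (2010), an agent interacting with the environment minimizes variational free energy:

$$F[q] = \mathbb{E}_{q(s)}\left[E(s, o)\right] - H[q]$$

where:
- s: hidden (latent) states of the world
- o: observations (sensory data)
- q(s): recognition density, the agent's internal model
- p(s, o): generative density, the joint model
- E(s, o) = −ln p(s, o): energy of the generative model
- H[q] = −∫ q(s) ln q(s) ds: Shannon entropy of the recognition density
Key inequality (evidence lower bound, ELBO):

$$F[q] \ge -\ln p(o)$$

Proof: F[q] = D_KL(q(s) ‖ p(s | o)) − ln p(o), and D_KL ≥ 0.
Equivalent form via KL-divergence:

$$F[q] = D_{KL}\big(q(s) \,\|\, p(s \mid o)\big) - \ln p(o)$$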
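The identity behind the ELBO can be checked numerically. The sketch below is a minimal illustration, not part of the derivation: the three-state joint table `p_joint` and the trial density `q_guess` are hypothetical values, and the hidden state is treated as discrete.

```python
import math

# Hypothetical toy generative model: p(s, o) for hidden states s = 0, 1, 2
# at one fixed observation o. The numbers are illustrative assumptions.
p_joint = [0.10, 0.25, 0.05]
p_obs = sum(p_joint)                       # p(o) = sum_s p(s, o)
posterior = [x / p_obs for x in p_joint]   # p(s | o)


def free_energy(q):
    """F[q] = E_q[-ln p(s, o)] - H[q] for a discrete recognition density q."""
    energy = sum(qi * -math.log(pi) for qi, pi in zip(q, p_joint) if qi > 0)
    entropy = -sum(qi * math.log(qi) for qi in q if qi > 0)
    return energy - entropy


def kl(q, p):
    """Classical KL divergence D_KL(q || p)."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


surprise = -math.log(p_obs)                # -ln p(o)
q_guess = [0.5, 0.3, 0.2]                  # arbitrary trial recognition density
f_guess = free_energy(q_guess)
f_opt = free_energy(posterior)             # bound is tight at the true posterior

print(f_guess, kl(q_guess, posterior) + surprise, f_opt, surprise)
```

For any trial density the two expressions for F agree, and F touches the bound −ln p(o) only when q equals the posterior.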
4.2 Classical limit of UHM: formalization via
Definition 4.1 (Classical limit of UHM).
The classical limit of UHM involves two conditions. Only (a) serves as the definition; (b) does not imply (a) (Lemma 4.1):
(a) Decoherence of off-diagonal elements. The density matrix loses coherences: γ_ij → 0 for i ≠ j.
(b) Zero-reflection limit. The reflection measure tends to its minimum: R → 1/7.
Lemma 4.1 (Classical limit).
(a) Decoherence (γ_ij → 0 for i ≠ j) reduces the purity to P = Σ_i γ_ii² ≤ 1, and with it R = 1/(7P). R reaches its minimum 1/7 when a single diagonal element dominates (P → 1); for the equilibrium diagonal state Γ = I/7, P = 1/7 and R = 1.
(b) The converse is false: R = 1/7 ⟺ P = 1, which is also achieved by a pure coherent state |ψ⟩⟨ψ| with maximal coherences. The classical limit is therefore defined by condition (a), decoherence, and not through R.
Proof. (a) When γ_ij → 0 for i ≠ j, the purity reduces to P = Σ_i γ_ii². Off-diagonal coherences enter directly into the Gap operator (see Gap dynamics); at γ_ij = 0 all Gap elements vanish. For the equilibrium diagonal matrix Γ = I/7: P = 1/7, R = 1. For a single dominant γ_ii → 1: P → 1, R → 1/7.
(b) Counterexample: the pure state |ψ⟩ = (1/√7) Σ_i |i⟩ gives P = 1, R = 1/7, but has maximal coherences |γ_ij| = 1/7 for all i ≠ j. Therefore R = 1/7 is not equivalent to decoherence.
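The three cases in the proof can be reproduced with a few lines of arithmetic. This sketch assumes the relation R = 1/(7P) as it is read off Lemma 4.1 and takes the uniform superposition as the coherent counterexample; the helper names are illustrative, not UHM API.

```python
N = 7  # dimension of the holon state space


def purity(gamma):
    """P = Tr(Gamma^2) for a real symmetric matrix given as nested lists."""
    n = len(gamma)
    return sum(gamma[i][j] * gamma[j][i] for i in range(n) for j in range(n))


def reflection(gamma):
    """R = 1 / (N * P), the relation assumed from Lemma 4.1."""
    return 1.0 / (len(gamma) * purity(gamma))


# Pure uniform superposition: every entry 1/7, maximal coherences.
uniform_pure = [[1.0 / N] * N for _ in range(N)]
# Fully decohered equilibrium state I/7.
equilibrium = [[1.0 / N if i == j else 0.0 for j in range(N)] for i in range(N)]
# Fully decohered state with a single dominant diagonal element.
dominant = [[1.0 if i == j == 0 else 0.0 for j in range(N)] for i in range(N)]

for name, g in [("uniform_pure", uniform_pure),
                ("equilibrium", equilibrium),
                ("dominant", dominant)]:
    print(name, purity(g), reflection(g))
```

The coherent pure state and the decohered dominant state share the same P and R, which is why R alone cannot serve as the definition of the classical limit.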
Physical meaning. The classical limit is complete decoherence: the system loses all quantum correlations between dimensions. In terms of consciousness, the system is not integrated, not reflexive, and has no Gap structure. This is the world of purely classical probabilities.
(c) Restriction of the CPTP channel. In the classical limit, the CPTP channel preserves diagonality:
The channel degenerates into a stochastic matrix — a classical Markov transition.
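A minimal sketch of the degenerate channel, assuming the column-stochastic convention q_i = Σ_j T_ij p_j. The matrix `T` and the vector `p` are hypothetical, and a 3-state system stands in for the 7-dimensional one.

```python
def apply_channel(T, p):
    """q_i = sum_j T[i][j] * p[j] for a column-stochastic matrix T."""
    return [sum(T[i][j] * p[j] for j in range(len(p))) for i in range(len(T))]


def is_column_stochastic(T, tol=1e-12):
    """Non-negative entries and every column summing to 1."""
    n = len(T)
    nonneg = all(T[i][j] >= 0 for i in range(n) for j in range(n))
    cols = all(abs(sum(T[i][j] for i in range(n)) - 1.0) < tol for j in range(n))
    return nonneg and cols


# Illustrative transition matrix and diagonal of a decohered state.
T = [[0.9, 0.2, 0.0],
     [0.1, 0.7, 0.3],
     [0.0, 0.1, 0.7]]
p = [0.5, 0.3, 0.2]

q = apply_channel(T, p)
print(is_column_stochastic(T), q, sum(q))
```

Non-negativity and unit column sums are the classical remnant of complete positivity and trace preservation: the output is again a probability vector.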
4.3 Reduction of the quantum functional
Theorem 4.1 (Classical limit). In the classical limit (γ_ij → 0 for i ≠ j), the UHM variational functional (Theorem 3.1) reduces to the classical variational free energy:
Proof.
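Step 1 (Spectral entropy → Shannon entropy). For diagonal matrices, eigenvalues coincide with diagonal elements, so for Γ = diag(p):

$$S_{\text{spec}}(\Gamma) = -\sum_i p_i \ln p_i = H(p)$$

This is a direct consequence of Theorem 5.1 (S_spec = S_vN for density matrices).
Step 2 (Quantum KL → classical KL). For diagonal matrices ρ = diag(p) and σ = diag(q), the Umegaki quantum divergence reduces to the classical one:

$$D_{KL}(\rho \,\|\, \sigma) = \sum_i p_i \ln \frac{p_i}{q_i}$$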
Step 3 (Substitution). Combining steps 1 and 2:
Remark. The averaging in Theorem 3.1 reduces in the classical limit to the ordinary mathematical expectation over the stationary distribution of a Markov chain (uniqueness guaranteed by primitivity of ).
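The reduction of spectral entropy to Shannon entropy can be illustrated on a two-level toy system, where eigenvalues have a closed form. This is a sketch under stated assumptions: a real symmetric 2×2 density matrix replaces the 7-dimensional Γ, and the values of `a` and `b` are hypothetical.

```python
import math


def spectral_entropy_2x2(a, b):
    """S_spec of the real symmetric density matrix [[a, b], [b, 1 - a]]."""
    d = 1.0 - a
    # Eigenvalues of a unit-trace 2x2 symmetric matrix: (1 +/- root) / 2.
    root = math.sqrt((a - d) ** 2 + 4.0 * b * b)
    eigenvalues = [(1.0 + root) / 2.0, (1.0 - root) / 2.0]
    return -sum(lam * math.log(lam) for lam in eigenvalues if lam > 1e-15)


def shannon(p):
    """Shannon entropy H(p) in nats."""
    return -sum(x * math.log(x) for x in p if x > 0)


a = 0.7  # population of the first level (illustrative value)
# Shrinking coherence b models decoherence; b^2 <= a(1 - a) keeps the matrix positive.
entropies = {b: spectral_entropy_2x2(a, b) for b in (0.4, 0.2, 0.0)}
h_diag = shannon([a, 1.0 - a])

for b, s in entropies.items():
    print(b, s, h_diag)
```

With nonzero coherence the spectral entropy stays below the Shannon entropy of the diagonal; the two coincide once b = 0.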
4.4 Derivation of Friston's classical variational free energy
Now we perform the complete derivation of from the variational characterization of .
Step 1 (Identification of variables). Within UHM, introduce the identification:
| UHM | FEP (Friston) | Meaning |
|---|---|---|
| Γ | p (generative density) | Full system state = generative model of the world |
| φ(Γ) | q (recognition density) | Self-model = approximate inference |
| Fixed point of φ | Posterior distribution | Optimal self-model = true posterior |
| UHM variational functional | F | Variational free energy |
| Minimization over φ | Minimization over q | Variational inference |
Step 2 (Expanding the functional). Write in the classical limit:
Simplifying:
In the continuous limit (, sums → integrals):
This is exactly Friston's variational free energy.
Step 3 (Equivalent forms). Expanding :
The first term is complexity (deviation from prior), the second is accuracy (expected likelihood). Minimization of = balance of accuracy and complexity — this is the classical analog of balancing spectral entropy and KL-divergence in Theorem 3.1.
Theorem 4.2. Let Γ be the state of a holon in the classical limit (γ_ij → 0 for i ≠ j). Then:
(i) The self-modeling operator φ in the classical limit is identified with the recognition density: φ(Γ) ↔ q.
(ii) The density matrix Γ is identified with the generative model: Γ ↔ p.
(iii) Minimization of the UHM functional coincides with minimization of Friston's free energy:
(iv) The key inequality (ELBO) is derived automatically:
Closedness of identification [Т]. The identification of Γ with the generative model is not an external assumption but the definition of self-reference. In the variational formulation, the divergence D_KL measures deviation from the system's own state. A self-referential system by definition uses itself as a generative model: this is a tautology of self-modeling rather than an assumption.
Proof of (iv). From the definition of KL-divergence:
Therefore:
Since the stationary state of a primitive Lindbladian maximizes entropy among reachable states (Frigerio, 1978), we have . Therefore, the ELBO lower bound takes the form:
In the classical limit , and the inequality takes the form , if is identified with the entropy of the marginal probability of observations.
4.5 Spectral entropy + KL → variational free energy
We show explicitly how minimization of the quantum functional in the classical limit becomes minimization of Friston's variational free energy.
Theorem 4.3. Let Γ be a diagonal density matrix and φ a CPTP channel preserving diagonality. Then the problem
is identical to the problem
where is the -simplex of probability distributions.
Moreover, the minimum is achieved at q = p (recognition density coincides with generative), which corresponds to φ = id (identity channel).
Proof.
Step 1 (Parameterization). In the classical limit, optimization over CPTP channels preserving diagonality is equivalent to optimization over stochastic matrices T with Σ_i T_ij = 1. The result is q = Tp, where q_i = Σ_j T_ij p_j. The set of reachable q for fixed p is a convex subset of the probability simplex containing p (at T = I).
Step 2 (Explicit functional). By Theorem 4.1:
This is the cross-entropy H(p, q) = −Σ_i p_i ln q_i.
Step 3 (Minimization). The cross-entropy is minimized over q at q = p (by the Lagrange multiplier method with the constraint Σ_i q_i = 1, or from the property H(p, q) ≥ H(p), with equality at q = p).
Step 4 (Identification with FEP). Friston's free energy:
Upon discretization :
UHM functional in the classical limit:
The difference is an additive term, which does not depend on q and therefore does not affect the optimal q for fixed p:
Thus, the optimal recognition densities coincide.
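The minimization step of the proof can be probed numerically. The sketch below assumes the cross-entropy is read as H(p, q) = −Σ_i p_i ln q_i with p fixed and q varied; the distributions are randomly generated and purely illustrative.

```python
import math
import random


def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i ln q_i; equals the Shannon entropy H(p) at q = p."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def random_distribution(n, rng):
    """A strictly positive random point of the probability simplex."""
    weights = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)             # fixed seed for reproducibility
p = random_distribution(7, rng)    # stands in for the diagonal of Γ
h_p = cross_entropy(p, p)          # value at the claimed minimizer q = p

# Excess of H(p, q) over H(p) for many random candidates q.
gaps = [cross_entropy(p, random_distribution(7, rng)) - h_p for _ in range(1000)]
print(min(gaps))
```

Each gap equals D_KL(p ‖ q), so none is negative: no sampled q beats q = p, in line with Gibbs' inequality.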
4.6 Correspondence of constructions
Full correspondence table between UHM and FEP:
| UHM construction | FEP construction | Limiting transition |
|---|---|---|
| — generative model | , | |
| — categorical self-model | — optimal recognition density | Bayesian inversion |
| — spectral entropy | — Shannon entropy | |
| — quantum KL | — classical KL | Diagonal limit |
| — variational functional | — free energy | Theorem 4.1 |
| Primitivity of | Ergodicity of Markov chain | Lindblad → Markov generator |
| CPTP channel | Stochastic matrix | Complete positivity → positivity |
| — fixed point | Posterior distribution | Self-modeling → Bayesian inference |
| Markov blanket (algebraic) | Markov blanket (graphical) | → conditional independence graph |
4.7 What is lost in the classical limit
The transition to the classical limit (γ_ij → 0 for i ≠ j) destroys three fundamental structures of UHM that have no classical analogs.
4.7.1 Coherences (off-diagonal elements)
Quantum coherences (γ_ij, i ≠ j) are correlations between the holon's dimensions that cannot be described by classical probabilities.
In the full UHM functional, coherences contribute:
where the quantum remainder:
At γ_ij = 0 this term vanishes.
Consequence. Classical FEP cannot describe information integration between dimensions (which depends on coherences), quantum qualia (subspace structure), or interference between different aspects of the self-model.
4.7.2 Gap operator
The Gap operator describes opacity between dimensions. In the classical limit:
The Gap structure disappears completely. This means loss of:
- Consciousness bifurcations (discontinuous transitions between regimes)
- Non-Markovian memory effects (Gap oscillations)
- Hamming code H(7,4) in the error correction structure (see Gap dynamics)
4.7.3 Regeneration
The regenerative term of the evolution equation is responsible for nonlinear feedback: the system actively restores coherences rather than passively dissipating.
In the classical limit (γ_ij → 0):
since regeneration operates on off-diagonal elements. Only the linear dissipation remains, which in the classical limit reduces to a Markov generator:
— the standard Q-matrix of a continuous Markov chain.
Consequence. Classical FEP describes only passive minimization of free energy (dissipation). The quantum FEP of UHM includes active regeneration — the system's ability to restore complex structures lost during decoherence.
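The passive dynamics that survives the classical limit can be sketched as a continuous-time Markov chain. The generator `Q` below is a hypothetical 3-state example in the row-vector convention dp/dt = pQ, integrated with explicit Euler steps; it is not derived from ℒ_Ω.

```python
# Hypothetical Q-matrix: non-negative off-diagonal rates, rows summing to zero.
Q = [[-0.5, 0.3, 0.2],
     [0.1, -0.4, 0.3],
     [0.2, 0.2, -0.4]]


def step(p, Q, dt):
    """One explicit Euler step of dp/dt = p Q (row-vector convention)."""
    n = len(p)
    return [p[j] + dt * sum(p[i] * Q[i][j] for i in range(n)) for j in range(n)]


p = [1.0, 0.0, 0.0]          # start concentrated on the first state
for _ in range(20000):       # integrate up to t = 200
    p = step(p, Q, 0.01)

# Stationarity check: p Q should vanish at the stationary distribution.
residual = max(abs(sum(p[i] * Q[i][j] for i in range(3))) for j in range(3))
print(p, sum(p), residual)
```

Zero row sums conserve total probability at every step, and the distribution relaxes to a stationary state regardless of the starting point, with no mechanism for restoring lost structure.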
4.8 What is preserved: prediction error minimization
Despite the losses, the core of the variational principle survives the classical limit:
The principle "the system minimizes a functional balancing accuracy and complexity" is preserved across all regimes:
| Regime | Functional | Accuracy | Complexity |
|---|---|---|---|
| Quantum (UHM) | S_spec + D_KL | | |
| Classical (FEP) | ⟨E⟩ − H | | |
Prediction error minimization (PEM) is the classical limit of categorical self-modeling φ.
This explains why Friston's FEP works for classical systems (the brain in the neurocomputational description, biological organisms): it captures the invariant core, though losing quantum structure.
4.9 Structural diagram
┌─────────────────────────────────────────────────────────────────┐
│ UHM (∞-topos) │
│ │
│ Level 0: Ω (primitive) │
│ ↓ │
│ Level 1: φ ⊣ i (categorical definition) │
│ ↓ │
│ Level 2: φ = lim e^{tℒ_Ω}[Γ] (dynamical) │
│ ↓ │
│ Level 3: φ = argmin [S_spec + D_KL] (variational, T 3.1) │
│ ↓ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Classical limit: Γ_ij → 0, R → 1/7 │ │
│ │ (lost: coherences, Gap, ℛ) │ │
│ │ ↓ │ │
│ │ ┌───────────────────────────────────────────────────┐ │ │
│ │ │ Friston's FEP: min F = min [⟨E⟩ - H] │ │ │
│ │ │ (classical probabilities, SPECIAL CASE) │ │ │
│ │ │ (preserved: prediction error minimization) │ │ │
│ │ └───────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
5. S_spec vs S_vN: justification of the choice
5.1 Definitions
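Von Neumann entropy:

$$S_{vN}(\rho) = -\mathrm{Tr}(\rho \ln \rho)$$

Spectral entropy:

$$S_{\text{spec}}(A) = -\sum_i \lambda_i \ln \lambda_i$$

where λ_i are eigenvalues of operator A.
5.2 When they coincide
Theorem 5.1. For density matrices (Hermitian, positive semi-definite, unit trace):

$$S_{\text{spec}}(\rho) = S_{vN}(\rho)$$

Proof: ρ is Hermitian, so ρ = Σ_i λ_i |i⟩⟨i| with λ_i ≥ 0 for all i, therefore −Tr(ρ ln ρ) = −Σ_i λ_i ln λ_i.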
5.3 Why use S_spec in UHM?
Reason 1: Generalization to non-Hermitian operators.
In some formalisms (Kraus operators, non-physical states), non-Hermitian operators appear. S_spec is defined for them; S_vN is not.
Reason 2: Connection to Kolmogorov complexity.
In the original UHM formulation (axiom-omega.md):
S_spec: spectral entropy (replacing uncomputable Kolmogorov complexity)
Kolmogorov complexity is uncomputable. S_spec serves as a computable upper bound:
5.4 Recommendation
For practical purposes in UHM:
- Use S_spec = S_vN for density matrices (standard quantum theory)
- Keep the notation S_spec to indicate the connection with complexity theory
- In documentation indicate: "S_spec = S_vN for density matrices"
6. Comparison with Friston's FEP
6.1 Correspondence table
| Aspect | FEP (Friston) | UHM |
|---|---|---|
| Status | Postulate (phenomenological) | Theorem (derived from Ω) |
| Domain | Classical distributions | Quantum states |
| Operator | Implicit | Explicit CPTP channel |
| Justification | Thermodynamics + Bayesian inference | Categorical adjunction |
| Circularity | Not resolved | Resolved (hierarchy Ω → φ) |
| Time | External parameter | Emergent (▷ on Ω) |
6.2 How did Friston derive FEP without UHM?
Friston used three independent arguments:
1. Information-theoretic (Bayesian):
This is an identity — a consequence of the definition of KL-divergence. FEP postulates that systems minimize F.
2. Thermodynamic:
Fluctuation theorems (Jarzynski, Crooks) connect free energy with non-equilibrium work. Stationary systems minimize F for thermodynamic reasons.
3. Cybernetic (self-organization):
Systems that do not minimize surprise "dissipate" — lose their identity. Survival ≡ minimization of F.
6.3 Why is UHM deeper?
1. Categorical justification:
In UHM, φ is defined by the structure of the ∞-topos; the variational principle is a consequence. In FEP, the variational principle is an axiom.
2. Quantum generalization:
UHM works with density matrices (quantum systems). FEP — only with classical distributions.
3. Resolution of circularity:
UHM explicitly constructs the hierarchy: Ω → L_k → ℒ_Ω → φ (see dependency DAG). In FEP the connection between generative model and dynamics is implicit.
4. Emergent time:
In UHM, time is derived from the temporal modality ▷ on Ω. In FEP, time is an external parameter.
7. Consequences for UHM
7.1 Confirmation of consistency
The proof of Theorem 3.1 confirms:
- The variational characterization is a consequence of the categorical definition
- The classical limit reproduces Friston's FEP
- UHM generalizes FEP to the quantum case
7.2 Clarification of statement status
| Statement | Old status | New status |
|---|---|---|
| φ = argmin [S_spec + D_KL] | "Property 4" | Theorem 3.1 [Т] (proven) |
| Classical limit of the functional | Implicit | Theorem 4.1 [Т] (full stat-mech reduction) |
| FEP ⊂ UHM | Claimed | Theorem 4.2 [Т] (identification of generative model = definition of self-reference) |
| Reduction of the UHM functional to Friston's free energy | Not proven | Theorem 4.3 [Т] (complete reduction) |
| S_spec = S_vN for ρ | Not clarified | Theorem 5.1 [Т] (proven) |
7.3 New corollaries
Corollary 7.1 (Quantum FEP).
For quantum systems, the generalized principle holds:
where Γ_ref is the initial/reference state.
Corollary 7.2 (Thermodynamic interpretation).
Minimization of the variational functional is equivalent to minimization of entropy production in an open quantum system.
8. Technical Lemmas
Lemma A.1 (Entropy production in Lindblad dynamics)
For :
Lemma A.2 (Uniqueness of stationary state)
If the Lindbladian is primitive (no non-trivial invariant subspaces), then the stationary state is unique.
Lemma A.3 (Convergence to stationary)
For primitive :
9. References
- Friston K. "The free-energy principle: a unified brain theory?" Nature Reviews Neuroscience 11, 127-138 (2010)
- Spohn H. "Entropy production for quantum dynamical semigroups" Journal of Mathematical Physics 19, 1227 (1978)
- Lindblad G. "On the generators of quantum dynamical semigroups" Communications in Mathematical Physics 48, 119-130 (1976)
- Lurie J. "Higher Topos Theory" Princeton University Press (2009)
10. Summary
Theorems:
- Theorem 3.1 [Т]: Categorically defined φ minimizes the variational functional S_spec + D_KL (primitivity of the linear part proven)
- Theorem 4.1 [Т]: In the classical limit (γ_ij → 0, R → 1/7) the UHM functional reduces to the classical variational free energy
- Theorem 4.2 [Т]: The classical limit of UHM reproduces Friston's FEP (identification of generative model = definition of self-reference)
- Theorem 4.3 [Т]: Minimization of the UHM functional is identical to minimization of Friston's free energy in the classical limit (optimal recognition densities coincide)
- Theorem 5.1 [Т]: S_spec = S_vN for density matrices
Main conclusion: Friston's FEP is not an independent principle but a special case (classical limit) of the more fundamental structure of UHM.
The variational principle is compatible with the octonionic interpretation: the norm of () ensures consistency of the metric used in with the algebraic structure of the state space. Bridge [Т] (T15). See structural derivation.
Related documents:
- Axiom Ω⁷ — definition of φ
- Formalization of φ — constructive realization
- Theories of consciousness — comparison with FEP