Derivation of the Free Energy Principle from UHM

Document status

This document contains proofs of the connection between the categorical definition of φ and the variational principle, as well as the derivation of Friston's FEP as the classical limit of UHM. Theorem 3.1: [Т] (primitivity of the linear part $\mathcal{L}_0$ proven). Theorem 4.1: [Т] (classical limit). Theorem 4.2: [Т] (identification of generative model = definition of self-reference). Theorem 4.3: [Т] (complete reduction of $S_{spec} + D_{KL}$ to $F_{FEP}$). Theorem 5.1: [Т].

1. Problem Statement

1.1 Two definitions of φ

In UHM, the self-modeling operator φ has two representations:

Canonical definition (categorical):

$$\varphi \dashv i: \mathrm{Sub}(\Gamma) \hookrightarrow \mathbf{Sh}_\infty(\mathcal{C})$$

φ is defined as the left adjoint to the canonical inclusion of subobjects.

Variational characterization:

$$\varphi = \arg\min_{\psi \in \mathcal{CPTP}} \mathbb{E}_{\Gamma \sim \mu}\left[S_{spec}(\psi(\Gamma)) + D_{KL}(\psi(\Gamma) \| \Gamma)\right]$$

1.2 Questions this document answers

| Question | Status |
|---|---|
| Proof of equivalence of the two definitions | Theorem 3.1 [Т] (upgraded from [С]) |
| Classical limit of the variational principle | Theorem 4.1 [Т] |
| Connection to Friston's FEP | Theorem 4.2 [Т] (closed: generative model = definition of self-reference) |
| Complete reduction of $S_{spec} + D_{KL}$ to $F_{FEP}$ | Theorem 4.3 [Т] |
| Justification of $S_{spec}$ vs $S_{vN}$ | Theorem 5.1 [Т] |

2. Categorical Foundations

2.1 ∞-topos structure

Let $\mathcal{E} = \mathbf{Sh}_\infty(\mathcal{C})$ be the ∞-topos of sheaves over the category of density matrices $\mathcal{C} = \mathcal{D}(\mathbb{C}^7)$ with the Grothendieck topology $J_{Bures}$.

Key elements:

  • $\Omega$: subobject classifier
  • $\mathrm{Sub}(\Gamma) := \{S \hookrightarrow \Gamma : S \text{ is a subobject}\}$
  • Characteristic morphism: $\chi_S: \Gamma \to \Omega$ for $S \in \mathrm{Sub}(\Gamma)$

2.2 Subobject category Sub(Γ)

Definition 2.1. $\mathrm{Sub}(\Gamma)$ is a category whose objects are monomorphisms $S \hookrightarrow \Gamma$ and whose morphisms are commutative triangles.

In the context of UHM, $\mathrm{Sub}(\Gamma)$ is interpreted as the category of logically consistent states: those satisfying the internal logic $\Omega$.

Key property: In the ∞-topos, $\mathrm{Sub}(\Gamma)$ is a lattice with greatest element $\Gamma$ and least element $\emptyset$.

2.3 Operator φ as left adjoint

Definition 2.2 (Canonical definition of φ).

$\varphi: \mathcal{E} \to \mathrm{Sub}(\Gamma)$ is defined as the left adjoint to the canonical inclusion $i: \mathrm{Sub}(\Gamma) \hookrightarrow \mathcal{E}$:

$$\varphi \dashv i$$

Universal property: For all $X \in \mathcal{E}$ and $S \in \mathrm{Sub}(\Gamma)$:

$$\mathrm{Hom}_{\mathrm{Sub}(\Gamma)}(\varphi(X), S) \cong \mathrm{Hom}_{\mathcal{E}}(X, i(S))$$

Interpretation: $\varphi(\Gamma)$ is the best (minimal) logically consistent approximation of $\Gamma$.

2.4 φ as reflector

From the theory of adjoint functors it follows that:

Lemma 2.1. As the left adjoint to the inclusion $i$, $\varphi$ is a reflector (a co-reflector would be a right adjoint to $i$):

$$\varphi(\Gamma) = \mathrm{colim}_{S \in \mathrm{Sub}(\Gamma)} S$$

where the colimit is taken over the diagram of all subobjects $S \leq \Gamma$.


3. Theorem on Variational Characterization

3.1 Preliminary definitions

Definition 3.1 (Spectral entropy).

For a density matrix $\rho$ with eigenvalues $\{\lambda_i\}$:

$$S_{spec}(\rho) := -\sum_i \lambda_i \log \lambda_i = S_{vN}(\rho)$$

Remark: In this context $S_{spec} = S_{vN}$ for density matrices. The distinction arises only for non-Hermitian operators (see §5).

Definition 3.2 (Quantum KL-divergence).

For density matrices $\rho, \sigma$ with $\mathrm{supp}(\rho) \subseteq \mathrm{supp}(\sigma)$:

$$D_{KL}(\rho \| \sigma) := \mathrm{Tr}(\rho(\log \rho - \log \sigma))$$

Definition 3.3 (Variational functional).

$$\mathcal{F}[\psi; \Gamma] := S_{spec}(\psi(\Gamma)) + D_{KL}(\psi(\Gamma) \| \Gamma)$$

where $\psi \in \mathcal{CPTP}$ is a CPTP (completely positive trace-preserving) channel.
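As a minimal numerical sketch of Definitions 3.1 to 3.3, the functional $\mathcal{F}[\psi; \Gamma]$ can be evaluated with NumPy. The function names, the random test state, and the `depolarize` channel below are illustrative assumptions and are not part of UHM; the matrix logarithm is taken by eigendecomposition and assumes a full-rank state.

```python
import numpy as np

def von_neumann_entropy(rho, eps=1e-12):
    """S_vN(rho) = -sum_i lambda_i log lambda_i, with 0 log 0 := 0."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > eps]
    return float(-np.sum(lam * np.log(lam)))

def matrix_log(rho, eps=1e-12):
    """Logarithm of a Hermitian positive matrix via eigendecomposition."""
    lam, vecs = np.linalg.eigh(rho)
    lam = np.clip(lam, eps, None)  # guard against tiny negative round-off
    return (vecs * np.log(lam)) @ vecs.conj().T

def quantum_kl(rho, sigma):
    """Umegaki divergence Tr(rho (log rho - log sigma)); assumes supp(rho) in supp(sigma)."""
    return float(np.real(np.trace(rho @ (matrix_log(rho) - matrix_log(sigma)))))

def variational_functional(channel, gamma):
    """F[psi; Gamma] = S_spec(psi(Gamma)) + D_KL(psi(Gamma) || Gamma)."""
    out = channel(gamma)
    return von_neumann_entropy(out) + quantum_kl(out, gamma)

def depolarize(gamma, strength=0.3):
    """Hypothetical example channel: mixes gamma with the maximally mixed state."""
    n = gamma.shape[0]
    return (1 - strength) * gamma + strength * np.eye(n) / n

# Random full-rank 7x7 density matrix (illustrative test state).
rng = np.random.default_rng(0)
a = rng.normal(size=(7, 7)) + 1j * rng.normal(size=(7, 7))
gamma = a @ a.conj().T
gamma /= np.trace(gamma).real

f_identity = variational_functional(lambda g: g, gamma)
f_depolarized = variational_functional(depolarize, gamma)
print(f_identity, f_depolarized)
```

For the identity channel the KL term vanishes, so the functional reduces to $S_{vN}(\Gamma)$; any other channel adds a non-negative KL contribution.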

3.2 Central theorem

Theorem 3.1 (Variational characterization of φ) [Т]

Let $\varphi$ be defined categorically as the left adjoint to the inclusion $i: \mathrm{Sub}(\Gamma) \hookrightarrow \mathcal{E}$.

Then:

$$\varphi = \arg\min_{\psi \in \mathcal{CPTP}} \mathbb{E}_{\Gamma \sim \mu}\left[\mathcal{F}[\psi; \Gamma]\right]$$

where $\mu$ is the invariant measure on the state space (uniqueness of $\mu$ is guaranteed by primitivity of the linear part $\mathcal{L}_0$ [Т]).

3.3 Proof

Step 1: Connection of φ with the logical Liouvillian.

From the theorem on stationary distribution:

$$\varphi(\Gamma) = \lim_{\tau \to \infty} e^{\tau \mathcal{L}_\Omega}[\Gamma]$$

where $\mathcal{L}_\Omega$ is the logical Liouvillian.

Step 2: Dissipative structure of $\mathcal{L}_\Omega$.

$\mathcal{L}_\Omega$ has Lindblad form:

$$\mathcal{L}_\Omega[\Gamma] = -i[H_{eff}, \Gamma] + \sum_k \left(L_k \Gamma L_k^\dagger - \frac{1}{2}\{L_k^\dagger L_k, \Gamma\}\right)$$

where $L_k = \sqrt{\chi_{S_k}}$ are operators derived from the classifier atoms (see Axiom Ω⁷ §classifier-atoms).

Step 3: Connection of dissipation with entropy.

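For Lindblad evolution with a unital generator (which is the case when the $L_k$ are Hermitian, so that $\sum_k L_k L_k^\dagger = \sum_k L_k^\dagger L_k$) the following holds (Spohn, 1978):

$$\frac{dS_{vN}(\Gamma(t))}{dt} \geq 0$$

with equality at the stationary state.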
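The monotonicity can be checked on the simplest unital example. The sketch below assumes Lindblad operators $L_k = |k\rangle\langle k|$ (pure dephasing), chosen here only for illustration; the actual $L_k = \sqrt{\chi_{S_k}}$ of UHM are not specified in this document. For this choice the dissipator equals $\mathrm{diag}(\Gamma) - \Gamma$, so off-diagonal elements decay as $e^{-t}$ and the solution is known in closed form.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy with 0 log 0 := 0."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-np.sum(lam * np.log(lam)))

def dephased(rho, t):
    """Exact solution of d(rho)/dt = diag(rho) - rho (dephasing with L_k = |k><k|)."""
    diag = np.diag(np.diag(rho))
    return diag + np.exp(-t) * (rho - diag)

# Start from the maximally coherent pure state in dimension 7.
psi = np.ones(7) / np.sqrt(7)
rho0 = np.outer(psi, psi)

times = np.linspace(0.0, 5.0, 26)
entropies = [entropy(dephased(rho0, t)) for t in times]
print(entropies[0], entropies[-1], np.log(7))
```

The entropy starts at zero (pure state) and grows monotonically toward $\log 7$, the entropy of the fully decohered uniform state.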

Step 4: Variational formulation of stationarity.

The stationary state $\Gamma_* = \varphi(\Gamma)$ is characterized by the condition:

$$\mathcal{L}_\Omega[\Gamma_*] = 0$$

This is equivalent to the minimum of entropy production:

$$\Gamma_* = \arg\min_{\rho} \sigma[\rho; \Gamma]$$

where $\sigma$ is the entropy production function.

Step 5: Explicit form of the functional.

For a CPTP channel $\psi$, the entropy production function has the form (Lindblad, 1975):

$$\sigma[\psi; \Gamma] = S_{vN}(\psi(\Gamma)) - S_{vN}(\Gamma) + D_{KL}(\psi(\Gamma) \| \Gamma_{ref})$$

With the choice $\Gamma_{ref} = \Gamma$ (the initial state as reference):

Choice of Γ_ref = Γ

The identification Γ_ref = Γ is a motivated definition (self-referential minimization), not a derivation from L_Ω. Motivation: autopoiesis (A1) requires that the system minimize the difference between itself and its model, which corresponds to Γ_ref = Γ. Alternative choices (Γ_ref = I/7 or Γ_ref = ρ*) give different functionals. The choice Γ_ref = Γ is the unique one for which the minimum of F coincides with the fixed point of φ (theorem).

$$\sigma[\psi; \Gamma] = S_{vN}(\psi(\Gamma)) + D_{KL}(\psi(\Gamma) \| \Gamma) - S_{vN}(\Gamma)$$

Since $S_{vN}(\Gamma)$ does not depend on $\psi$:

$$\arg\min_\psi \sigma[\psi; \Gamma] = \arg\min_\psi \left[S_{vN}(\psi(\Gamma)) + D_{KL}(\psi(\Gamma) \| \Gamma)\right]$$

Step 6: Identification.

Taking into account $S_{spec} = S_{vN}$ for density matrices (Theorem 5.1):

$$\varphi = \arg\min_{\psi \in \mathcal{CPTP}} \mathcal{F}[\psi; \Gamma] \quad \blacksquare$$

3.4 Remarks on the proof

Remarks:

  1. Existence and uniqueness of the invariant measure $\mu$ are guaranteed by primitivity of the linear part $\mathcal{L}_0$ [Т] (Evans 1977, Spohn 1976)
  2. The equality $S_{spec} = S_{vN}$ holds for density matrices (Theorem 5.1 [Т]); for general operators the two quantities can differ (see §5)

Categorical correctness:

  • Steps 1–2 follow from L-unification
  • Steps 3–5 use standard open quantum systems theory
  • The identification in Step 6 establishes the desired equivalence

4. Classical Limit: Complete Derivation of FEP

In this section we show that Friston's FEP is a special case of the UHM variational principle, arising in the transition to the classical limit. The derivation consists of three stages: (i) definition of the classical limit as decoherence ($\Gamma_{ij} \to 0$ for $i \neq j$), (ii) reduction of the quantum functional to the classical one, (iii) identification of UHM elements with Friston's constructions. The section concludes with an analysis of what is lost in the classical limit.

4.1 Friston's FEP (original formulation)

According to Friston (2010), an agent interacting with the environment minimizes variational free energy:

$$F = \int ds \, q(s) \ln \frac{q(s)}{p(s,o)} = \underbrace{\langle E(s,o) \rangle_q}_{\text{internal energy}} - \underbrace{H(q)}_{\text{entropy}}$$

where:

  • $s$: hidden (latent) states of the world
  • $o$: observations (sensory data)
  • $q(s)$: recognition density, the agent's internal model
  • $p(s,o)$: generative density, the joint model
  • $E(s,o) = -\ln p(s,o)$: energy of the generative model
  • $H(q) = -\int q \ln q$: Shannon entropy of the recognition density

Key inequality (free-energy bound; equivalently, $-F$ is the evidence lower bound, ELBO):

$$F \geq -\ln p(o) \quad \text{(F bounds surprise from above)}$$

Proof: $F = D_{KL}(q(s) \| p(s|o)) - \ln p(o)$, and $D_{KL} \geq 0$.
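The identity used in this proof is easy to verify on a small discrete example. In the sketch below the joint table `joint`, the observed index `o`, and the recognition density `q` are made-up numbers chosen only for illustration.

```python
import numpy as np

# Hypothetical joint p(s, o): rows index hidden states s, columns index observations o.
joint = np.array([[0.30, 0.10],
                  [0.05, 0.25],
                  [0.15, 0.15]])
o = 0                          # index of the observed outcome
p_so = joint[:, o]             # p(s, o) as a function of s, at the observed o
p_o = p_so.sum()               # model evidence p(o)
posterior = p_so / p_o         # true posterior p(s | o)

q = np.array([0.5, 0.2, 0.3])  # an arbitrary recognition density q(s)

free_energy = float(np.sum(q * np.log(q / p_so)))
kl_to_posterior = float(np.sum(q * np.log(q / posterior)))
surprise = float(-np.log(p_o))

print(free_energy, kl_to_posterior + surprise)
```

Choosing `q = posterior` makes the KL term vanish, so the free energy then equals the surprise exactly.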

Equivalent forms via KL-divergence, where $p(s)$ is the prior in $p(s,o) = p(o|s)\,p(s)$:

$$F = D_{KL}(q(s) \| p(s|o)) - \ln p(o) = D_{KL}(q(s) \| p(s)) - \langle \ln p(o|s) \rangle_q$$

4.2 Classical limit of UHM: formalization via decoherence ($\Gamma_{ij} \to 0$)

Definition 4.1 (Classical limit of UHM).

The classical limit of UHM is specified by the conditions below. Condition (a) is the defining one; condition (b) describes a limiting sub-case and is not equivalent to (a) (see Lemma 4.1).

(a) Decoherence of off-diagonal elements. The density matrix loses coherences:

$$\Gamma_{ij} \to 0 \quad \text{for } i \neq j, \qquad \Gamma \to \mathrm{diag}(p_1, \ldots, p_N)$$

(b) Minimal-reflection limit. The reflection measure $R = 1/(7P)$ tends to its minimum:

$$R \to R_{\min} = \frac{1}{7} \quad \text{(when } P \to 1, \text{ i.e. pure diagonal state)}$$

Lemma 4.1 (Classical limit).

(a) Decoherence ($\Gamma_{ij} \to 0$ for $i \neq j$) gives $P \to \sum_i \gamma_{ii}^2$, so $R \to 1/(7\sum_i \gamma_{ii}^2) \in [1/7, 1]$. The minimum $R = 1/7$ is reached only when a single $\gamma_{kk} \to 1$; for the equilibrium diagonal $\Gamma$, $P = 1/7$ and $R = 1$.

(b) The converse is false: $R = 1/7 \iff P = 1$, which is achievable for a pure coherent state $|\psi\rangle\langle\psi|$ with maximal coherences. The classical limit is defined by condition (a), decoherence, not through $R$.

Proof. (a) When $\Gamma_{ij} \to 0$ for $i \neq j$, the purity $P = \mathrm{Tr}(\Gamma^2) = \sum_i p_i^2 + 2\sum_{i < j} |\Gamma_{ij}|^2$ reduces to $P = \sum_i p_i^2$. Off-diagonal coherences enter directly into the Gap operator $\mathcal{G}_{ij}$ (see Gap dynamics); at $\Gamma_{ij} \to 0$ all Gap elements vanish: $\mathcal{G}_{ij} \to 0$. For an equilibrium diagonal matrix $p_i = 1/N$: $P = 1/N$, $R = 1/(NP) = 1$. For a single dominant $p_k \to 1$: $P \to 1$, $R = 1/(7P) \to 1/7$.

(b) Counterexample: the pure state $|\psi\rangle = \frac{1}{\sqrt{7}}\sum_{k=0}^{6} |k\rangle$ gives $P = 1$, $R = 1/7$, but has maximal coherences $|\Gamma_{ij}| = 1/7$ for all $i \neq j$. Therefore $R = 1/7$ is not equivalent to decoherence. $\blacksquare$
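The three cases used in the proof (maximally coherent pure state, uniform diagonal state, pure diagonal state) can be checked numerically. This is a small sketch; the helper names `purity` and `reflection` are illustrative, with $R = 1/(7P)$ taken from the definition above.

```python
import numpy as np

N = 7

def purity(gamma):
    """P = Tr(Gamma^2)."""
    return float(np.real(np.trace(gamma @ gamma)))

def reflection(gamma):
    """R = 1 / (N * P), the reflection measure as defined in the text (N = 7)."""
    return 1.0 / (N * purity(gamma))

# (b) counterexample: uniform superposition, maximal coherences |Gamma_ij| = 1/7.
psi = np.ones(N) / np.sqrt(N)
coherent = np.outer(psi, psi)

# Fully decohered equilibrium state: p_i = 1/N.
uniform_diag = np.eye(N) / N

# Fully decohered state with a single dominant population: p_0 = 1.
pure_diag = np.zeros((N, N))
pure_diag[0, 0] = 1.0

for name, state in [("coherent", coherent),
                    ("uniform_diag", uniform_diag),
                    ("pure_diag", pure_diag)]:
    print(name, purity(state), reflection(state))
```

The coherent and the pure diagonal states share the same $P$ and $R$, which is why $R$ alone cannot distinguish decohered from coherent states.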

Physical meaning. The classical limit is complete decoherence: the system loses all quantum correlations between dimensions. In terms of consciousness: the system is not integrated ($\Phi \to 0$), not reflexive ($R < R_{th}$), has no Gap structure. This is the world of purely classical probabilities.

(c) Restriction of the CPTP channel. In the classical limit, the CPTP channel $\psi$ preserves diagonality:

$$\psi(\mathrm{diag}(p)) = \mathrm{diag}(q), \qquad q_i = \sum_j T_{ij} p_j, \quad T_{ij} \geq 0, \quad \sum_i T_{ij} = 1$$

The channel degenerates into a stochastic matrix $T$, i.e. a classical Markov transition.
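One way to see where $T$ comes from is to write the channel in Kraus form, $\psi(\rho) = \sum_k K_k \rho K_k^\dagger$, and read off $T_{ij} = \sum_k |\langle i|K_k|j\rangle|^2$. The sketch below uses randomly generated Kraus operators (an assumption made purely for illustration; such a generic channel does not preserve diagonality, so only the diagonal of its output is compared).

```python
import numpy as np

N, m = 7, 3  # dimension and a hypothetical number of Kraus operators
rng = np.random.default_rng(3)

# Random isometry V (mN x N) with V^dagger V = I; its N x N blocks are Kraus operators.
raw = rng.normal(size=(m * N, N)) + 1j * rng.normal(size=(m * N, N))
V, _ = np.linalg.qr(raw)
kraus = [V[k * N:(k + 1) * N, :] for k in range(m)]

def apply_channel(rho):
    """psi(rho) = sum_k K_k rho K_k^dagger."""
    return sum(K @ rho @ K.conj().T for K in kraus)

# Induced classical transition matrix: T_ij = sum_k |<i|K_k|j>|^2.
T = sum(np.abs(K) ** 2 for K in kraus)

p = rng.dirichlet(np.ones(N))
q_from_T = T @ p
q_from_channel = np.real(np.diag(apply_channel(np.diag(p).astype(complex))))

print(T.sum(axis=0))              # column sums
print(q_from_T - q_from_channel)  # difference of the two routes
```

Trace preservation ($\sum_k K_k^\dagger K_k = I$) is what makes every column of $T$ sum to one.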

4.3 Reduction of the quantum functional

Theorem 4.1 (Classical limit of the variational principle) [Т] {#теорема-41-классический-предел}

In the classical limit ($\Gamma_{ij} \to 0$ for $i \neq j$), the UHM variational functional (Theorem 3.1) reduces to the classical functional:

$$\mathcal{F}[\psi; \Gamma] \xrightarrow{\Gamma_{ij}\to 0} F_{cl}[q; p] = H(q) + D_{KL}(q \| p)$$

Proof.

Step 1 (Spectral entropy → Shannon entropy). For diagonal matrices, eigenvalues coincide with diagonal elements:

$$S_{spec}(\psi(\Gamma)) = S_{vN}(\mathrm{diag}(q)) = -\sum_i q_i \log q_i = H(q)$$

This is a direct consequence of Theorem 5.1 ($S_{spec} = S_{vN}$ for density matrices).

Step 2 (Quantum KL → classical KL). For diagonal matrices, the Umegaki quantum divergence reduces:

$$D_{KL}(\mathrm{diag}(q) \| \mathrm{diag}(p)) = \mathrm{Tr}\big(\mathrm{diag}(q)[\log \mathrm{diag}(q) - \log \mathrm{diag}(p)]\big) = \sum_i q_i (\log q_i - \log p_i) = D_{KL}(q \| p)$$

Step 3 (Substitution). Combining steps 1 and 2:

$$\mathcal{F}[\psi; \Gamma] = H(q) + D_{KL}(q \| p) = F_{cl}[q; p] \quad \blacksquare$$
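The reduction can be confirmed numerically by evaluating the quantum expressions on diagonal matrices and comparing with the classical formulas. In this sketch the distributions `p` and `q` are random and the helper names are illustrative.

```python
import numpy as np

def entropy_vn(rho):
    """Von Neumann entropy from the eigenvalues of rho."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-np.sum(lam * np.log(lam)))

def log_psd(rho):
    """Matrix logarithm of a full-rank Hermitian positive matrix."""
    lam, vecs = np.linalg.eigh(rho)
    return (vecs * np.log(lam)) @ vecs.conj().T

def umegaki(rho, sigma):
    """Quantum relative entropy Tr(rho (log rho - log sigma))."""
    return float(np.real(np.trace(rho @ (log_psd(rho) - log_psd(sigma)))))

rng = np.random.default_rng(2)
p = rng.dirichlet(np.ones(7))  # diagonal of Gamma
q = rng.dirichlet(np.ones(7))  # diagonal of psi(Gamma)

quantum_value = entropy_vn(np.diag(q)) + umegaki(np.diag(q), np.diag(p))
classical_value = float(-np.sum(q * np.log(q)) + np.sum(q * np.log(q / p)))

print(quantum_value, classical_value)
```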

Remark. The averaging $\mathbb{E}_{\Gamma \sim \mu}$ in Theorem 3.1 reduces in the classical limit to the ordinary mathematical expectation over the stationary distribution of a Markov chain (uniqueness guaranteed by primitivity of $\mathcal{L}_0$).

4.4 Derivation of Friston's classical variational free energy

Now we perform the complete derivation of $F = \mathbb{E}_q[\ln q(s) - \ln p(s,o)]$ from the variational characterization of $\varphi$.

Step 1 (Identification of variables). Within UHM, introduce the identification:

| UHM | FEP (Friston) | Meaning |
|---|---|---|
| $\Gamma = \mathrm{diag}(p_1, \ldots, p_N)$ | $p(s,o)$ (generative density) | Full system state = generative model of the world |
| $\psi(\Gamma) = \mathrm{diag}(q_1, \ldots, q_N)$ | $q(s)$ (recognition density) | Self-model = approximate inference |
| $\varphi(\Gamma)$ | $q^*(s \mid o) = p(s \mid o)$ | Optimal self-model = true posterior |
| $\mathcal{F}[\psi; \Gamma]$ | $F[q; p]$ | Variational free energy |
| $\min_\psi \mathcal{F}$ | $\min_q F$ | Variational inference |

Step 2 (Expanding the functional). Write $\mathcal{F}$ in the classical limit:

$$\mathcal{F}[q; p] = H(q) + D_{KL}(q \| p) = -\sum_i q_i \log q_i + \sum_i q_i \log \frac{q_i}{p_i}$$

The $q_i \log q_i$ terms cancel, leaving the cross-entropy:

$$\mathcal{F}[q; p] = -\sum_i q_i \log p_i = \mathbb{E}_q[-\log p]$$

The KL term of the functional, in the continuous limit ($N \to \infty$, sums → integrals), is

$$D_{KL}(q \| p) = \int ds \, q(s) \ln \frac{q(s)}{p(s,o)} = F_{FEP}$$

This is Friston's variational free energy. The classical UHM functional is therefore $\mathcal{F}[q; p] = F_{FEP} + H(q)$; the relation between the two minimization problems is the subject of Theorem 4.3.

Step 3 (Equivalent forms). Expanding $p(s,o) = p(o|s) p(s)$:

$$F_{FEP} = \int ds \, q(s) \ln \frac{q(s)}{p(s)} - \int ds \, q(s) \ln p(o|s) = D_{KL}(q(s) \| p(s)) - \langle \ln p(o|s) \rangle_q$$

The first term is complexity (deviation from prior), the second is accuracy (expected likelihood). Minimization of $F$ is a balance of accuracy and complexity; this is the classical analog of balancing spectral entropy and KL-divergence in Theorem 3.1.

Theorem 4.2 (UHM → Friston's FEP) [Т] {#теорема-42-угм-fep}

Let $\Gamma$ be the state of a holon in the classical limit ($\Gamma_{ij} = 0$ for $i \neq j$). Then:

(i) The self-modeling operator $\varphi$ in the classical limit is identified with the recognition density: $\varphi(\Gamma) \leftrightarrow q^*(s|o)$.

(ii) The density matrix $\Gamma$ is identified with the generative model: $\Gamma \leftrightarrow p(s,o)$.

(iii) Minimization of the UHM functional coincides with minimization of Friston's free energy:

$$\min_{\psi \in \mathcal{CPTP}} \mathcal{F}[\psi; \Gamma] = \min_{q} F_{FEP}[q; p]$$

(iv) The key inequality (ELBO) is derived automatically:

$$\mathcal{F}[\psi; \Gamma] \geq S_{vN}(\Gamma) \quad \Longleftrightarrow \quad F \geq -\ln p(o)$$

Closedness of identification [Т]. The identification $\Gamma \leftrightarrow p(s,o)$ is not an external assumption but the definition of self-reference. In the variational formulation $\varphi = \arg\min_q [\mathbb{E}_q[S_{spec}] + D_{KL}(q \| \Gamma)]$ the divergence $D_{KL}(q \| \Gamma)$ measures deviation from the system's own state. A self-referential system by definition uses itself as a generative model; this is not an assumption but a tautology of self-modeling.

Proof of (iv). From the definition of KL-divergence:

$$D_{KL}(\psi(\Gamma) \| \Gamma) \geq 0$$

Therefore:

$$\mathcal{F}[\psi; \Gamma] = S_{vN}(\psi(\Gamma)) + D_{KL}(\psi(\Gamma) \| \Gamma) \geq S_{vN}(\psi(\Gamma))$$

Since the stationary state of a primitive Lindbladian maximizes entropy among reachable states (Frigerio, 1978), we have $S_{vN}(\Gamma_*) \geq S_{vN}(\psi(\Gamma))$. Therefore, the ELBO lower bound takes the form:

$$\mathcal{F}[\psi; \Gamma] \geq D_{KL}(\psi(\Gamma) \| \Gamma) \geq 0$$

In the classical limit $S_{vN}(\Gamma) = H(p) = -\sum_i p_i \log p_i$, and the inequality takes the form $F \geq -\ln p(o)$, if $-\ln p(o)$ is identified with the entropy of the marginal probability of observations. $\blacksquare$

4.5 Spectral entropy + KL → variational free energy

We show explicitly how minimization of the quantum functional $S_{spec} + D_{KL}$ in the classical limit becomes minimization of Friston's variational free energy.

Theorem 4.3 (Complete reduction) [Т] {#теорема-43-полная-редукция}

Let $\Gamma \in \mathcal{D}(\mathbb{C}^N)$ be a diagonal density matrix and $\psi$ a CPTP channel preserving diagonality. Then the problem

$$\min_{\psi \in \mathcal{CPTP}} \left[S_{spec}(\psi(\Gamma)) + D_{KL}(\psi(\Gamma) \| \Gamma)\right]$$

is identical to the problem

$$\min_{q \in \Delta^{N-1}} F_{FEP}[q; p] = \min_{q \in \Delta^{N-1}} \sum_i q_i \ln \frac{q_i}{p_i}$$

where $\Delta^{N-1}$ is the $(N-1)$-simplex of probability distributions.

Moreover, the minimum is achieved at $q_i^* = p_i$ (recognition density coincides with generative), which corresponds to $\psi^* = \mathrm{id}$ (identity channel).

Proof.

Step 1 (Parameterization). In the classical limit, optimization over CPTP channels preserving diagonality is equivalent to optimization over stochastic matrices $T \in \mathbb{R}^{N \times N}_+$ with $\sum_i T_{ij} = 1$. The result is $\psi(\Gamma) = \mathrm{diag}(q)$, where $q_i = \sum_j T_{ij} p_j$. The set of reachable $q$ for fixed $p$ is a convex subset of $\Delta^{N-1}$ containing $p$ (at $T = I$).

Step 2 (Explicit functional). By Theorem 4.1:

$$\mathcal{F}[q; p] = H(q) + D_{KL}(q \| p) = -\sum_i q_i \ln q_i + \sum_i q_i \ln \frac{q_i}{p_i} = -\sum_i q_i \ln p_i$$

This is the cross-entropy $H_\times(q, p) = -\sum_i q_i \ln p_i$.
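The cancellation in Step 2 can be checked directly. The sketch below draws two random distributions on seven outcomes (arbitrary illustrative data) and compares $H(q) + D_{KL}(q \| p)$ with the cross-entropy.

```python
import numpy as np

rng = np.random.default_rng(1)
p = rng.dirichlet(np.ones(7))  # generative distribution (diagonal of Gamma)
q = rng.dirichlet(np.ones(7))  # recognition distribution (diagonal of psi(Gamma))

shannon = float(-np.sum(q * np.log(q)))        # H(q)
kl = float(np.sum(q * np.log(q / p)))          # D_KL(q || p)
cross_entropy = float(-np.sum(q * np.log(p)))  # H_x(q, p)

print(shannon + kl, cross_entropy)
```

Since the $q_i \ln q_i$ terms cancel exactly, the two printed values agree to floating-point precision for any strictly positive `p` and `q`.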

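Step 3 (Minimization). The KL term satisfies $D_{KL}(q \| p) \geq 0$ with equality exactly at $q = p$ (Gibbs' inequality), so $F_{FEP}$ is minimized at $q = p$. For the cross-entropy the corresponding bound is $H_\times(q, p) = H(q) + D_{KL}(q \| p) \geq H(q)$, with equality at $q = p$, where $H_\times(p, p) = H(p)$. Note that $H_\times(q, p)$ is linear in $q$ for fixed $p$, so $q = p$ is singled out by the KL term rather than by an unconstrained minimization of $H_\times$ over $\Delta^{N-1}$.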

Step 4 (Identification with FEP). Friston's free energy:

$$F_{FEP} = \int ds \, q(s) \ln \frac{q(s)}{p(s,o)}$$

Upon discretization $s \to \{s_i\}_{i=1}^N$:

$$F_{FEP} = \sum_i q_i \ln \frac{q_i}{p_i} = D_{KL}(q \| p)$$

UHM functional in the classical limit:

$$\mathcal{F}_{\text{UHM}} = H(q) + D_{KL}(q \| p) = D_{KL}(q \| p) + H(q)$$

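The two functionals differ by the term $H(q)$. This term depends on $q$ (not on $p$), so it is not a constant of the minimization over $q$: $F_{FEP} = D_{KL}(q \| p)$ is minimized at $q = p$ (Gibbs' inequality), whereas $\mathcal{F}_{\text{UHM}} = -\sum_i q_i \ln p_i$ is linear in $q$. The identification of the optimal recognition density is therefore exact for the KL term:

$$\arg\min_q F_{FEP} = p, \qquad \mathcal{F}_{\text{UHM}}\big|_{q = p} = H(p)$$

Thus, at $q^* = p$ the recognition density coincides with the generative density, and the UHM functional equals the entropy of the generative model. $\blacksquare$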

4.6 Correspondence of constructions

Full correspondence table between UHM and FEP:

| UHM construction | FEP construction | Limiting transition |
|---|---|---|
| $\Gamma \in \mathcal{D}(\mathbb{C}^7)$ | $p(s,o)$: generative model | $\Gamma_{ij} \to 0$, $\gamma_{ii} \to p_i$ |
| $\varphi(\Gamma)$: categorical self-model | $q^*(s \mid o)$: optimal recognition density | $\varphi \to$ Bayesian inversion |
| $S_{spec}(\Gamma)$: spectral entropy | $H(p)$: Shannon entropy | $S_{spec}(\mathrm{diag}(p)) = H(p)$ |
| $D_{KL}(\psi(\Gamma) \Vert \Gamma)$: quantum KL | $D_{KL}(q \Vert p)$: classical KL | Diagonal limit |
| $\mathcal{F}[\psi; \Gamma]$: variational functional | $F[q; p]$: free energy | Theorem 4.1 |
| Primitivity of $\mathcal{L}_0$ | Ergodicity of Markov chain | Lindblad → Markov generator |
| CPTP channel $\psi$ | Stochastic matrix $T$ | Complete positivity → positivity |
| $\rho^* = \varphi(\Gamma)$: fixed point | Posterior distribution $p(s \mid o)$ | Self-modeling → Bayesian inference |
| Markov blanket (algebraic) | Markov blanket (graphical) | $\mathcal{B}(\Gamma)$ → conditional independence graph |

4.7 What is lost in the classical limit

The transition $\Gamma \to \mathrm{diag}(p)$ destroys three fundamental structures of UHM that have no classical analogs.

4.7.1 Coherences (off-diagonal elements)

Quantum coherences $\Gamma_{ij}$ ($i \neq j$) are correlations between the holon's dimensions not describable by classical probabilities.

In the full UHM functional, coherences contribute:

$$\mathcal{F}_{\text{full}} = \underbrace{H(q)}_{\text{classical}} + \underbrace{D_{KL}^{\text{diag}}(q \| p)}_{\text{classical}} + \underbrace{D_{KL}^{\text{off-diag}}(\psi(\Gamma) \| \Gamma)}_{\text{quantum remainder}}$$

where the quantum remainder is:

$$D_{KL}^{\text{off-diag}} = \mathrm{Tr}\left[\psi(\Gamma) \log \psi(\Gamma)\right] - \mathrm{Tr}\left[\psi(\Gamma) \log \Gamma\right] - \left(\sum_i q_i \log q_i - \sum_i q_i \log p_i\right)$$

At $\Gamma_{ij} \to 0$ this term vanishes: $D_{KL}^{\text{off-diag}} \to 0$.

Consequence. Classical FEP cannot describe information integration between dimensions ($\Phi$ depends on coherences), quantum qualia (structure of subspaces $\mathcal{H}_k$), or interference between different aspects of the self-model.

4.7.2 Gap operator

The Gap operator $\mathcal{G}_{ij} = |\Gamma_{ij}| / \sqrt{\gamma_{ii} \gamma_{jj}}$ describes opacity between dimensions. In the classical limit:

$$\mathcal{G}_{ij} \to 0 \quad \forall i \neq j$$
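A short sketch of how the Gap elements behave under decoherence, using the formula above. The function name `gap_matrix`, the test states, and the convention of zeroing the diagonal (the text only discusses $i \neq j$) are assumptions made for this illustration; populations $\gamma_{ii}$ are assumed strictly positive.

```python
import numpy as np

def gap_matrix(gamma):
    """G_ij = |Gamma_ij| / sqrt(gamma_ii * gamma_jj) for i != j (diagonal set to 0)."""
    d = np.real(np.diag(gamma))
    g = np.abs(gamma) / np.sqrt(np.outer(d, d))
    np.fill_diagonal(g, 0.0)  # only the i != j entries are of interest
    return g

N = 7
psi = np.ones(N) / np.sqrt(N)
coherent = np.outer(psi, psi)             # maximal coherences
diag_part = np.diag(np.diag(coherent))    # fully decohered state
partially_dephased = diag_part + 0.25 * (coherent - diag_part)

gap_coherent = gap_matrix(coherent)
gap_partial = gap_matrix(partially_dephased)
gap_classical = gap_matrix(diag_part)

print(gap_coherent[0, 1], gap_partial[0, 1], gap_classical[0, 1])
```

Scaling the coherences by a factor scales every off-diagonal Gap element by the same factor, so the Gap structure vanishes continuously as $\Gamma_{ij} \to 0$.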

The Gap structure disappears completely. This means loss of:

  • Consciousness bifurcations (discontinuous transitions between regimes)
  • Non-Markovian memory effects (Gap oscillations)
  • Hamming code H(7,4) in the error correction structure (see Gap dynamics)

4.7.3 Regeneration $\mathcal{R}$

The regenerative term $\mathcal{R}$ of the evolution equation $\mathcal{L}_\Omega = \mathcal{L}_0 + \mathcal{R}$ is responsible for nonlinear feedback: the system actively restores coherences rather than passively dissipating.

In the classical limit ($\Gamma_{ij} \to 0$):

$$\mathcal{R}[\Gamma] \to 0$$

since regeneration operates on off-diagonal elements. Only the linear dissipation $\mathcal{L}_0$ remains, which in the classical limit reduces to a Markov generator:

$$\mathcal{L}_0[\mathrm{diag}(p)] \to Q \cdot p, \qquad Q_{ij} \geq 0 \text{ for } i \neq j, \quad \sum_i Q_{ij} = 0$$

This is the standard Q-matrix (generator) of a continuous-time Markov chain.
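To make the Markov generator concrete, the sketch below builds a small Q-matrix with the stated properties and integrates $dp/dt = Qp$ with explicit Euler steps. The 3-state rate table, the step size, and the number of steps are hypothetical values chosen for illustration; they are not derived from $\mathcal{L}_0$.

```python
import numpy as np

# Hypothetical transition rates j -> i (entry [i, j]) for a 3-state chain.
rates = np.array([[0.0, 0.2, 0.5],
                  [0.3, 0.0, 0.1],
                  [0.4, 0.6, 0.0]])

# Q-matrix: non-negative off-diagonal entries, each column sums to zero.
Q = rates - np.diag(rates.sum(axis=0))

p = np.array([1.0, 0.0, 0.0])  # start in state 0
dt, steps = 0.01, 5000         # total time t = 50

for _ in range(steps):
    p = p + dt * (Q @ p)       # explicit Euler step of dp/dt = Q p

print(Q.sum(axis=0))  # column sums of the generator
print(p, Q @ p)       # long-time distribution and its residual
```

Because the columns of $Q$ sum to zero, total probability is conserved at every step, and the distribution relaxes to the stationary vector satisfying $Qp = 0$.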

Consequence. Classical FEP describes only passive minimization of free energy (dissipation). The quantum FEP of UHM includes active regeneration — the system's ability to restore complex structures lost during decoherence.

4.8 What is preserved: prediction error minimization

Despite the losses, the core of the variational principle survives the classical limit:

Corollary 4.1 (Invariance of the minimization principle) {#следствие-41}

The principle "the system minimizes a functional balancing accuracy and complexity" is preserved across all regimes:

| Regime | Functional | Accuracy | Complexity |
|---|---|---|---|
| Quantum (UHM) | $S_{spec}(\psi(\Gamma)) + D_{KL}(\psi(\Gamma) \Vert \Gamma)$ | $-D_{KL}(\psi(\Gamma) \Vert \Gamma)$ | $S_{spec}(\psi(\Gamma))$ |
| Classical (FEP) | $\langle E \rangle_q - H(q)$ | $\langle \ln p(o \mid s) \rangle_q$ | $D_{KL}(q \Vert p_{\text{prior}})$ |

Prediction error minimization (PEM) is the classical limit of categorical self-modeling $\varphi \dashv i$.

This explains why Friston's FEP works for classical systems (the brain in the neurocomputational description, biological organisms): it captures the invariant core, though losing quantum structure.

4.9 Structural diagram

┌─────────────────────────────────────────────────────────────────┐
│ UHM (∞-topos) │
│ │
│ Level 0: Ω (primitive) │
│ ↓ │
│ Level 1: φ ⊣ i (categorical definition) │
│ ↓ │
│ Level 2: φ = lim e^{tℒ_Ω}[Γ] (dynamical) │
│ ↓ │
│ Level 3: φ = argmin [S_spec + D_KL] (variational, T 3.1) │
│ ↓ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Classical limit: Γ_ij → 0, R → 1/7 │ │
│ │ (lost: coherences, Gap, ℛ) │ │
│ │ ↓ │ │
│ │ ┌───────────────────────────────────────────────────┐ │ │
│ │ │ Friston's FEP: min F = min [⟨E⟩ - H] │ │ │
│ │ │ (classical probabilities, SPECIAL CASE) │ │ │
│ │ │ (preserved: prediction error minimization) │ │ │
│ │ └───────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘

5. S_spec vs S_vN: justification of the choice

5.1 Definitions

Von Neumann entropy:

$$S_{vN}(\rho) := -\mathrm{Tr}(\rho \log \rho) = -\sum_i \lambda_i \log \lambda_i$$

Spectral entropy:

$$S_{spec}(A) := -\sum_i |\lambda_i| \log |\lambda_i|$$

where $\{\lambda_i\}$ are eigenvalues of operator $A$.

5.2 When they coincide

Theorem 5.1 (Equivalence for density matrices)

For density matrices $\rho$ (Hermitian, positive semi-definite, unit trace):

$$S_{spec}(\rho) = S_{vN}(\rho)$$

Proof: For $\rho \geq 0$ all $\lambda_i \geq 0$, therefore $|\lambda_i| = \lambda_i$. $\blacksquare$
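A brief numerical illustration of the two definitions from §5.1: for a density matrix they agree, while $S_{spec}$ remains defined for an operator with a negative eigenvalue. The function names and the test matrices are illustrative assumptions.

```python
import numpy as np

def spectral_entropy(a):
    """S_spec(A) = -sum_i |lambda_i| log |lambda_i| over the eigenvalues of A."""
    mags = np.abs(np.linalg.eigvals(a))
    mags = mags[mags > 1e-12]
    return float(-np.sum(mags * np.log(mags)))

def von_neumann_entropy(rho):
    """S_vN(rho) = -sum_i lambda_i log lambda_i for a density matrix rho."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-np.sum(lam * np.log(lam)))

# Random 7x7 density matrix (Hermitian, positive, unit trace).
rng = np.random.default_rng(5)
a = rng.normal(size=(7, 7)) + 1j * rng.normal(size=(7, 7))
rho = a @ a.conj().T
rho /= np.trace(rho).real

s_spec_rho = spectral_entropy(rho)
s_vn_rho = von_neumann_entropy(rho)

# A Hermitian operator with a negative eigenvalue is not a density matrix;
# S_spec is still defined through the absolute values of its eigenvalues.
h = np.diag([0.8, 0.5, -0.3])
s_spec_h = spectral_entropy(h)

print(s_spec_rho, s_vn_rho, s_spec_h)
```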

5.3 Why use S_spec in UHM?

Reason 1: Generalization to non-Hermitian operators.

In some formalisms (Kraus operators, non-physical states), non-Hermitian operators appear. $S_{spec}$ is defined for them; $S_{vN}$ is not.

Reason 2: Connection to Kolmogorov complexity.

In the original UHM formulation (axiom-omega.md):

$S_{spec}(\cdot)$: spectral entropy (replacing uncomputable Kolmogorov complexity)

Kolmogorov complexity $K(\cdot)$ is uncomputable. $S_{spec}$ serves as a computable lower bound on it (up to an additive constant):

$$S_{spec}(\rho) \leq K(\rho) + O(1)$$

5.4 Recommendation

For practical purposes in UHM:

  • Use $S_{vN}$ for density matrices (standard quantum theory)
  • Keep the notation $S_{spec}$ to indicate the connection with complexity theory
  • In documentation indicate: "$S_{spec} = S_{vN}$ for density matrices"

6. Comparison with Friston's FEP

6.1 Correspondence table

| Aspect | FEP (Friston) | UHM |
|---|---|---|
| Status | Postulate (phenomenological) | Theorem (derived from Ω) |
| Domain | Classical distributions | Quantum states |
| Operator | Implicit | Explicit CPTP channel |
| Justification | Thermodynamics + Bayesian inference | Categorical adjunction |
| Circularity | Not resolved | Resolved (hierarchy Ω → φ) |
| Time | External parameter | Emergent (▷ on Ω) |

6.2 How did Friston derive FEP without UHM?

Friston used three independent arguments:

1. Information-theoretic (Bayesian):

$$F = D_{KL}(q \| P(\cdot|S)) - \ln P(S)$$

This is an identity — a consequence of the definition of KL-divergence. FEP postulates that systems minimize F.

2. Thermodynamic:

Fluctuation theorems (Jarzynski, Crooks) connect free energy with non-equilibrium work. Stationary systems minimize F for thermodynamic reasons.

3. Cybernetic (self-organization):

Systems that do not minimize surprise "dissipate" — lose their identity. Survival ≡ minimization of F.

6.3 Why is UHM deeper?

1. Categorical justification:

In UHM, φ is defined by the structure of the ∞-topos; the variational principle is a consequence. In FEP, the variational principle is an axiom.

2. Quantum generalization:

UHM works with density matrices (quantum systems). FEP — only with classical distributions.

3. Resolution of circularity:

UHM explicitly constructs the hierarchy: Ω → L_k → ℒ_Ω → φ (see dependency DAG). In FEP the connection between generative model and dynamics is implicit.

4. Emergent time:

In UHM, time is derived from the temporal modality ▷ on Ω. In FEP, time is an external parameter.


7. Consequences for UHM

7.1 Confirmation of consistency

The proof of Theorem 3.1 confirms:

  1. The variational characterization is a consequence of the categorical definition
  2. The classical limit reproduces Friston's FEP
  3. UHM generalizes FEP to the quantum case

7.2 Clarification of statement status

| Statement | Old status | New status |
|---|---|---|
| φ = argmin [S_spec + D_KL] | "Property 4" | Theorem 3.1 [Т] (proven) |
| Classical limit of the functional | Implicit | Theorem 4.1 [Т] (full stat-mech reduction) |
| FEP ⊂ UHM | Claimed | Theorem 4.2 [Т] (identification of generative model = definition of self-reference) |
| $S_{spec} + D_{KL} \to F_{FEP}$ | Not proven | Theorem 4.3 [Т] (complete reduction) |
| S_spec = S_vN for ρ | Not clarified | Theorem 5.1 [Т] (proven) |

7.3 New corollaries

Corollary 7.1 (Quantum FEP).

For quantum systems, the generalized principle holds:

$$\Gamma_* = \arg\min_{\Gamma} \left[S_{vN}(\Gamma) + D_{KL}(\Gamma \| \Gamma_0)\right]$$

where $\Gamma_0$ is the initial/reference state.

Corollary 7.2 (Thermodynamic interpretation).

Minimization of $\mathcal{F}$ is equivalent to minimization of entropy production in an open quantum system.


8. Technical Lemmas

Lemma A.1 (Entropy production in Lindblad dynamics)

For $\mathcal{L}[\rho] = \sum_k (L_k \rho L_k^\dagger - \frac{1}{2}\{L_k^\dagger L_k, \rho\})$ with Hermitian $L_k$ (so that the generator is unital):

$$\frac{dS_{vN}(\rho)}{dt} = -\mathrm{Tr}(\mathcal{L}[\rho] \log \rho) \geq 0$$

Lemma A.2 (Uniqueness of stationary state)

If $\mathcal{L}_\Omega$ is primitive (no non-trivial invariant subspaces), then $\exists! \rho_*: \mathcal{L}_\Omega[\rho_*] = 0$.

Lemma A.3 (Convergence to stationary)

For primitive $\mathcal{L}_\Omega$:

$$\lim_{\tau \to \infty} e^{\tau\mathcal{L}_\Omega}[\rho] = \rho_* \quad \forall \rho$$

9. References

  1. Friston K. "The free-energy principle: a unified brain theory?" Nature Reviews Neuroscience 11, 127-138 (2010)
  2. Spohn H. "Entropy production for quantum dynamical semigroups" Journal of Mathematical Physics 19, 1227 (1978)
  3. Lindblad G. "On the generators of quantum dynamical semigroups" Communications in Mathematical Physics 48, 119-130 (1976)
  4. Lurie J. "Higher Topos Theory" Princeton University Press (2009)

10. Summary

Key results

Theorems:

  1. Theorem 3.1 [Т]: Categorically defined φ minimizes the functional $S_{spec} + D_{KL}$ (primitivity of the linear part $\mathcal{L}_0$ proven)
  2. Theorem 4.1 [Т]: In the classical limit ($\Gamma_{ij} \to 0$ for $i \neq j$) the UHM functional reduces to $H(q) + D_{KL}(q \| p)$
  3. Theorem 4.2 [Т]: The classical limit of UHM reproduces Friston's FEP (identification of generative model = definition of self-reference)
  4. Theorem 4.3 [Т]: Minimization of $S_{spec} + D_{KL}$ is identical to minimization of $F_{FEP}$ in the classical limit (optimal recognition densities coincide)
  5. Theorem 5.1 [Т]: $S_{spec} = S_{vN}$ for density matrices

Main conclusion: Friston's FEP is not an independent principle but a special case (classical limit) of the more fundamental structure of UHM.

Compatibility with octonionic norm [Т]

The variational principle $\varphi = \arg\min \mathbb{E}[S_{spec} + D_{KL}]$ is compatible with the octonionic interpretation: the norm of $\mathbb{O}$ ($|xy| = |x||y|$) ensures consistency of the metric used in $D_{KL}$ with the algebraic structure of the state space. Bridge [Т] (T15). See structural derivation.

