Excess Entropy Production as a Candidate Universal Cost of Persistence: A Thermodynamic Foundation for the Attractor Framework; Robert Galida (July 2026) [F]

Abstract

Every dissipative system maintains its attractor through continuous reconfiguration. Reconfiguration requires work; work generates entropy. The recovery rate $κ$ (corrective permeability) is the rate at which a system reconfigures to return to its attractor after perturbation. This paper proposes that $κ$ is a measure of excess entropy generation rate.

We develop an abstract persistence cost framework and prove its equivalence to Lyapunov theory. We then identify entropy production as a physical realization of this cost, deriving: $κ = \inf_{x} \frac{δ (x)}{\int_{0}^{\infty} σ_{excess} (ϕ_{t} (x)) \, d t}$

where $σ_{excess} = σ - σ_{ss}$ is the excess entropy production rate above the system’s steady-state baseline. For physical systems, the baseline is zero (equilibrium); for biological, cognitive, and social systems, the baseline is the steady-state dissipation rate of the healthy, well-coordinated attractor.

This unifies physical, biological, cognitive, and social systems. The framework is grounded in the second law of thermodynamics and non-equilibrium steady-state thermodynamics, not analogy. Empirical predictions are provided for each domain.

Keywords: entropy generation, excess entropy production, corrective permeability, attractor framework, dissipative structures, reconfiguration, Lyapunov theory, free energy principle, allostatic load

1. Introduction

The attractor framework defines persistence as the ability of a system to maintain its attractor under perturbation. Historically, persistence has been measured kinematically — as distance traveled or time spent away from equilibrium. This paper proposes that the true cost of persistence is thermodynamic: it is the excess entropy generated during reconfiguration and recovery.

Every dissipative system maintains its attractor through continuous reconfiguration. A bacterium reconfigures its metabolism to maintain homeostasis. A brain reconfigures its synaptic connections to maintain predictive models. A society reconfigures its institutions to maintain order. Reconfiguration requires work; work generates entropy. The second law of thermodynamics applies at every level of organization.

We develop an abstract persistence cost framework first, establishing its equivalence to Lyapunov theory. We then identify entropy production as a physical realization of this cost, deriving the relationship between corrective permeability and excess entropy generation.

The framework unifies physical, biological, cognitive, and social systems. It is grounded in the second law of thermodynamics and non-equilibrium steady-state thermodynamics, not analogy.

2. The Persistence Cost Functional

Let $X$ be a state space, $ϕ_{t} (x)$ the flow of a dynamical system, and $A \subseteq X$ an attractor set. Let $δ (x) = d (x, A)$ be the distance from $x$ to the attractor. For a treatment of state-space constraints in viability theory, see Aubin (1991).

Definition 1 (Persistence Cost Functional): A persistence cost functional $C (x)$ is a scalar function on $X$ satisfying:

$C (x) \geq 0$ for all $x$
$C (x) = 0$ if and only if $x \in A$
$C (ϕ_{t} (x)) \in L^{1} ([0, \infty))$ for all $x$ in the basin

Definition 2 (Cumulative Persistence Cost): For a finite horizon $T > 0$: $D_{T} (x) = \int_{0}^{T} C (ϕ_{t} (x)) \, d t$

For trajectories that converge to the attractor: $D_{\infty} (x) = \int_{0}^{\infty} C (ϕ_{t} (x)) \, d t$

3. Existence and Lyapunov Equivalence

Theorem 1 (Existence of the Persistence Functional): Let the flow be generated by the vector field $f$, that is, $\dot{x} = f (x)$. Assume $C (x) \geq 0$, $C = 0$ only on $A$, and $C (ϕ_{t} (x)) \in L^{1} ([0, \infty))$ for all $x$ in the basin. Assume $f$ is locally Lipschitz, the flow is continuously differentiable in the initial condition, and $C$ is continuous and locally bounded. Then:

$D_{\infty} (x) = \int_{0}^{\infty} C (ϕ_{t} (x)) \, d t$ exists and is finite.
$D_{\infty}$ is continuous.
$D_{\infty}$ satisfies the transport equation at points where it is differentiable:

$\nabla D_{\infty} (x) \cdot f (x) = - C (x)$

Proof: The integral exists and is finite by the $L^{1}$ assumption. Continuity follows from the dominated convergence theorem under the stated regularity assumptions. To derive the transport equation, write $D = D_{\infty}$ and use the semigroup property of the flow: $D (ϕ_{h} (x)) = \int_{h}^{\infty} C (ϕ_{t} (x)) \, d t = D (x) - \int_{0}^{h} C (ϕ_{t} (x)) \, d t$

Then: $\frac{D (ϕ_{h} (x)) - D (x)}{h} = - \frac{1}{h} \int_{0}^{h} C (ϕ_{t} (x)) \, d t \to - C (x)$

as $h \to 0$. By the chain rule, the left-hand side converges to $\nabla D (x) \cdot f (x)$, so $\nabla D (x) \cdot f (x) = - C (x)$. $□$
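As a quick sanity check of the transport equation, consider a hypothetical one-dimensional example that is not part of the original analysis: the linear flow $\dot{x} = f (x) = - a x$ with $a > 0$, attractor $A = \{0\}$, and the illustrative cost $C (x) = x^{2}$. Under these assumptions each quantity can be written in closed form:

```latex
\begin{align*}
\phi_{t}(x) &= x \, e^{-a t}, \\
D_{\infty}(x) &= \int_{0}^{\infty} \bigl( x \, e^{-a t} \bigr)^{2} \, dt = \frac{x^{2}}{2a}, \\
\nabla D_{\infty}(x) \cdot f(x) &= \frac{x}{a} \cdot (-a x) = -x^{2} = -C(x).
\end{align*}
```

The last line reproduces the transport equation term by term for this toy system.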

Corollary (Equivalence to Lyapunov Theory): Any Lyapunov function $V (x)$ (with $V \geq 0$, $V = 0$ on the attractor, and $\dot{V} \leq 0$, with $\dot{V} < 0$ off the attractor) yields a persistence cost $C (x) = - \dot{V} (x)$. Conversely, any persistence cost $C (x)$ satisfying $\nabla D \cdot f = - C$ defines a Lyapunov function $D (x)$.

Proof: If $V$ is a Lyapunov function, then $\dot{V} = \nabla V \cdot f \leq 0$. Define $C = - \dot{V}$. Then $C \geq 0$, $C = 0$ exactly on the attractor (by strict decrease off $A$), and $D_{T} (x) = \int_{0}^{T} C (ϕ_{t} (x)) \, d t = V (x) - V (ϕ_{T} (x))$. Conversely, if $\nabla D \cdot f = - C$, then $\dot{D} = - C \leq 0$, so $D$ is a Lyapunov function. $□$
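The identity $D_{T} = V (x) - V (ϕ_{T} (x))$ from the proof can be checked numerically. The following is a minimal sketch under assumed, illustrative choices that do not come from the paper: the linear flow $\dot{x} = - a x$, the Lyapunov function $V (x) = x^{2} / 2$, and the parameter values `a`, `x0`, and `T`.

```python
import math

# Hypothetical one-dimensional example: dx/dt = f(x) = -a*x, attractor A = {0}.
a = 0.7
x0 = 2.0
T = 5.0

def V(x):
    """Candidate Lyapunov function V(x) = x^2 / 2 (an assumption of this sketch)."""
    return 0.5 * x * x

def C(x):
    """Persistence cost C = -dV/dt = -V'(x) * f(x) = a * x^2."""
    return a * x * x

def flow(t, x):
    """Exact flow of dx/dt = -a*x."""
    return x * math.exp(-a * t)

def cumulative_cost(x, horizon, steps=20000):
    """Trapezoidal approximation of D_T(x) = int_0^T C(phi_t(x)) dt."""
    dt = horizon / steps
    total = 0.5 * (C(flow(0.0, x)) + C(flow(horizon, x)))
    for k in range(1, steps):
        total += C(flow(k * dt, x))
    return total * dt

D_T = cumulative_cost(x0, T)
lyapunov_drop = V(x0) - V(flow(T, x0))
print(D_T, lyapunov_drop)  # the two values should agree up to quadrature error
```

The accumulated cost and the drop in $V$ agree up to the error of the trapezoidal rule, which is what the Corollary asserts for this particular pair $(V, C)$.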

Interpretation: The persistence cost framework is mathematically equivalent to classical Lyapunov stability theory. For the connection to contraction analysis, see Lohmiller & Slotine (1998). For control Lyapunov functions, see Freeman & Kokotovic (1996). Entropy production is one physically meaningful realization of the cost function $C$. For a detailed treatment of Lipschitz continuity of $D_{\infty}$ under a Lipschitz-flow hypothesis, see Galida (2026a), Proposition 4.

4. Entropy Production as Persistence Cost

4.1 Entropy Balance

For an open system, the entropy balance equation is: $\frac{d S_{system}}{d t} = σ - Φ$

where $σ \geq 0$ is the entropy production rate (always non-negative by the second law) and $Φ$ is the entropy export rate to the environment. For foundational treatments of stochastic thermodynamics and entropy production, see Seifert (2012) and Sekimoto (2010).

For a system in a steady state: $\frac{d S_{system}}{d t} = 0 \implies σ = Φ$

4.2 Excess Entropy Production

Define the steady-state entropy production rate $σ_{ss}$ as the rate when the system is at its attractor.

Define the excess entropy production rate: $σ_{excess} (x) = σ (x) - σ_{ss}$

Assumption (Excess Entropy Decay): For all trajectories in the basin, there exist constants $C < \infty$ and $μ > 0$ such that: $σ_{excess} (ϕ_{t} (x)) \leq C e^{- μ t} σ_{excess} (x)$

for all $t \geq 0$. This ensures $D_{\infty} (x) < \infty$ and is the standard hypothesis under which the persistence functional and its associated bounds are well-defined, consistent with Galida (2026a, 2026b). The decay rate $μ$ may be domain-specific and is empirically measurable.

Note on generalization: The exponential decay assumption is adopted here to ensure finiteness of $D_{\infty}$ and to maintain consistency with the prior papers in this series. Generalization to $L^{1}$ integrable decays (e.g., algebraic) is a priority for future work.
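To make the role of the decay assumption concrete, the bound it implies can be written out directly. In the first line below, $C$ and $μ$ are the constants of the assumption (here $C$ is not the cost functional). The second line is only an illustration of the algebraic case mentioned above: it assumes a hypothetical bound $σ_{excess} (ϕ_{t} (x)) \leq K (1 + t)^{-p} σ_{excess} (x)$ with $p > 1$, where $K$ and $p$ are symbols introduced for this sketch.

```latex
\begin{align*}
D_{\infty}(x) &= \int_{0}^{\infty} \sigma_{\mathrm{excess}}(\phi_{t}(x)) \, dt
  \leq C \, \sigma_{\mathrm{excess}}(x) \int_{0}^{\infty} e^{-\mu t} \, dt
  = \frac{C}{\mu} \, \sigma_{\mathrm{excess}}(x), \\
D_{\infty}(x) &\leq K \, \sigma_{\mathrm{excess}}(x) \int_{0}^{\infty} (1+t)^{-p} \, dt
  = \frac{K}{p-1} \, \sigma_{\mathrm{excess}}(x) \qquad (p > 1).
\end{align*}
```

In both cases finiteness of $D_{\infty}$ follows from integrability of the decay envelope, which is why exponential decay is sufficient but not necessary.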

4.3 The Entropy Persistence Functional

Definition 3 (Cumulative Excess Entropy Functional): For a finite horizon $T > 0$: $D_{T} (x) = \int_{0}^{T} σ_{excess} (ϕ_{t} (x)) \, d t$

For trajectories that converge to the attractor: $D_{\infty} (x) = \int_{0}^{\infty} σ_{excess} (ϕ_{t} (x)) \, d t$

Interpretation: The persistence functional is the total excess entropy generated during reconfiguration and recovery.

4.4 Corrective Permeability

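Definition 4 (Corrective Permeability): $κ = \inf_{x \in B \setminus A} \frac{δ (x)}{D_{\infty} (x)}$

where $δ (x) = d (x, A)$ is the distance to the attractor and the infimum runs over the basin of attraction excluding the attractor itself (in this definition $B$ denotes the basin, not the basin depth of Section 4.5).

Interpretation: $κ$ is the worst-case ratio of distance from the attractor to total excess entropy generated during recovery, that is, the minimum distance recovered per unit of excess entropy. It measures the efficiency of reconfiguration: a system that returns with little excess entropy generation has high $κ$; a system that generates a large amount of excess entropy has low $κ$.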
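Definition 4 can be illustrated on a toy model. The sketch below is not taken from the paper: it assumes the linear flow $\dot{x} = - a x$ with attractor $A = \{0\}$ and a hypothetical excess entropy rate $σ_{excess} (x) = c |x|$, for which $D_{\infty} (x) = c |x| / a$ and the ratio $δ / D_{\infty}$ equals $a / c$ at every point of the basin. The names `sigma_excess` and `d_infinity` and the parameter values are illustrative.

```python
import math

a = 2.0   # relaxation rate of the toy flow dx/dt = -a*x (assumed)
c = 0.5   # scale of the toy excess entropy rate (assumed)

def sigma_excess(x):
    """Hypothetical excess entropy production rate, zero only on A = {0}."""
    return c * abs(x)

def d_infinity(x0, horizon=20.0, steps=20000):
    """Trapezoidal approximation of D_inf(x0), truncated at a finite horizon."""
    dt = horizon / steps
    values = [sigma_excess(x0 * math.exp(-a * k * dt)) for k in range(steps + 1)]
    return dt * (sum(values) - 0.5 * (values[0] + values[-1]))

# Sample initial states in the basin, excluding the attractor itself.
samples = [s * 0.25 for s in range(-8, 9) if s != 0]
ratios = [abs(x0) / d_infinity(x0) for x0 in samples]
kappa = min(ratios)   # finite-sample stand-in for the infimum in Definition 4
print(kappa)          # close to a / c for this toy model
```

With a different cost, for example $σ_{excess} (x) = c x^{2}$, the ratio would shrink as $|x|$ grows, so the estimate of $κ$ would depend on how much of the basin is sampled.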

4.5 Basin Depth

Proposition 1 (Properties of Basin Depth): Define $B = D_{\infty} (\mathrm{saddle})$, where $\mathrm{saddle}$ is the lowest point on the basin boundary (the separatrix between attractors). For the connection to large-deviation theory and escape rates, see Freidlin & Wentzell (2012). Then:

$B \geq 0$, with equality iff the basin has no barrier (i.e., the boundary coincides with the attractor).
For gradient systems $\dot{x} = - \nabla V (x)$, $B = V (\mathrm{saddle}) - V (A)$ (the classical energy barrier).
$B$ is invariant under smooth coordinate changes (coordinate invariance).
$B$ depends on the chosen persistence cost functional $C$; different costs yield different barriers.

Proof: (1) follows from non-negativity of $D_{\infty}$. (2) follows from the transport equation $\nabla D \cdot f = - C$ and the identity $f = - \nabla V$, taking the cost $C = - \dot{V} = \| \nabla V \|^{2}$ of the Corollary, which gives $D_{\infty} = V - V (A)$. (3) follows from the invariance of the integral under diffeomorphisms. (4) is immediate from the definition.
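Property (2) can be checked numerically on a standard double-well example. This is a sketch under stated assumptions, none of which come from the paper: the potential $V (x) = x^{4} / 4 - x^{2} / 2$, the cost $C = |V' (x)|^{2}$ (equal to $- \dot{V}$ along the gradient flow), the attractor $A = \{1\}$, and the saddle at $x = 0$. Because the saddle itself is a fixed point of the flow, the sketch evaluates $D_{\infty}$ at a point just inside the basin and compares it with $V (\mathrm{saddle}) - V (A) = 1/4$.

```python
def V(x):
    """Double-well potential V(x) = x^4/4 - x^2/2 (assumed for illustration)."""
    return 0.25 * x ** 4 - 0.5 * x ** 2

def grad_V(x):
    """Derivative V'(x) = x^3 - x."""
    return x ** 3 - x

def rhs(x):
    """Augmented dynamics: dx/dt = -V'(x), dD/dt = C(x) = V'(x)^2."""
    g = grad_V(x)
    return -g, g * g

def integrate(x0, horizon=40.0, dt=0.01):
    """RK4 integration of the state x together with the accumulated cost D."""
    x, D = x0, 0.0
    for _ in range(round(horizon / dt)):
        k1x, k1d = rhs(x)
        k2x, k2d = rhs(x + 0.5 * dt * k1x)
        k3x, k3d = rhs(x + 0.5 * dt * k2x)
        k4x, k4d = rhs(x + dt * k3x)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        D += dt * (k1d + 2 * k2d + 2 * k3d + k4d) / 6.0
    return x, D

x_start = 1e-3                     # just inside the basin of A = {1}, near the saddle at 0
x_end, D_near_saddle = integrate(x_start)
barrier = V(0.0) - V(1.0)          # classical energy barrier, equal to 1/4
print(D_near_saddle, barrier)
```

The accumulated cost approaches the classical barrier as the starting point approaches the saddle from inside the basin, which is one way to read $B = D_{\infty} (\mathrm{saddle})$ for gradient systems.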

5. Domain-Specific Realizations

5.1 Physical Systems: Thermodynamic Excess Entropy

For a thermodynamic system, $S (x) = k_{B} \log Ω (x)$, where $Ω (x)$ is the number of microstates. For an isolated system, $σ_{ss} = 0$ (equilibrium), so $σ_{excess} = σ = \dot{S}$. Integrating along the trajectory gives $D_{\infty} (x) = S (A) - S (x)$, and therefore: $κ = \inf_{x} \frac{δ (x)}{S (A) - S (x)}$

Example: A gas returning to equilibrium after compression. For an ideal gas expanding at fixed temperature from volume $V_{i}$ to $V_{f}$, the entropy generated is $Δ S = n R \log (V_{f} / V_{i})$, where $\log$ is the natural logarithm.
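The gas example can be evaluated for concrete numbers. The sketch below assumes an ideal gas at fixed temperature and a hypothetical scenario (1 mol of gas relaxing from half its equilibrium volume); the function name `ideal_gas_entropy_change` is illustrative.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol*K)

def ideal_gas_entropy_change(n_mol, v_initial, v_final):
    """Entropy change n*R*ln(Vf/Vi) for an ideal gas at fixed temperature."""
    return n_mol * R * math.log(v_final / v_initial)

# Hypothetical case: 1 mol of gas released from half its equilibrium volume.
delta_S = ideal_gas_entropy_change(1.0, 0.5, 1.0)
print(round(delta_S, 3))  # 5.763 (J/K)
```

Only the volume ratio matters, so the same value is obtained for any units of volume.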

5.2 Biological Systems: Metabolic Excess Entropy

For a biological system, $S (x)$ is the metabolic entropy. The baseline $σ_{ss}$ is the resting metabolic rate (homeostasis). The excess is: $σ_{excess} = \text{metabolic rate} - \text{resting metabolic rate}$, and $κ = \inf_{x} \frac{δ (x)}{\int_{0}^{\infty} σ_{excess} (ϕ_{t} (x)) \, d t}$

Example: A cell returning to homeostasis after a nutrient shock. The excess entropy generated is the metabolic cost of restoring homeostasis above baseline. For the dissipative-structures framework underlying biological self-organization, see Nicolis & Prigogine (1989).

5.3 Cognitive Systems: Free Energy Dissipation

For a cognitive system, variational free energy $F = - \log p (y \mid x) + D_{KL} [q (\cdot) \,\|\, p (\cdot \mid x)]$ is adopted here as one candidate persistence functional. We do not claim variational free energy is uniquely correct; it is adopted as the most developed existing candidate persistence functional for cognitive systems. Other candidates (Bayesian surprise, expected free energy, predictive information) are possible; this paper focuses on $F$ due to its established role in the free-energy principle (Friston, 2010). For the thermodynamics of information and its connection to free-energy minimization, see Parrondo, Horowitz & Sagawa (2015) and Sagawa & Ueda (2008).

The baseline $σ_{ss}$ is the baseline neural dissipation rate (resting brain activity). The excess is: $σ_{excess} = \dot{F} - {\dot{F}}_{ss}$, and $κ = \inf_{x} \frac{δ (x)}{\int_{0}^{\infty} σ_{excess} (ϕ_{t} (x)) \, d t}$

Example: A cognitive system updating its beliefs after a prediction error. The excess entropy generated is the free energy dissipated during belief updating above baseline.

5.4 Social Systems: Coordination Excess Entropy

For a social system, define the aggregate social entropy production rate as: $σ^{social} (t) = \sum_{i} ({\dot{S}}_{i} (t) - {\dot{S}}_{i}^{rest})$

where ${\dot{S}}_{i} (t)$ is the total entropy production rate of individual $i$, and ${\dot{S}}_{i}^{rest}$ is the individual’s baseline entropy production rate in a resting, minimally socially constrained state. This is measured via physiological proxies such as basal metabolic rate, resting allostatic load, or cortisol baseline (McEwen, 1998; Sterling & Eyer, 1988).

Interpretation: $σ^{social}$ measures the excess dissipation attributable to social constraints: the additional entropy generated by coordination, communication, conflict, norm enforcement, and institutional friction.

Non-Negativity: Write $σ_{i}^{social} = {\dot{S}}_{i} - {\dot{S}}_{i}^{rest}$ for the contribution of individual $i$. Unlike total entropy production ${\dot{S}}_{i} \geq 0$ (which follows from the second law), $σ_{i}^{social}$ is not guaranteed to be non-negative. Division of labor, infrastructure, and specialization may reduce an individual’s metabolic burden relative to a solitary baseline. The hypothesis is that during recovery from social disruption, $σ_{i}^{social} \geq 0$; in steady-state, $σ_{i}^{social} \to 0$. This is an empirical claim, not a theorem.

The baseline $σ_{ss}$ is the steady-state social entropy production rate (well-coordinated society). The excess is: $σ_{excess} = σ^{social} - σ_{ss}$, and $κ = \inf_{x} \frac{δ (x)}{\int_{0}^{\infty} σ_{excess} (ϕ_{t} (x)) \, d t}$

Example: A society recovering from a shock (economic crisis, political upheaval). The excess entropy generated is the coordination cost of restructuring above baseline. A harmonious society has $σ_{excess} = 0$; a turbulent society has $σ_{excess} > 0$; a chronically turbulent society may have settled into a new attractor with a higher $σ_{ss}$. This illustrates the framework’s central distinction: the attractor is the state of minimum entropy generation for that class of system.

6. The Unified Framework

6.1 Summary Table

Domain	Entropy Functional	Baseline $σ_{ss}$	Excess $σ_{excess}$	Recovery Rate $κ$
Physical	Thermodynamic entropy	0 (equilibrium)	$\dot{S}$	$\inf \frac{δ}{Δ S}$
Biological	Metabolic entropy	Resting metabolic rate	Metabolic rate minus resting metabolic rate	$\inf \frac{δ}{\int σ_{excess} \, d t}$
Cognitive	Free energy	Baseline neural dissipation	$\dot{F} - {\dot{F}}_{ss}$	$\inf \frac{δ}{\int σ_{excess} \, d t}$
Social	Social entropy production	Steady-state social dissipation	$σ^{social} - σ_{ss}$	$\inf \frac{δ}{\int σ_{excess} \, d t}$

6.2 The Universal Structure

Every domain follows the same mathematical structure:

Component	Expression
Excess entropy production	$σ_{excess} (x) = σ (x) - σ_{ss}$
Cumulative cost	$D_{\infty} (x) = \int_{0}^{\infty} σ_{excess} (ϕ_{t} (x)) \, d t$
Recovery rate	$κ = \inf_{x} δ (x) / D_{\infty} (x)$
Basin depth	$B = D_{\infty} (\mathrm{saddle})$
Transport equation	$\nabla D \cdot f = - σ_{excess}$

6.3 The Low-Energy Attractor Benchmark (Proposed Hypothesis)

We propose the following benchmark as an additional hypothesis: the attractor is the state of minimum entropy generation for that class of system.

Domain	Attractor	Entropy Generation at Attractor
Physical	Equilibrium	$σ = 0$
Biological	Homeostasis	$σ = σ_{ss} > 0$ (resting metabolism)
Cognitive	Settled Belief	$σ = σ_{ss} > 0$ (baseline neural dissipation)
Social	Coordinated Order	$σ = σ_{ss} > 0$ (baseline institutional friction)

Interpretation:

For equilibrium systems (gases, isolated systems), the attractor is the state where entropy generation reaches zero — the system has nowhere lower to go.
For dissipative systems (cells, brains, societies), the attractor is the state where entropy generation reaches its lowest non-zero steady-state value — the minimum entropy generation the system can sustain while maintaining its functional organization.

Important caveats:

This is a proposed benchmark, not a derived theorem.
For cognitive systems in particular, minimizing entropy production rate (a thermodynamic quantity) and minimizing free energy/surprise (the actual claim in the free-energy principle) are distinct minimization principles. The framework does not establish a bridge between them; this is an open question.
The benchmark is an empirical hypothesis that requires domain-specific validation.

In all cases, the attractor is the lowest entropy-generating state that system can have while remaining itself.

7. Testable Predictions

7.1 Core Prediction

Prediction: For perturbations of a given size $δ$, the recovery rate $κ$ is inversely proportional to the excess entropy generated during reconfiguration: $κ \propto \frac{1}{D_{\infty}}$

Falsification: If a system returns to its attractor with high excess entropy generation but high recovery rate, the prediction is falsified.

7.2 Secondary Prediction

Prediction: Systems that maintain their attractor with minimal excess entropy generation are more “efficient.” Systems that generate excess entropy are “inefficient” or “stressed.”

Falsification: If an inefficient system has lower excess entropy generation than an efficient system, the prediction is falsified.

7.3 Domain-Specific Predictions

Domain	Prediction	Falsification
Physical	$κ$ correlates with thermal efficiency	$κ$ high but efficiency low
Biological	$κ$ correlates with metabolic efficiency	$κ$ high but metabolic cost high
Cognitive	$κ$ correlates with learning efficiency	$κ$ high but learning cost high
Social	$κ$ correlates with institutional efficiency	$κ$ high but coordination cost high

8. Experimental Design

8.1 Physical Systems

System: Gas in a piston
Perturbation: Compression
Measurement: Excess entropy generation (heat measurement) and recovery time
Test: Correlation between $κ$ and $1 / D_{\infty}$

8.2 Biological Systems

System: Cell culture
Perturbation: Nutrient shock
Measurement: Metabolic rate above resting (oxygen consumption) and recovery time
Test: Correlation between $κ$ and metabolic cost

8.3 Cognitive Systems

System: Human participants in a learning task
Perturbation: Prediction error
Measurement: Free energy dissipation above baseline (EEG complexity, pupil dilation) and belief updating rate
Test: Correlation between $κ$ and free energy dissipation

8.4 Social Systems

System: Institutional response to shocks
Perturbation: Economic or political crisis
Measurement: Social entropy production above baseline (allostatic load, cortisol, institutional friction) and recovery time
Test: Correlation between $κ$ and social entropy production

9. Open Questions

Question	Statement	Difficulty
Q1: Uniqueness of $S (x)$	Are there multiple valid entropy functionals for a given domain?	Hard
Q2: Variational principle	Is there a universal variational principle that yields $S (x)$?	Hard
Q3: Social second law	Does $σ^{social} \geq 0$ always hold during recovery?	Very Hard
Q4: Cross-level entropy	How does entropy generation at one level relate to entropy generation at another?	Hard
Q5: Measurement	Can we measure excess entropy generation in cognitive and social systems directly?	Moderate
Q6: Unification	Can all domain-specific entropy functionals be derived from a single universal functional?	Very Hard

10. Conclusion

Every dissipative system maintains its attractor through continuous reconfiguration. Reconfiguration requires work; work generates excess entropy. The recovery rate $κ$ (corrective permeability) is the rate at which a system reconfigures to return to its attractor after perturbation. We have proposed that $κ$ is a measure of excess entropy generation rate.

We developed an abstract persistence cost framework and proved its equivalence to Lyapunov theory. We then identified entropy production as a physical realization of this cost, deriving: $κ = \inf_{x} \frac{δ (x)}{\int_{0}^{\infty} σ_{excess} (ϕ_{t} (x)) \, d t}$

where $σ_{excess} = σ - σ_{ss}$ is the excess entropy production rate above the system’s steady-state baseline: thermodynamic entropy for physical systems, metabolic entropy for biological systems, free energy dissipation for cognitive systems, and social entropy production for social systems.

We proposed a unified benchmark: the attractor is the state of minimum entropy generation for that class of system — zero for equilibrium systems, non-zero steady-state for dissipative systems. This provides a unified criterion for identifying attractors across domains: an attractor is a state from which the system cannot reduce its entropy generation further without losing its defining structure or function.

This unifies physical, biological, cognitive, and social systems. In each domain, persistence requires reconfiguration; reconfiguration generates excess entropy; $κ$ measures the entropy cost of that reconfiguration. The framework is grounded in the second law of thermodynamics and non-equilibrium steady-state thermodynamics, not analogy.

Social Application: The framework provides a thermodynamic interpretation of social dynamics: harmony is a low-entropy attractor state; turbulence is a high-entropy state generated by excess dissipation during reconfiguration. The recovery rate $κ$ measures how efficiently a society transitions from turbulence back to harmony, that is, how quickly it reduces its excess entropy production to zero.

11. Limitations

This paper establishes an abstract persistence cost framework with a proposed thermodynamic realization. Several limitations should be explicitly acknowledged:

Uniqueness. Entropy production is not proved to be the unique persistence cost. Many positive functionals $C (x)$ satisfy $\nabla D \cdot f = - C$. The identification of entropy production as the canonical cost is a physically motivated hypothesis, not a mathematical theorem.
Scope. The framework does not imply that all domains obey thermodynamics literally. The cognitive and social realizations are proposed hypotheses requiring empirical validation.
Decay assumption. Exponential decay of $σ_{excess}$ is a sufficient assumption to ensure finiteness of $D_{\infty}$, not a necessary one. Generalization to $L^{1}$ integrable decays (e.g., algebraic) is a priority for future work.
Basin depth. Basin depth $B = D_{\infty} (\mathrm{saddle})$ is defined in terms of the persistence cost functional. Its relationship to classical energy barriers is established only for gradient systems.
Empirical validation. The predictions of the framework, particularly the inverse relationship between $κ$ and $D_{\infty}$, remain to be tested empirically across domains.
Low-energy attractor benchmark. The benchmark proposed in §6.3 is a hypothesis, not a derived theorem. For cognitive systems, it risks conflating thermodynamic entropy production with free-energy minimization, which are distinct principles whose relationship remains open.

References

Aubin, J. P. (1991). Viability Theory. Birkhäuser.

Boltzmann, L. (1877). “Über die Beziehung zwischen dem zweiten Hauptsatz der mechanischen Wärmetheorie und der Wahrscheinlichkeitsrechnung.” Wiener Berichte, 76, 373-435.

Clausius, R. (1865). “Über verschiedene für die Anwendung bequeme Formen der Hauptgleichungen der mechanischen Wärmetheorie.” Annalen der Physik, 125(7), 353-400.

Freeman, R. A., & Kokotovic, P. V. (1996). Robust Nonlinear Control Design: State-Space and Lyapunov Techniques. Birkhäuser.

Freidlin, M. I., & Wentzell, A. D. (2012). Random Perturbations of Dynamical Systems (3rd ed.). Springer.

Friston, K. (2010). “The free-energy principle: a unified brain theory?” Nature Reviews Neuroscience, 11(2), 127-138.

Galida, R. (2026a). “The Persistence Functional: A Candidate Formal Foundation for the Attractor Framework.” Fantasy Attractor.

Galida, R. (2026b). “Deriving Corrective Permeability from the Cumulative Deviation Functional.” Fantasy Attractor.

Jaynes, E. T. (1957). “Information Theory and Statistical Mechanics.” Physical Review, 106(4), 620-630.

Khalil, H. K. (2002). Nonlinear Systems (3rd ed.). Prentice Hall.

Kondepudi, D., & Prigogine, I. (1998). Modern Thermodynamics: From Heat Engines to Dissipative Structures. Wiley.

Lohmiller, W., & Slotine, J. J. E. (1998). “On contraction analysis for non-linear systems.” Automatica, 34(6), 683-696.

Lyapunov, A. M. (1892). The General Problem of the Stability of Motion.

McEwen, B. S. (1998). “Stress, Adaptation, and Disease: Allostasis and Allostatic Load.” Annals of the New York Academy of Sciences, 840(1), 33-44.

Nicolis, G., & Prigogine, I. (1989). Exploring Complexity: An Introduction. W. H. Freeman.

Parrondo, J. M. R., Horowitz, J. M., & Sagawa, T. (2015). “Thermodynamics of information.” Nature Physics, 11(2), 131-139.

Prigogine, I. (1947). Étude Thermodynamique des Phénomènes Irréversibles. Dunod.

Prigogine, I., & Nicolis, G. (1977). Self-Organization in Non-Equilibrium Systems. Wiley.

Sagawa, T., & Ueda, M. (2008). “Second law of thermodynamics with discrete quantum feedback control.” Physical Review Letters, 100(8), 080403.

Seifert, U. (2012). “Stochastic thermodynamics, fluctuation theorems and molecular machines.” Reports on Progress in Physics, 75(12), 126001.

Sekimoto, K. (2010). Stochastic Energetics. Springer.

Shannon, C. E. (1948). “A Mathematical Theory of Communication.” Bell System Technical Journal, 27(3), 379-423.

Sterling, P., & Eyer, J. (1988). “Allostasis: A New Paradigm to Explain Arousal Pathology.” In Handbook of Life Stress, Cognition and Health, 629-649.

Suggested citation: Galida, R. S. (2026). Excess Entropy Production as a Candidate Universal Cost of Persistence: A Thermodynamic Foundation for the Attractor Framework. Fantasy Attractor.

The Persistence Functional: A Candidate Formal Foundation for the Attractor Framework; Robert Galida (July 2026) [F]

Abstract

The attractor framework provides a domain-general vocabulary for describing persistence and change across physical, biological, cognitive, and social systems. However, its core variables, $κ$ (corrective permeability), $B$ (basin depth), and $R$ (reality alignment), have been defined inconsistently across application papers, and their formal relationships have remained implicit. This paper proposes a candidate mathematical formalization for the framework.

The central mathematical innovation of this paper is treating persistence as a functional defined over trajectories, $D_{T} (x) = \int_{0}^{T} d (ϕ_{τ} (x), A) \, d τ$, rather than as a scalar property of states. We prove several mathematical properties of $D_{T}$, including non-negativity, monotonicity in $T$, additivity, Lipschitz continuity with respect to initial conditions, and a bound relating $D_{\infty}$ to the recovery rate $κ$: $D_{\infty} (x) \leq \frac{C}{κ} d (x, A)$. We establish connections to dynamic programming and ergodic theory via occupation measures. We introduce a complementary topological persistence functional $P_{topo} (t)$, which measures the lifetime of topological features in the trajectory’s state-space geometry, and the topological evolution rate $E (t)$.

We unify the framework’s variable set: $κ$ is the recovery rate (operationalized as $1 / τ$); $γ$ is a proposed drift rate for persistent chaos, grounded in the literature on high-dimensional neural networks; $B$ is the energy barrier (basin depth); $\tilde{B}$ is a complementary persistence depth; $R$ is the expected log predictive likelihood. We propose testable predictions linking $E (t)$ to $κ$ and $γ$, and provide a falsifiable experimental protocol using neural network training and persistent homology.

The paper offers a candidate formal foundation, with explicit definitions, mathematical properties, and empirical grounding. All unverified sources are clearly labeled as such.

Keywords: attractor framework, persistence functional, cumulative deviation, topological persistence, corrective permeability, basin depth, reality alignment, persistent homology

1. Introduction

The attractor framework has been applied across physics (hydrogen decay, Jeans instability), biology (ECM mechanics, HRV), cognition (belief updating, performance attractors), and social systems (religious attractors, civilizational dynamics). A common vocabulary has emerged: $κ$ (corrective permeability), $B$ (basin depth), and $R$ (reality alignment). However, these variables have been defined inconsistently across papers, and their formal relationships have remained implicit. This paper proposes a candidate mathematical formalization that addresses these inconsistencies.

The central mathematical innovation of this paper is treating persistence as a functional defined over trajectories rather than as a scalar property of states. $D_{T} (x) = \int_{0}^{T} d (ϕ_{τ} (x), A) \, d τ$ can be understood as a type of action functional (carefully qualified). Like the classical action $\int L (q, \dot{q}) \, d t$, it assigns a scalar to an entire trajectory, is additive under concatenation, and suggests variational and optimal-control interpretations. However, it is not the mechanical action; it is a cumulative deviation functional that measures time away from equilibrium. This moves the framework into the domain of trajectory-level analysis, aligning it with modern dynamical systems and geometric control theory.

We introduce the cumulative deviation functional $D_{T} (x)$ as this central object, and we establish its mathematical properties, including its relationship to the recovery rate $κ$. We introduce a complementary topological persistence functional $P_{topo} (t)$ and the topological evolution rate $E (t)$. We unify the framework’s variable set with operational definitions and propose testable predictions with falsification criteria.

1.1 Scope and Status

This paper is a candidate formalization—it provides definitions, mathematical properties, and empirical hypotheses. It is not a completed empirical validation; that is the subject of future work. All claims are labeled as definitions (part of the formal structure), propositions/theorems (proved), hypotheses (testable predictions), or heuristics (suggestive connections not yet formalized). This distinction is maintained throughout.

2. Formal Definitions

Let $X$ be a metric space; for concreteness its distance is written $\| x - y \|$, as in a normed space. Let $ϕ_{τ} (x)$ be the flow of a dynamical system starting from state $x \in X$ at time $τ = 0$. Let $A \subseteq X$ be an attractor set (a compact, invariant set to which trajectories converge). Assume the flow is continuous and measurable so that $d (ϕ_{τ} (x), A)$ is measurable. The flow $ϕ_{τ}$ satisfies the semigroup property $ϕ_{t + s} = ϕ_{t} \circ ϕ_{s}$ for all $t, s \geq 0$, with $ϕ_{0} = \mathrm{id}$. We assume $d (ϕ_{τ} (x), A) \in L^{1} ([0, T])$ for all finite $T$, so the integral defining $D_{T}$ is well-defined.

Define the distance from a point to the attractor: $d (x, A) = \inf_{a \in A} \| x - a \|$

The definition applies to any metric space; for infinite-dimensional spaces, the usual measurability and integrability conditions are assumed.

2.1 Cumulative Deviation Functional

Definition 1 (Cumulative Deviation Functional): For a finite horizon $T > 0$, the cumulative deviation functional is: $D_{T} (x) = \int_{0}^{T} d (ϕ_{τ} (x), A) \, d τ$

Interpretation: $D_{T} (x)$ is the total accumulated deviation from the attractor over the interval $[0, T]$. It measures integrated error, residence-time-weighted distance, or accumulated regret. This is not a path length; it measures time spent away from equilibrium, whereas path length $\int \| {\dot{ϕ}}_{τ} (x) \| \, d τ$ measures distance traveled.
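The difference between cumulative deviation and path length can be seen on a toy trajectory. The sketch below assumes, purely for illustration, the linear flow $\dot{x} = - a x$ with attractor $A = \{0\}$; the parameter values are hypothetical. For this flow the closed forms are $D_{T} = |x_{0}| (1 - e^{- a T}) / a$ and path length $|x_{0}| (1 - e^{- a T})$, so the two differ by the factor $1 / a$.

```python
import math

a = 1.5          # relaxation rate of the toy flow dx/dt = -a*x (assumed)
x0 = 2.0         # initial state (assumed)
T = 4.0          # finite horizon
steps = 40000
dt = T / steps

def flow(t):
    """Exact trajectory phi_t(x0) = x0 * exp(-a*t)."""
    return x0 * math.exp(-a * t)

# Trapezoidal sum for D_T = int_0^T |phi_tau(x0)| d tau  (attractor A = {0}).
D_T = dt * (sum(abs(flow(k * dt)) for k in range(steps + 1))
            - 0.5 * (abs(flow(0.0)) + abs(flow(T))))

# Path length: total distance traveled by the trajectory over [0, T].
path_length = sum(abs(flow((k + 1) * dt) - flow(k * dt)) for k in range(steps))

closed_form_D = abs(x0) * (1.0 - math.exp(-a * T)) / a
closed_form_path = abs(x0) * (1.0 - math.exp(-a * T))
print(D_T, path_length)
```

A slower recovery (smaller `a`) leaves the path length bounded by $|x_{0}|$ but increases $D_{T}$, which is the sense in which the functional measures time spent away from the attractor rather than distance traveled.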

Domain generality: This definition applies to any system with a well-defined state space, a flow, and an attractor set. It does not require linearity, differentiability, or specific functional forms.

Empirical note: $D_{T}$ is the fundamental object for empirical work; $D_{\infty}$, the limit of $D_{T}$ as $T \to \infty$, is primarily an analytical device used for theoretical bounds.

Note: $D_{T}$ is not a Lyapunov function. A Lyapunov function is a scalar function of the current state; $D_{T}$ is a functional of the entire trajectory. It does not decrease monotonically along trajectories, and it does not provide pointwise stability information. Its purpose is to measure accumulated history, not instantaneous energy.

Occupation measure connection: Define the occupation measure of the trajectory up to time $T$ as: $μ_{T} (B) = \int_{0}^{T} 1_{B} (ϕ_{τ} (x)) \, d τ$

for measurable $B \subseteq X$. Then: $D_{T} (x) = \int_{X} d (y, A) \, d μ_{T} (y)$

Thus $D_{T}$ is the integral of the distance to the attractor against the occupation measure. Because $μ_{T}$ has total mass $T$ rather than 1, the expected distance under the normalized occupation measure $μ_{T} / T$ is $D_{T} / T$. This connects the functional directly to ergodic theory and occupation measure analysis. For foundational treatments of occupation measures and invariant measures, see Ruelle (1989) and Bowen (1975).
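The occupation measure representation can be checked numerically. The sketch below bins a sampled trajectory by its distance to the attractor, which gives a discretized occupation measure, and compares the weighted sum with the direct time integral. The decay profile, the bin width, and all names are assumptions made for illustration.

```python
import math

def occupation_histogram(distances, dt, bin_width):
    """Time spent in each distance bin: a discretized occupation measure in the variable d(., A)."""
    hist = {}
    for d in distances:
        k = int(d // bin_width)
        hist[k] = hist.get(k, 0.0) + dt
    return hist

# Illustrative trajectory (assumption): d(phi_tau(x), A) = exp(-tau), sampled at step midpoints
T, n = 5.0, 50_000
dt = T / n
bin_width = 1e-3
distances = [math.exp(-(i + 0.5) * dt) for i in range(n)]

hist = occupation_histogram(distances, dt, bin_width)

# Integral of d against the occupation measure, using bin midpoints as representative distances
d_from_measure = sum((k + 0.5) * bin_width * mass for k, mass in hist.items())

# Direct time integral (midpoint rule) for comparison
d_from_time = sum(d * dt for d in distances)

# The total mass equals T, so the occupation measure is not a probability measure
total_mass = sum(hist.values())
```

Dividing `d_from_measure` by `total_mass` gives the average distance under the normalized occupation measure.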

2.1.1 Why the L¹ Trajectory Functional?

The choice of the L¹ integral over alternatives is motivated by the following properties:

Linearity: Each moment contributes equally; accumulation is additive over time.
Physical units: For systems with a natural distance metric, $D_{T}$ has units of distance × time, which is interpretable as accumulated deviation.
Simplicity: It is the simplest nontrivial trajectory functional that is not a path length.
Analogy: It mirrors cumulative regret and occupation measures in control theory and ergodic theory.
Avoidance of overweighting: Unlike $d^{2}$, it does not disproportionately weight large deviations; unlike max, it is sensitive to the full trajectory.

This is one natural choice; other functionals (e.g., $d^{p}$, exponentially weighted integrals) could be substituted without changing the framework’s structure.

2.2 Topological Persistence Functional

Let $X_{τ} = \{ ϕ_{s} (x) : s \in [0, τ] \}$ be the trajectory segment up to time $τ$. Let ${PH}_{k} (X_{τ})$ be the $k$-dimensional persistent homology of the point cloud $X_{τ}$, computed over a filtration indexed by scale $ϵ$. Each feature (component, loop, void) has a birth scale $b$ and a death scale $d$, with persistence $d - b$ (here $d$ denotes a death scale, not the distance $d (\cdot, A)$). For foundational treatments of persistent homology, see Edelsbrunner & Harer (2010) or Carlsson (2009).

Definition 2 (Topological Persistence Functional): We define the following complementary topological persistence functional. For $t \geq 0$: $P_{topo} (t) = \int_{0}^{t} \sum_{k \geq 0} \sum_{(b, d) \in {PH}_{k} (X_{τ})} (d - b) \, d τ$

The map $τ \mapsto {PH}_{k} (X_{τ})$ is piecewise constant on intervals where the trajectory does not cross a homology-critical threshold. Assuming the trajectory crosses such thresholds at discrete times, the integral is well-defined as a sum of piecewise continuous segments. This is the standard assumption in time-varying persistent homology (see Carlsson & Zomorodian, 2009).

Interpretation: $P_{topo} (t)$ is the total lifetime of all topological features in the trajectory’s state-space geometry, accumulated up to time $t$. It is a separate mathematical object from $D_{T}$; the relationship between them is an empirical hypothesis. This functional is selected because it mirrors the cumulative interpretation of $D_{T}$, not because it is uniquely canonical. Other stable summaries, such as persistence landscapes, persistence images, or Betti curves, could be substituted without changing the framework’s structure.

Measurement: In practice, $P_{topo} (t)$ is computed by sampling the trajectory at discrete times, computing persistent homology on latent activation manifolds, and summing the persistence of all features using standard libraries (e.g., GUDHI, Ripser). Turner & Barak (2023) demonstrated that trained RNNs develop attractors sequentially during training; the topological structure of these attractors can be analyzed using persistent homology.

Falsification: If persistent homology features do not correlate with any behavioral or dynamical measure in a given system, $P_{topo}$ is not a useful construct for that domain.
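The measurement recipe above can be sketched without external dependencies if it is restricted to $k = 0$, which is a simplification made here for illustration. For a Vietoris-Rips filtration, every connected component is born at scale 0 and dies at the length of the minimum-spanning-tree edge that merges it, so the total finite $H_{0}$ persistence equals the weight of the minimum spanning tree. Higher-dimensional features would require a library such as GUDHI or Ripser. The function names and the toy point cloud are hypothetical.

```python
import math

def h0_total_persistence(points):
    """Sum of finite H0 lifetimes of the Vietoris-Rips filtration of a point cloud.

    Equals the minimum-spanning-tree weight (Prim's algorithm); the single
    component that never dies is excluded.
    """
    n = len(points)
    if n < 2:
        return 0.0
    best = [math.dist(points[0], p) for p in points]  # distance from each point to the tree
    in_tree = [False] * n
    in_tree[0] = True
    total = 0.0
    for _ in range(n - 1):
        j = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        total += best[j]
        in_tree[j] = True
        for i in range(n):
            if not in_tree[i]:
                best[i] = min(best[i], math.dist(points[j], points[i]))
    return total

def p_topo_h0(trajectory, dt, stride=1):
    """Riemann-sum approximation of P_topo(t) restricted to k = 0.

    trajectory: list of sampled states; the segment X_tau is the first m samples.
    """
    total = 0.0
    for m in range(stride, len(trajectory) + 1, stride):
        total += h0_total_persistence(trajectory[:m]) * dt * stride
    return total

# Toy trajectory visiting the corners of a unit square
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
square_persistence = h0_total_persistence(square)
square_p_topo = p_topo_h0(square, dt=1.0)
```

The `stride` argument trades accuracy for cost, since recomputing homology on every prefix of a long trajectory is expensive.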

2.3 Topological Evolution Rate

Definition 3 (Topological Evolution Rate): For a learning system with time-dependent topological persistence, the topological evolution rate is defined as: $E (t) = \frac{d}{d t} P_{topo} (t)$

where differentiable, and experimentally as $E (t) \approx \frac{Δ P_{topo}}{Δ t}$ over finite intervals.

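Interpretation: $E (t)$ measures how quickly the system’s topological complexity changes during learning. Negative $E (t)$ indicates topological simplification (compression); positive $E (t)$ indicates increasing complexity (expansion); $E (t) \approx 0$ indicates stagnation. Note that Definition 2 integrates non-negative lifetimes, so $P_{topo}$ as written is non-decreasing and its derivative cannot be negative; the signed reading therefore presupposes that $E (t)$ is computed from a non-cumulative summary, such as the instantaneous total persistence $\sum_{k} \sum_{(b, d)} (d - b)$ at time $t$. Learning is one possible cause of topological change; random drift, noise, or chaotic wandering can also change topology.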
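The finite-difference estimate and the three-way reading above can be sketched as follows. The function names, the stagnation tolerance `tol`, and the sample values are all hypothetical. The values stand for a sampled persistence summary that is allowed to decrease; for a strictly cumulative summary the differences would be non-negative by construction.

```python
def evolution_rate(summary_values, dt):
    """Forward differences: E(t_i) is approximated by (P(t_{i+1}) - P(t_i)) / dt."""
    return [(b - a) / dt for a, b in zip(summary_values, summary_values[1:])]

def label_phase(e, tol=1e-3):
    """Map a rate to the reading in the text; tol is a hypothetical stagnation tolerance."""
    if e < -tol:
        return "simplification"
    if e > tol:
        return "expansion"
    return "stagnation"

# Made-up samples of a persistence summary, for illustration only
samples = [4.0, 3.0, 2.5, 2.5, 3.5]
rates = evolution_rate(samples, dt=0.5)
phases = [label_phase(e) for e in rates]
```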

Empirical anchor: Karuppiah, Nazreen Banu et al. (2026) examine the evolution of topological signatures during training. Turner & Barak (2023) show that RNNs develop attractors sequentially, which may correspond to phases of topological simplification. We hypothesize that successful learning corresponds to negative average values of $E (t)$ over defined phases, but this is a testable claim, not a definition.

3. Mathematical Properties of the Cumulative Deviation Functional

This section establishes the mathematical behavior of $D_{T}$, providing the foundation for its use in the framework.

3.1 Non-negativity

Proposition 1 (Non-negativity): For any $x \in X$ and any $T \geq 0$: $D_{T} (x) \geq 0$

with equality iff $ϕ_{τ} (x) \in A$ for almost all $τ \in [0, T]$.

Proof: The integrand $d (ϕ_{τ} (x), A)$ is a distance and is therefore non-negative. The integral of a non-negative function is non-negative, and it vanishes only if the integrand is zero almost everywhere. Because $A$ is compact and hence closed, $d (y, A) = 0$ exactly when $y \in A$.

3.2 Monotonicity in $T$

Proposition 2 (Monotonicity): For fixed $x$, $D_{T} (x)$ is monotonically non-decreasing in $T$: $D_{T_{2}} (x) \geq D_{T_{1}} (x) \text{ for } T_{2} \geq T_{1}$

Proof: For $T_{2} \geq T_{1}$: $D_{T_{2}} (x) = \int_{0}^{T_{1}} d (ϕ_{τ} (x), A) \, d τ + \int_{T_{1}}^{T_{2}} d (ϕ_{τ} (x), A) \, d τ$

The second integral is non-negative because its integrand is non-negative (as in Proposition 1). Therefore $D_{T_{2}} (x) \geq D_{T_{1}} (x)$.

Corollary: If the trajectory reaches the attractor at a finite time $τ_{0} < T$, then by invariance of $A$ it remains there, and: $D_{T} (x) = D_{τ_{0}} (x) \text{ for all } T \geq τ_{0}$

3.3 Additivity

Proposition 3 (Additivity): For any $T, S \geq 0$: $D_{T + S} (x) = D_{T} (x) + D_{S} (ϕ_{T} (x))$

Proof: $\begin{aligned} D_{T + S} (x) & = \int_{0}^{T + S} d (ϕ_{τ} (x), A) \, d τ \\ & = \int_{0}^{T} d (ϕ_{τ} (x), A) \, d τ + \int_{T}^{T + S} d (ϕ_{τ} (x), A) \, d τ \\ & = D_{T} (x) + \int_{0}^{S} d (ϕ_{τ + T} (x), A) \, d τ \\ & = D_{T} (x) + \int_{0}^{S} d (ϕ_{τ} (ϕ_{T} (x)), A) \, d τ \quad \text{(by the semigroup property)} \\ & = D_{T} (x) + D_{S} (ϕ_{T} (x)) \end{aligned}$

This connects $D_{T}$ naturally to Bellman equations, dynamic programming, and occupation measures.
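Additivity is easy to confirm numerically. The sketch below assumes the scalar flow $\dot{x} = - κ x$ with $A = \{0\}$ and a midpoint-rule quadrature; the names `flow` and `D` and the parameter values are illustrative only.

```python
import math

def flow(x, t, kappa=1.3):
    """Illustrative flow (assumption): solution of x' = -kappa * x, with attractor A = {0}."""
    return x * math.exp(-kappa * t)

def D(x, T, n=20_000):
    """Midpoint-rule approximation of D_T(x) for the flow above."""
    h = T / n
    return sum(abs(flow(x, (i + 0.5) * h)) for i in range(n)) * h

x, T, S = 2.0, 0.7, 1.9
lhs = D(x, T + S)                  # D_{T+S}(x)
rhs = D(x, T) + D(flow(x, T), S)   # D_T(x) + D_S(phi_T(x))
```

The two sides agree up to quadrature error, which is the discrete counterpart of restarting the integral from the state $ϕ_{T} (x)$.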

3.4 Heuristic Connection: Dynamic Programming

The additivity property $D_{T + S} (x) = D_{T} (x) + D_{S} (ϕ_{T} (x))$ suggests a natural connection to dynamic programming. For a controlled system $\dot{X} = f (X, u)$ with control $u \in U$, the value function $V (x) = \inf_{u} D_{\infty} (x)$ would formally satisfy the Hamilton-Jacobi-Bellman equation: $0 = \inf_{u} \{ d (x, A) + \nabla V (x) \cdot f (x, u) \}$

This is a standard result for additive cost functionals. A full derivation for the specific functional $D_{T}$ DT is left for future work. This section is a heuristic connection, not a formal result.
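As a quick consistency check of the stationary equation, consider the simplest uncontrolled case, which is an illustrative assumption and not an example from the source: the scalar flow $\dot{x} = - κ x$ with $A = \{0\}$, so that there is no infimum over $u$ to take and the candidate value function is $D_{\infty}$ itself.

```latex
\begin{aligned}
d(x, A) &= |x|, \qquad \phi_{\tau}(x) = x e^{-\kappa \tau}, \\
V(x) &= D_{\infty}(x) = \int_{0}^{\infty} |x| e^{-\kappa \tau} \, d\tau = \frac{|x|}{\kappa}, \\
d(x, A) + V'(x) \, f(x) &= |x| + \frac{\operatorname{sign}(x)}{\kappa} \, (-\kappa x) = |x| - |x| = 0 \qquad (x \neq 0).
\end{aligned}
```

At $x = 0$ the function $V$ is not differentiable, so a rigorous treatment would need a weaker solution concept (for example, viscosity solutions).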

3.5 Lipschitz Continuity with Respect to Initial Conditions

Proposition 4 (Lipschitz Continuity of $D_{T}$): Suppose the flow $ϕ_{τ}$ satisfies the growth bound $∥ ϕ_{τ} (x) - ϕ_{τ} (y) ∥ \leq e^{L τ} ∥ x - y ∥$ for some constant $L > 0$; by Grönwall’s inequality, this holds whenever the generating vector field is Lipschitz in $x$ with constant $L$. Then for any $x, y$ in the basin of $A$: $| D_{T} (x) - D_{T} (y) | \leq \int_{0}^{T} e^{L τ} \, d τ \, ∥ x - y ∥ = \frac{e^{L T} - 1}{L} ∥ x - y ∥$

Proof: First, note that the distance function $d (\cdot, A)$ is 1-Lipschitz: for any $x, y \in X$, $| d (x, A) - d (y, A) | \leq ∥ x - y ∥$

This follows from the triangle inequality and the definition of the infimum. Then, using the growth bound on the flow: $\begin{aligned} | D_{T} (x) - D_{T} (y) | & \leq \int_{0}^{T} | d (ϕ_{τ} (x), A) - d (ϕ_{τ} (y), A) | \, d τ \\ & \leq \int_{0}^{T} ∥ ϕ_{τ} (x) - ϕ_{τ} (y) ∥ \, d τ \\ & \leq \int_{0}^{T} e^{L τ} ∥ x - y ∥ \, d τ \\ & = \frac{e^{L T} - 1}{L} ∥ x - y ∥ \end{aligned}$

Interpretation: This proposition guarantees that, for each finite $T$, empirical estimates of $D_{T}$ are robust under small perturbations of initial conditions, and that $D_{T}$ defines a continuous functional on the basin of attraction. The constant grows exponentially in $T$, so the guarantee weakens over long horizons. This is essential for numerical estimation and experimental measurement.
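The bound can be compared with an exactly solvable case. The sketch below assumes the scalar flow $\dot{x} = - κ x$ with $A = \{0\}$, whose vector field is Lipschitz with constant $L = κ$; the function names and numbers are hypothetical.

```python
import math

def lipschitz_bound(L, T, dx):
    """Right-hand side of Proposition 4; the L -> 0 limit of (e^{LT} - 1) / L is T."""
    factor = T if L == 0 else math.expm1(L * T) / L
    return factor * dx

def D_exact(x, T, kappa):
    """Closed form of D_T for the illustrative flow x' = -kappa * x with A = {0}."""
    return abs(x) * (1.0 - math.exp(-kappa * T)) / kappa

kappa, T = 0.8, 4.0
x, y = 1.0, 1.05

# Actual change in D_T between the two initial conditions
gap = abs(D_exact(x, T, kappa) - D_exact(y, T, kappa))

# Bound from Proposition 4, using L = kappa as the Lipschitz constant of the vector field
bound = lipschitz_bound(L=kappa, T=T, dx=abs(x - y))
```

In this example the bound is far from tight, because it uses only the Lipschitz constant of the vector field and ignores the fact that the flow is contracting.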

3.6 Instantaneous Growth Rate

Remark 1 (Instantaneous Growth Rate): If the integrand $d (ϕ_{τ} (x), A)$ is continuous in $τ$, then: $\frac{d}{d T} D_{T} (x) = d (ϕ_{T} (x), A)$

This follows directly from the Fundamental Theorem of Calculus.

3.7 Ergodic Limit

Proposition 5 (Ergodic Limit): Suppose the normalized occupation measure $ν_{T} = μ_{T} / T$ converges weakly to an invariant probability measure $μ$ as $T \to \infty$, and that the trajectory remains in a bounded subset of $X$. Then: $\lim_{T \to \infty} \frac{1}{T} D_{T} (x) = \int_{X} d (y, A) \, d μ (y)$

Proof: From the occupation measure representation $D_{T} (x) = \int d (y, A) \, d μ_{T} (y) = T \int d (y, A) \, d ν_{T} (y)$, we have $\frac{1}{T} D_{T} (x) = \int d (y, A) \, d ν_{T} (y)$. The function $d (\cdot, A)$ is continuous and, on the bounded region containing the trajectory, bounded, so weak convergence of $ν_{T}$ to $μ$ gives the result.

The limit is the long-run time average of the observable $d (\cdot, A)$. When $μ$ is ergodic and $d (\cdot, A)$ is $μ$-integrable, Birkhoff’s pointwise ergodic theorem guarantees convergence of this time average for $μ$-almost every initial condition. For the ergodic theory of dynamical systems, see Bowen (1975) and Ruelle (1989).

3.8 Bound under Exponential Stability

Theorem 2 (Bound under Exponential Stability): Suppose the flow $ϕ_{τ} (x)$ converges to the attractor $A$ with exponential rate $κ > 0$: $d (ϕ_{τ} (x), A) \leq C e^{- κ τ} d (x, A)$

for some constant $C < \infty$, for all $τ \geq 0$. Then: $D_{\infty} (x) = \int_{0}^{\infty} d (ϕ_{τ} (x), A) \, d τ \leq \frac{C}{κ} d (x, A)$

Proof: $\begin{aligned} D_{\infty} (x) & = \int_{0}^{\infty} d (ϕ_{τ} (x), A) \, d τ \leq \int_{0}^{\infty} C e^{- κ τ} d (x, A) \, d τ \\ & = C d (x, A) \int_{0}^{\infty} e^{- κ τ} \, d τ = \frac{C}{κ} d (x, A) \end{aligned}$

Corollary: For linearly stable systems with recovery rate $κ$, $D_{\infty} (x) \leq \frac{1}{κ} d (x, A)$ (when $C = 1$).

Important: Exponential stability implies $D_{\infty} < \infty$. The converse is not claimed; sufficiently fast polynomial convergence (for example, $d (ϕ_{τ} (x), A) \sim τ^{- p}$ with $p > 1$) also yields finite $D_{\infty}$.
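Two closed-form cases illustrate the theorem and the remark on the converse. Both decay profiles are illustrative assumptions rather than examples taken from the source.

```latex
% Exact exponential decay: the bound of Theorem 2 is attained with C = 1.
d(\phi_{\tau}(x), A) = e^{-\kappa \tau} d(x, A)
\;\Longrightarrow\;
D_{\infty}(x) = d(x, A) \int_{0}^{\infty} e^{-\kappa \tau} \, d\tau = \frac{d(x, A)}{\kappa}.

% Polynomial decay with exponent p: D_infinity is finite exactly when p > 1.
d(\phi_{\tau}(x), A) = \frac{d(x, A)}{(1 + \tau)^{p}}
\;\Longrightarrow\;
D_{\infty}(x) =
\begin{cases}
\dfrac{d(x, A)}{p - 1}, & p > 1, \\[4pt]
\infty, & p \leq 1.
\end{cases}
```

The first case shows that the constant $C / κ$ cannot be improved in general; the second shows that finiteness of $D_{\infty}$ alone does not identify an exponential rate.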

3.9 Recovery Rate Bound

Corollary 1 (Recovery Rate Bound): For a system satisfying the exponential stability hypothesis with constant $C$, and for any $x \notin A$ (so that $D_{\infty} (x) > 0$), the recovery rate $κ$ satisfies: $κ \leq \frac{C d (x, A)}{D_{\infty} (x)}$

For systems with $C = 1$ (e.g., normal/symmetric linearizations with no transient overshoot), this reduces to: $κ \leq \frac{d (x, A)}{D_{\infty} (x)}$

Proof: From Theorem 2, we have $D_{\infty} (x) \leq \frac{C}{κ} d (x, A)$. Multiplying both sides by $κ / D_{\infty} (x) > 0$ gives $κ \leq \frac{C d (x, A)}{D_{\infty} (x)}$. When $C = 1$, this reduces to $κ \leq \frac{d (x, A)}{D_{\infty} (x)}$.

Interpretation: The corollary is an upper bound on $κ$. Large cumulative deviation relative to the initial deviation forces slow recovery (small $κ$). Small cumulative deviation permits rapid recovery (large $κ$) but does not by itself guarantee it. This formalizes the intuitive link between $D_{T}$ and $κ$. The $C$ factor accounts for possible transient overshoot in non-normal systems.
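A sketch of how the bound might be evaluated from sampled data follows. It substitutes the finite-horizon $D_{T}$ for $D_{\infty}$; since $D_{T} \leq D_{\infty}$, the resulting ratio is no smaller than the bound in Corollary 1, up to quadrature error. The function name, the synthetic decay, and the choice $C = 1$ are assumptions.

```python
import math

def kappa_upper_bound(distances, dt, C=1.0):
    """Evaluate C * d(x, A) / D_T from samples d_i = d(phi_{i*dt}(x), A).

    D_T is approximated with the trapezoid rule over the sampled horizon.
    """
    d_T = dt * (sum(distances) - 0.5 * (distances[0] + distances[-1]))
    if d_T <= 0.0:
        raise ValueError("trajectory starts on the attractor; the bound is undefined")
    return C * distances[0] / d_T

# Synthetic decay with true rate kappa = 0.5 over a horizon T = 40 (illustration only)
dt = 0.01
samples = [math.exp(-0.5 * i * dt) for i in range(4001)]
bound = kappa_upper_bound(samples, dt)
```

For exact exponential decay with $C = 1$ the bound is attained, so `bound` is close to the true rate; for trajectories with overshoot, a value of $C > 1$ would have to be supplied separately.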

3.10 Finite Horizon Approximation

Proposition 6 (Finite Horizon): Under the exponential stability hypothesis of Theorem 2, for any $ϵ > 0$ there exists a finite $T_{ϵ}$ such that for all $T > T_{ϵ}$: $| D_{T} (x) - D_{\infty} (x) | \leq ϵ$

Proof: This follows directly from Theorem 2 under the exponential stability hypothesis. Since the integrand decays exponentially, the tail integral $\int_{T}^{\infty} d (ϕ_{τ} (x), A) \, d τ$ can be made arbitrarily small by choosing $T$ sufficiently large.
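Under the hypothesis of Theorem 2, the horizon can be written explicitly for $x \notin A$. The following short derivation is a sketch that uses only the exponential bound:

```latex
\begin{aligned}
| D_{T}(x) - D_{\infty}(x) |
  &= \int_{T}^{\infty} d(\phi_{\tau}(x), A) \, d\tau
   \leq C \, d(x, A) \int_{T}^{\infty} e^{-\kappa \tau} \, d\tau
   = \frac{C \, d(x, A)}{\kappa} e^{-\kappa T}, \\
\frac{C \, d(x, A)}{\kappa} e^{-\kappa T} \leq \epsilon
  &\iff T \geq \frac{1}{\kappa} \ln \frac{C \, d(x, A)}{\kappa \, \epsilon},
\qquad\text{so}\qquad
T_{\epsilon} = \max\left\{ 0, \; \frac{1}{\kappa} \ln \frac{C \, d(x, A)}{\kappa \, \epsilon} \right\}.
\end{aligned}
```

For example, with $C = 1$, $κ = 2$, $d (x, A) = 1$, and $ϵ = 10^{-3}$, this gives $T_{ϵ} = \frac{1}{2} \ln 500 \approx 3.11$.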

3.11 Summary of Properties

Property	Statement
Non-negativity	$D_{T} (x) \geq 0$
Monotonicity	$D_{T_{2}} (x) \geq D_{T_{1}} (x)$ for $T_{2} \geq T_{1}$
Additivity	$D_{T + S} (x) = D_{T} (x) + D_{S} (ϕ_{T} (x))$
Lipschitz continuity	$| D_{T} (x) - D_{T} (y) | \leq \frac{e^{L T} - 1}{L} ∥ x - y ∥$
Instantaneous growth	$\frac{d}{d T} D_{T} (x) = d (ϕ_{T} (x), A)$
Ergodic limit	$\lim_{T \to \infty} \frac{1}{T} D_{T} (x) = \int d (y, A) \, d μ (y)$
Exponential stability implies finite $D_{\infty}$	$D_{\infty} (x) \leq \frac{C}{κ} d (x, A)$
Recovery bound (general)	$κ \leq \frac{C d (x, A)}{D_{\infty} (x)}$
Recovery bound ($C = 1$)	$κ \leq \frac{d (x, A)}{D_{\infty} (x)}$
Finite horizon approximation	$D_{T} (x) \to D_{\infty} (x)$ as $T \to \infty$

4. The Unified Variable Set

The following variables are defined operationally. Where a variable is a proposal, that is stated explicitly.

4.1 Corrective Permeability ($κ$)

Definition 4 (Corrective Permeability): $κ$ is the recovery rate of the system to its attractor after a small perturbation. Operationally estimated as $κ = 1 / τ$ under approximately exponential relaxation, where $τ$ is the characteristic recovery time constant. This coincides with the exponential convergence exponent in the linearized regime and is consistent with the original definition in the attractor framework.

Relationship to $D_{T}$: From Corollary 1, for a system with initial deviation $d (x, A)$, $κ \leq \frac{C d (x, A)}{D_{\infty} (x)}$.

Note on $κ$’s status: In this paper, $κ$ is treated as a primitive empirical regime parameter. A stronger theory would derive $κ$ from $D_{T}$ and system geometry; this remains an open direction for future work.
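One way to obtain the operational estimate is a least-squares fit of $\log d$ against time, which is exact for noise-free exponential relaxation. The sketch below is an illustration under that assumption; the function name and the synthetic data are hypothetical, and all distances must be strictly positive.

```python
import math

def estimate_kappa(times, distances):
    """Least-squares slope of log(distance) versus time; returns kappa = -slope.

    The characteristic recovery time is then tau = 1 / kappa.
    """
    logs = [math.log(d) for d in distances]
    n = len(times)
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    cov = sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
    var = sum((t - t_mean) ** 2 for t in times)
    return -cov / var

# Synthetic, noise-free relaxation with true rate 2.5 (illustration only)
times = [0.1 * i for i in range(50)]
distances = [0.3 * math.exp(-2.5 * t) for t in times]
kappa_hat = estimate_kappa(times, distances)
tau_hat = 1.0 / kappa_hat
```

With noisy measurements the fit should be restricted to the window in which relaxation is approximately exponential, since the estimate is sensitive to samples near the noise floor.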

4.2 Drift Rate ($γ$): A Proposed Distinction

Definition 5 (Drift Rate): We propose the following operational distinction between dynamical regimes, based on the dominant Lyapunov exponent $λ_{\max}$:

Regime	$λ_{\max}$	$κ$	$γ$	Behavior
Stable attractor	$< - 0.01$	$> 0$	$0$	Converges to fixed point
Persistent chaos	$\approx 0$	$\approx 0$	$> 0$	Wanders without convergence
Full chaos	$> 0.01$	undefined	$> 0$	Diverges

Thresholds: $λ_{\max} < - 0.01$ (stable attractor), $| λ_{\max} | \leq 0.01$ (persistent chaos), and $λ_{\max} > 0.01$ (full chaos), pre-registered and measured in units of 1/epoch. These numerical thresholds are illustrative defaults rather than theoretically privileged constants.
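The thresholding rule can be written as a small classifier. This is a sketch: the function name is hypothetical, and the default threshold of 0.01 per epoch is the illustrative value quoted above rather than a recommended constant.

```python
def classify_regime(lambda_max, threshold=0.01):
    """Assign a dynamical regime from the dominant Lyapunov exponent (units of 1/epoch)."""
    if lambda_max < -threshold:
        return "stable attractor"
    if lambda_max > threshold:
        return "full chaos"
    # Remaining case: |lambda_max| <= threshold
    return "persistent chaos"

labels = [classify_regime(v) for v in (-0.5, 0.0, 0.01, 0.2)]
```

Because estimated exponents carry sampling error, values close to either threshold would in practice need a confidence interval before a regime is assigned.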

Grounding: This distinction is inspired by the literature on chaos in high-dimensional neural networks (Engelken, Wolf & Abbott, 2023; Sompolinsky, Crisanti & Sommers, 1988; Clark, Abbott & Litwin-Kumar, 2023; Fournier & Urbani, 2023). For the treatment of stochastic and random perturbations, see Arnold (1998).

Falsification: If $κ$ and $γ$ are perfectly correlated (i.e., systems with small $κ$ always have small $γ$), the distinction is not useful.

4.3 Basin Depth ($B$) and Persistence Depth ($\tilde{B}$)

Definition 6a (Basin Depth: Energy Barrier): $B$ is the energy barrier required to escape the basin, measured as the potential difference between the attractor and the saddle point on the basin boundary: $B = V (\text{saddle}) - V (\text{attractor})$

This preserves the original definition from earlier papers.

Definition 6b (Persistence Depth): As a complementary measure, we define: $\tilde{B} = \min_{x \in \partial B} D_{T} (x)$

Here $\partial B$ denotes the boundary of the basin of attraction, not a derivative of the barrier $B$. $\tilde{B}$ is interpreted as the cumulative deviation required to reach the basin boundary. The relationship between $B$ and $\tilde{B}$ remains an open mathematical question.

Operational alternative: In practice, the basin boundary may not be well-defined. Estimate $B$ via the Arrhenius relationship $P_{escape} \propto e^{- B / T}$, where $T$ here is the noise level (not the time horizon of Definition 1).
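Under the Arrhenius form, $\log P_{escape}$ is linear in the inverse noise level with slope $- B$, so $B$ can be read off a straight-line fit. The sketch below assumes that escape probabilities have already been measured at several noise levels; the function name and the synthetic numbers are hypothetical.

```python
import math

def estimate_barrier(noise_levels, escape_probs):
    """Fit log P_escape = c - B / T by least squares in the variable 1 / T; returns B."""
    xs = [1.0 / t for t in noise_levels]
    ys = [math.log(p) for p in escape_probs]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    return -slope

# Synthetic escape probabilities generated with B = 0.7 and prefactor 0.9 (illustration only)
noise = [0.2, 0.3, 0.5, 0.8, 1.0]
probs = [0.9 * math.exp(-0.7 / t) for t in noise]
b_hat = estimate_barrier(noise, probs)
```

The unknown prefactor is absorbed by the intercept, so only the slope is needed.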

4.4 Reality Alignment ($R$)

Definition 7 (Reality Alignment): $R$ is the expected log predictive likelihood: $R = \mathbb{E} [\log p (y \mid X)]$

where $p (y \mid X)$ is the system’s predictive distribution over outcomes $y$ given state $X$. Higher $R$ indicates better predictive accuracy. This is a standard measure of predictive performance; the label “reality alignment” is a philosophical interpretation.

Direction-dependence: The framework interprets $R$ as potentially direction-dependent: $R_{A \to B} \neq R_{B \to A}$. This captures the asymmetry found in Berglund et al. (2024), where models trained on “A is B” fail to generalize to “B is A.” This interpretation is a framework-level claim.

Note on integration: Among the core variables, $R$ is the least integrated with the trajectory-based formalism. Unlike $κ$, $B$, and $\tilde{B}$, which are directly derived from or related to $D_{T}$, $R$ is imported from Bayesian statistics. A more complete theoretical derivation of $R$ from the same dynamical principles (perhaps as an information-theoretic functional of the occupation measure) remains an open direction for future work.
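In practice the expectation is replaced by a sample mean over observed outcomes. The sketch below assumes categorical predictions stored as dictionaries from outcome to probability; this data layout and the function name are illustrative choices, not part of the definition.

```python
import math

def reality_alignment(pred_probs, outcomes):
    """Sample mean of log p(y | X).

    pred_probs[i] maps each possible outcome to its predicted probability;
    outcomes[i] is the outcome that actually occurred.
    """
    return sum(math.log(p[y]) for p, y in zip(pred_probs, outcomes)) / len(outcomes)

# Two made-up predictions, both followed by outcome "a"
preds = [{"a": 0.5, "b": 0.5}, {"a": 0.25, "b": 0.75}]
r = reality_alignment(preds, ["a", "a"])  # (log 0.5 + log 0.25) / 2
```

Values are at most 0 for discrete outcomes, and a predicted probability of zero for an observed outcome makes the estimate undefined, so probabilities are usually clipped away from zero before scoring.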

5. Theoretical Framework

5.1 Relationship Between $D_{T}$, $P_{topo}$, and $E (t)$

Functional	What It Measures	Regime
$D_{T} (x)$	Cumulative deviation from attractor	All systems
$P_{topo} (t)$	Topological feature lifetime	Systems with topological structure
$E (t)$	Rate of topological change	Learning systems

Hypothesis: In learning systems, $D_{T}$ and $P_{topo}$ are positively correlated early in learning and negatively correlated late in learning. Turner & Barak (2023) demonstrate that RNNs develop attractors sequentially during training, which may correspond to phases of topological simplification. This is a testable prediction.

5.2 Relationship Between $κ$, $γ$, and $E (t)$

Hypothesis: In a learning system, the topological evolution rate $E (t)$ is monotonically related to $κ$ only if the system is not in persistent chaos: $\partial E / \partial κ > 0$ (with $E$ and $κ$ measured on appropriate scales) in convergent regimes. In persistent chaos, $E (t)$ is monotonically related to $γ$: $\partial E / \partial γ > 0$. Correlation analysis provides a statistical test of these monotonicity relationships.

5.3 Adaptive Landscape (Heuristic Note)

The adaptive landscape $V (X, t)$ evolves as: $\dot{V} = g (X, V) - λ V + ξ (t)$

For gradient systems with $\dot{X} = - \nabla_{X} V (X)$, and assuming the dynamics remain within the basin where higher-order nonlinearities are negligible, the cumulative deviation functional can be approximated as: $D_{T} (x) \approx \int_{0}^{T} ∥ \nabla_{X} V (ϕ_{τ} (x), τ) ∥ \, d τ$. For instance, for the quadratic potential $V (X) = \frac{1}{2} ∥ X - a ∥^{2}$ with $A = \{ a \}$, we have $∥ \nabla_{X} V (X) ∥ = ∥ X - a ∥ = d (X, A)$, and the relation is exact.

This is a local heuristic. A full derivation and integration into the core formalism is left for future work.

6. Testable Predictions

6.1 Core Prediction

Prediction: In a learning system, $E (t)$ is monotonically related to $κ$ in convergent regimes: $\partial E / \partial κ > 0$ (with $E$ and $κ$ measured on appropriate scales), and $\partial E / \partial γ > 0$ in persistent chaos. Correlation analysis provides a statistical test of this monotonicity: $\mathrm{Corr} (E (t), κ) > 0 \iff λ_{\max} < 0$ and $\mathrm{Corr} (E (t), γ) > 0 \iff λ_{\max} \approx 0$.

Falsification: If $E (t)$ correlates with $κ$ in all regimes, or with $γ$ in all regimes, the prediction is falsified.

6.2 Secondary Prediction

Prediction: In systems with high $R$, $D_{T}$ and $P_{topo}$ are negatively correlated late in learning; in systems with low $R$, they are uncorrelated or positively correlated.

Falsification: If $D_{T}$ and $P_{topo}$ are negatively correlated in both high-$R$ and low-$R$ systems, the prediction is falsified.

6.3 Boundary Condition and Global Falsifier

Conjecture: We conjecture that the framework applies to any system satisfying:

A. Well-defined state space.
B. Subject to perturbations.
C. Exhibits at least one identifiable attractor.
D. Dynamics are observable and measurable.

Global Falsifier: The unified ontology claim collapses if a system is found where $D_{T}$, $κ$, and topological persistence are mutually independent across all regimes, and where $R$ cannot be expressed as a functional of the trajectory or occupation measure. If such a system exists, the framework’s claim to unify persistence, stability, and reality alignment would be falsified.

7. Experimental Design

7.1 System Choice

Train a CNN on MNIST or CIFAR-10. Use latent activation manifolds for topological analysis.

Justification: Karuppiah, Nazreen Banu et al. (2026) demonstrate the use of persistent homology on activations to study feature learning and generalization. Turner & Barak (2023) show that RNNs develop attractors sequentially, providing a controlled setting for studying topological evolution during learning.

7.2 Variable Measurement

Variable	Protocol
$D_{T} (x)$	Sample weights; compute distance to final attractor; integrate.
$P_{topo} (t)$	Compute persistent homology on latent activations; sum feature lifetimes.
$E (t)$	Finite differences of $P_{topo} (t)$.
$κ$	Perturb weights; measure recovery time $τ$; $κ = 1 / τ$.
$γ$	Compute average drift rate during training.
$R$	Cross-domain generalization accuracy.

7.3 Statistical Analysis

Correlate $E (t)$ with $κ$ and $γ$ conditional on regime.
Pre-register thresholds and sample size.
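The regime-conditional correlation can be sketched as follows. Pearson correlation is used here as one possible choice; a rank correlation would be a closer match to a pure monotonicity test. The record keys, the minimum group size of three, and the sample records are all hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def correlations_by_regime(records, threshold=0.01):
    """Correlate E with kappa in convergent runs and E with gamma in persistent chaos.

    records: dicts with hypothetical keys 'lambda_max', 'E', 'kappa', 'gamma'.
    """
    convergent = [r for r in records if r["lambda_max"] < -threshold]
    chaotic = [r for r in records if abs(r["lambda_max"]) <= threshold]
    out = {}
    if len(convergent) >= 3:
        out["corr_E_kappa"] = pearson([r["kappa"] for r in convergent],
                                      [r["E"] for r in convergent])
    if len(chaotic) >= 3:
        out["corr_E_gamma"] = pearson([r["gamma"] for r in chaotic],
                                      [r["E"] for r in chaotic])
    return out

# Made-up records, for illustration only
records = [
    {"lambda_max": -0.5, "E": 0.2, "kappa": 0.1, "gamma": 0.0},
    {"lambda_max": -0.3, "E": 0.4, "kappa": 0.2, "gamma": 0.0},
    {"lambda_max": -0.2, "E": 0.9, "kappa": 0.4, "gamma": 0.0},
    {"lambda_max": 0.0, "E": 0.1, "kappa": 0.0, "gamma": 0.05},
    {"lambda_max": 0.005, "E": 0.3, "kappa": 0.0, "gamma": 0.2},
    {"lambda_max": -0.004, "E": 0.2, "kappa": 0.0, "gamma": 0.1},
]
result = correlations_by_regime(records)
```

Splitting by regime before correlating is what distinguishes this test from a pooled correlation, which the falsification criterion in Section 6.1 treats as evidence against the prediction.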

Note on future empirical work: A full empirical validation would require pre-registration with specified sample size, significance thresholds, power analysis, and robustness checks. These are planned for subsequent work.

8. Discussion

8.1 Implications

The paper provides a candidate formalization with defined variables, mathematical properties, and testable predictions. The mathematical properties of $D_{T}$ establish its relationship to $κ$ and provide a foundation for the framework’s core claims.

8.2 Limitations

$P_{topo}$ is computationally expensive.
The framework is a meta-theory, not a complete domain-specific theory.
Variables may be confounded; causal inference requires controlled experiments.
The $κ / γ$ regime distinction is proposed and requires empirical validation.

8.3 Future Work

Empirical validation of predictions.
Formal derivation of relationships from first principles.
Extension to other domains.
Computational efficiency improvements.

9. Conclusion

This paper proposes a candidate formalization for the attractor framework. The central mathematical innovation is treating persistence as a functional defined over trajectories, $D_{T} (x) = \int_{0}^{T} d (ϕ_{τ} (x), A) \, d τ$, rather than as a scalar property of states. We defined the cumulative deviation functional $D_{T}$, the topological persistence functional $P_{topo} (t)$, and the topological evolution rate $E (t)$. We proved several mathematical properties of $D_{T}$, including non-negativity, monotonicity, additivity, Lipschitz continuity, and a bound relating $D_{\infty}$ to $κ$: $D_{\infty} (x) \leq \frac{C}{κ} d (x, A)$. We noted a heuristic connection to dynamic programming and established an ergodic limit. We unified the variable set with operational definitions. We derived testable predictions and provided a falsifiable experimental protocol.

The framework now admits formal definitions, operational variables, and empirical tests. The next step is empirical validation.

Appendix A: Possible Extensions from Larose (2025) — Unverified Source

Note: The following source has not been independently verified. It is included for completeness and as a potential direction for future exploration, but should not be treated as established.

Larose (2025) develops a framework for recursive deformation systems. Two constructs are potentially relevant:

Constraint Functional: $C (X) = \int_{\text{trajectory}} ∥ \nabla Φ ∥ \, d τ$, measuring cumulative irreversible deformation.

Persistence Invariant: $I_{p} = \oint R \, d Φ$, a topological invariant.

These are not yet integrated into the core framework and are presented here for completeness and future exploration. They should be treated as unverified candidate extensions.

References

Arnold, L. (1998). Random Dynamical Systems. Springer.

Berglund, L., et al. (2024). “The Reversal Curse: LLMs Trained on ‘A is B’ Fail to Learn ‘B is A’.” arXiv:2309.12288.

Bowen, R. (1975). Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms. Springer.

Carlsson, G. (2009). “Topology and data.” Bulletin of the American Mathematical Society, 46(2), 255-308.

Carlsson, G., & Zomorodian, A. (2009). “The theory of multidimensional persistence.” Discrete & Computational Geometry, 42(1), 71-93.

Clark, D. G., Abbott, L. F., & Litwin-Kumar, A. (2023). “Dimension of activity in random neural networks.” Physical Review Letters, 131, 118401.

Edelsbrunner, H., & Harer, J. (2010). Computational Topology: An Introduction. American Mathematical Society.

Engelken, R., Wolf, F., & Abbott, L. F. (2023). “Lyapunov spectra of chaotic recurrent neural networks.” Physical Review Research, 5, 043044.

Fournier, S. J., & Urbani, P. (2023). “Statistical physics of learning in high-dimensional chaotic systems.” Journal of Statistical Mechanics: Theory and Experiment, 2023(11), 113301.

Karuppiah, K., Nazreen Banu, M., et al. (2026). “Topological Data Analysis (TDA) as a Framework for Understanding Deep Learning Behavior.” 2025 IEEE 5th International Conference on ICT in Business Industry & Government (ICTBIG), Indore, India, December 12-13, 2025. IEEE Xplore. DOI: 10.1109/ICTBIG68706.2025.11323998.

Larose, H. (2025). “A Mathematical Theory of Frame-Independent Persistence.” Academia.edu. [Unverified source.]

Ruelle, D. (1989). Chaotic Evolution and Strange Attractors. Cambridge University Press.

Sompolinsky, H., Crisanti, A., & Sommers, H. J. (1988). “Chaos in Random Neural Networks.” Physical Review Letters, 61(3), 259-262.

Turner, E., & Barak, O. (2023). “The Simplicity Bias in Multi-Task RNNs: Shared Attractors, Reuse of Dynamics, and Geometric Representation.” Advances in Neural Information Processing Systems (NeurIPS).

Suggested citation: Galida, R. S. (2026). The Persistence Functional: A Candidate Formal Foundation for the Attractor Framework (Foundational Edition). Fantasy Attractor.


The Persistence Functional: A Candidate Formal Foundation for the Attractor Framework; Robert Galida (July 2026) [F]
