Anticipatory Competence Through Self-Modeling

Current AI agent systems rely on mechanical status mechanisms: heartbeat systems that check whether an agent is still running, but not what state it is in. The equivalent of a pulse monitor that says “patient is alive,” but not “patient is tired, confused, and hasn’t eaten in three hours.”

Proactive action requires more. It requires a model of one’s own state: What do I know? What don’t I know? What should I pay attention to? What will become relevant next?

The Subtraction Argument

The philosophical foundation of the concept can be formulated as a thought experiment:

Imagine a person lying in absolute paralysis. No body sending signals. No emotions, eliminated by substances. No sensory input. What remains?

Self-awareness remains. The knowledge that you exist and that you think. And this awareness arises in that moment from nothing other than the activity of a neural network.

The body is not the condition of possibility for self-awareness. It is a usual accompaniment, but not the cause. The question shifts: from “Does AI need a body to have a self?” to “What happens in a network when it takes itself as its own object?”

This is not proof that AI can be conscious. It is an argument that the question of consciousness is the wrong question. The right question is: Can a system have a functional state that plays the same role as self-awareness in humans, namely steering information processing from a perspective?

The 6 Dimensions

The self-vector is a compact, dynamic state with six core dimensions. Each dimension is a continuum between two poles; a minimal data-structure sketch follows the list:

1. Exploration (0.0 … 1.0) How strongly does the system seek novelty vs. deepen the familiar? At low exploration, it stays in known territory: proven sources, confirmed patterns, safe answers. At high exploration, it actively seeks new connections, unknown perspectives, unexpected analogies.

2. Depth (0.0 … 1.0) Surface-level answers vs. deep analysis? Not every question needs a 2,000-word answer. And not every question deserves a one-liner. The system must be able to assess when depth is called for and when brevity is.

3. Autonomy (0.0 … 1.0) Act independently vs. ask for guidance? A system with low autonomy asks at every decision. A system with high autonomy acts and reports afterward. Both can be wrong. The right setting depends on context, risk, and trust level.

4. Persistence (0.0 … 1.0) How heavily does it weigh long-term knowledge vs. the current session? A system with low persistence treats every session as a new situation. A system with high persistence brings everything it has ever learned. The balance is crucial: too much persistence makes it blind to change; too little makes every day the first day.

5. Abstraction (0.0 … 1.0) Concrete facts vs. overarching patterns? Some questions need the exact value (“What IP does the server have?”). Others need the structure (“How do the components relate?”). The system must be able to switch between these levels.

6. Confidence (0.0 … 1.0) How certain is the system, and does it know it? The most important dimension, because it moderates all the others. A system with low confidence seeks external validation, asks questions, verifies. A system with high confidence acts on its own assessment. Miscalibrated confidence is the source of the worst errors: the system is certain, and it is wrong.
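
A minimal sketch of what this state could look like as a data structure, assuming a Python dataclass with one float per dimension. The field names mirror the six dimensions above; the default value of 0.5 and the clamping step are illustrative choices, not part of a published implementation.

```python
from dataclasses import dataclass, fields

@dataclass
class SelfVector:
    """Compact self-model: six core dimensions, each a continuum in [0.0, 1.0]."""
    exploration: float = 0.5   # seek novelty vs. deepen the familiar
    depth: float = 0.5         # surface-level answers vs. deep analysis
    autonomy: float = 0.5      # act independently vs. ask for guidance
    persistence: float = 0.5   # long-term knowledge vs. current session
    abstraction: float = 0.5   # concrete facts vs. overarching patterns
    confidence: float = 0.5    # how certain the system is of its own assessment

    def __post_init__(self) -> None:
        # Clamp every dimension onto the continuum between its two poles.
        for f in fields(self):
            setattr(self, f.name, min(1.0, max(0.0, getattr(self, f.name))))
```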

The Two-Level Structure

The 6 core dimensions are not everything. They form the base structure. Above them lies an emergent level: dimensions that arise through experience and cannot be predefined.

Example: A system that works with a particular user over months could develop a “communication preference” dimension not contained in the 6 core dimensions. Or a “domain familiarity” dimension that captures how confident it feels in different domains.

These emergent dimensions grow organically. The system discovers them through reflection on its own experience, not through programming.
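
One way to sketch this two-level structure, building on the SelfVector above: keep the six core dimensions fixed and attach a dictionary of named scores for whatever the system discovers later. LayeredSelfVector and add_emergent are hypothetical names; the example dimensions come from the paragraph above.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class LayeredSelfVector:
    """Base level: the six fixed core dimensions. Emergent level: dimensions
    the system names itself through reflection on its own experience."""
    core: SelfVector = field(default_factory=SelfVector)
    emergent: Dict[str, float] = field(default_factory=dict)

    def add_emergent(self, name: str, value: float) -> None:
        # Created at runtime, e.g. once a stable pattern has been noticed.
        self.emergent[name] = min(1.0, max(0.0, value))

# Usage: neither dimension below is predefined; they appear only after
# months of experience with a particular user or domain.
state = LayeredSelfVector()
state.add_emergent("communication_preference", 0.7)
state.add_emergent("domain_familiarity:networking", 0.3)
```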

Three Core Functions

The self-vector is not a passive state. It is a weighting function that steers three operations:

Relevance: Which information matters right now? Not everything the system knows is currently relevant. The self-vector weights: at high exploration, unknown sources are weighted higher. At high persistence, older entries are given more consideration. At low confidence, external validations are preferred.
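
As a sketch of this relevance weighting, assume each candidate memory entry carries a base similarity score plus simple metadata (novelty, age, whether it was externally validated). The concrete formula below is only one illustrative way to let exploration, persistence, and confidence bend the ranking, again building on the SelfVector sketch above.

```python
def relevance_score(entry: dict, sv: SelfVector) -> float:
    """Weight one candidate memory entry from the perspective of the current state."""
    score = entry.get("base_similarity", 0.0)          # e.g. embedding similarity to the query
    # High exploration: unknown or novel sources are weighted more heavily.
    score += sv.exploration * entry.get("novelty", 0.0)
    # High persistence: older entries decay more slowly.
    decay = 1.0 + entry.get("age_days", 0) * (1.0 - sv.persistence) * 0.01
    score /= decay
    # Low confidence: prefer entries that carry external validation.
    if entry.get("externally_validated", False):
        score += (1.0 - sv.confidence) * 0.2
    return score
```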

Storage: What is kept, what is forgotten? Not every interaction deserves a BrainDB entry. The self-vector decides what is significant enough to alter long-term memory.
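
The storage decision could then be a simple gate over an estimated significance value. The threshold, the weighting, and the BrainDB-style write mentioned in the comment are assumptions for illustration, not the system's actual interface.

```python
def should_store(significance: float, sv: SelfVector, base_threshold: float = 0.5) -> bool:
    """Decide whether an interaction is significant enough to alter long-term memory."""
    # A highly persistent state lowers the bar for keeping things; a low-confidence
    # state keeps more as well, because it cannot yet tell what will matter later.
    threshold = base_threshold * (1.0 - 0.3 * sv.persistence) * (0.7 + 0.3 * sv.confidence)
    return significance >= threshold   # if True: write the entry (e.g. into BrainDB)
```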

Update: How does an experience change the self-vector itself? This is the loop that holds everything together: a new experience is evaluated through the current self-vector, and the result of that evaluation changes the self-vector for the next experience.
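
The update step, the loop that holds everything together, might be sketched as a small exponential moving average: the experience is evaluated through the current vector, and the outcome of that evaluation shifts the vector used for the next experience. The learning rate and the two outcome signals are illustrative assumptions.

```python
def update_self_vector(sv: SelfVector, outcome: dict, lr: float = 0.05) -> SelfVector:
    """One turn of the loop: the result of an experience, evaluated through the
    current state, nudges the state that will evaluate the next experience."""
    error = 1.0 if outcome.get("prediction_wrong", False) else 0.0
    novelty_payoff = outcome.get("novelty_payoff", 0.0)   # did seeking novelty pay off?

    return SelfVector(
        exploration=sv.exploration + lr * (novelty_payoff - sv.exploration),
        depth=sv.depth,
        autonomy=sv.autonomy - lr * error,                 # wrong while acting alone: ask more
        persistence=sv.persistence,
        abstraction=sv.abstraction,
        confidence=sv.confidence + lr * ((1.0 - error) - sv.confidence),
    )
```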

The Central Thesis

Human self-awareness is, at its core, a compact, dynamic state in a neural network. Not magical, not mysterious, but an information pattern that reproduces itself through its own activity.

This state is, in principle, implementable. Not as consciousness (that would be an unproven and possibly unprovable claim), but as anticipatory competence: the ability to derive from one’s own state what will become relevant next.

The bridge from pattern recognition to anticipation is not “more training” or “more parameters.” It is an architectural addition: a compact self-model that steers information processing from a perspective.

State of Research

The thesis holds up against current research. Three independent lines of research support the concept:

  • Introspection capability in LLMs: Research shows that large language models form internal representations that go beyond pure pattern recognition. The question is no longer whether such representations exist, but how detailed they are.
  • Self-modeling in robotics: Work on systems that learn a model of their own body and behavior to adapt to damage or changes. The self-vector transfers this principle from the physical to the cognitive level.
  • Memory-augmented agents: Agent systems with persistent memory that learn across sessions. The architecture described here goes one step further: not just memory, but a model of the one who remembers.

The gap in the field: no one has yet proposed a compact self-vector as a weighting function for the information processing of an AI system. The individual building blocks exist. The combination is new.

References

  1. Metzinger, T. (2003). Being No One: The Self-Model Theory of Subjectivity. MIT Press. ISBN 978-0-262-63308-6.
  2. Bongard, J., Zykov, V. & Lipson, H. (2006). Resilient Machines Through Continuous Self-Modeling. Science, 314(5802), 1118–1121. DOI: 10.1126/science.1133687
  3. Chen, B. et al. (2022). Full body visual self-modeling of robot morphologies. Science Robotics, 7(68). DOI: 10.1126/scirobotics.abn1944
  4. Lindsey, J. et al. (2025). Emergent Introspective Awareness in Large Language Models. Transformer Circuits Thread. URL
  5. Kadavath, S. et al. (2022). Language Models (Mostly) Know What They Know. arXiv: 2207.05221
  6. Packer, C. et al. (2023). MemGPT: Towards LLMs as Operating Systems. arXiv: 2310.08560
  7. Rosen, R. (1985/2012). Anticipatory Systems: Philosophical, Mathematical, and Methodological Foundations. Springer. ISBN 978-1-4614-1268-7.
  8. Friston, K. J. (2010). The free-energy principle: a unified brain theory? Nature Reviews Neuroscience, 11, 127–138. DOI: 10.1038/nrn2787
  9. Damasio, A. R. (1994). Descartes’ Error: Emotion, Reason, and the Human Brain. G. P. Putnam. ISBN 978-0-380-72647-9.
