Memory
The memory surface stores compressed representations of past inputs and retrieves them, conditioned on the current latent. It is the only cognitive surface whose state persists across queries.
Three timescales
Memory operates at three timescales. Each is implemented by a different mechanism.
| Timescale | Span | Mechanism | Capacity |
|---|---|---|---|
| Working | within a query | Attention KV cache | 131,072 tokens |
| Episodic | across queries, within a session | Compressed latent trace | ~5M tokens equivalent |
| Long-term | across sessions, across embodiments | Retrieval-augmented vector store | unbounded |
The three are exposed to other surfaces through a unified read interface. Other surfaces do not choose which timescale to query; the memory surface routes queries to the appropriate store based on a learned policy.
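The routing behavior can be sketched as follows. This is a minimal illustration, not the codex implementation: the store classes, method names, and the stand-in policy function are all hypothetical, and the real policy is learned rather than rule-based.

```python
# Hypothetical sketch of the unified read interface: callers issue a single
# read() and never name a timescale; a routing policy picks the backing store.

class EpisodicStore:
    def read(self, query):
        return f"episodic:{query}"

class LongTermStore:
    def read(self, query):
        return f"longterm:{query}"

class MemorySurface:
    def __init__(self, policy):
        self.stores = {"episodic": EpisodicStore(), "longterm": LongTermStore()}
        self.policy = policy  # learned in the real system; a plain function here

    def read(self, query):
        # The caller sees one interface; the policy chooses the store.
        return self.stores[self.policy(query)].read(query)

# Stand-in policy: route "recent"-looking queries to the episodic store.
surface = MemorySurface(lambda q: "episodic" if "recent" in q else "longterm")
```

The point of the indirection is that other surfaces stay agnostic to storage layout; the routing policy can be retrained without changing any caller.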
Working memory
The working memory is the KV cache of the active transformer stack. It is reset between queries. Other surfaces do not interact with it directly; it is internal to each surface's forward pass.
It is documented here only to distinguish it from the longer timescales: when other surfaces refer to "memory," they mean episodic or long-term.

Episodic memory
Episodic memory is a learned compression of the recent latent stream. Concretely:
- Every N tokens of latent (currently N = 4,096), a compression module emits a fixed-size summary token sequence (currently 256 tokens).
- Summary sequences accumulate in a sliding window of approximately 5 million tokens of effective context.
- Read access is via cross-attention: a query latent attends to the summary window and pulls in the relevant fragments.
The compression module is trained jointly with the rest of the system on a recall-fidelity objective: the original latent must be reconstructible from the summary tokens conditioned on the query.
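The chunk-and-summarize pipeline above can be sketched as a bounded buffer. N = 4,096 and the 256-token summary length are from the text; the window size in steps and the stub `compress` are illustrative (the real module is learned, and the real window corresponds to roughly 1,200 steps for ~5M effective tokens).

```python
from collections import deque

N = 4_096          # latent tokens per compression step (from the text)
SUMMARY_LEN = 256  # summary tokens emitted per step (from the text)
WINDOW_STEPS = 16  # steps retained; illustrative, not a codex value

class EpisodicMemory:
    def __init__(self):
        # Oldest summaries are evicted once the window is full.
        self.window = deque(maxlen=WINDOW_STEPS)

    def ingest(self, latent_tokens):
        # Cut the latent stream into N-token chunks and summarize each one.
        for i in range(0, len(latent_tokens), N):
            chunk = latent_tokens[i : i + N]
            self.window.append(self.compress(chunk))

    def compress(self, chunk):
        # Stub: truncation stands in for the learned compression module,
        # which is trained on a recall-fidelity objective.
        return chunk[:SUMMARY_LEN]
```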
Episodic memory is associated with a session identity. Sessions are bounded; episodic memory is garbage-collected when a session terminates or when it has been inactive for the configured retention period.
Long-term memory
Long-term memory is a content-addressable vector store. Entries are written explicitly: the memory surface decides, at each query, whether the current latent is worth retaining beyond the session. The decision is policy-learned, not user-controlled.
Entries are keyed by their latent embedding. Retrieval is approximate-nearest-neighbor over the store. The store is sharded across the validator network; see Data Layer for the wire mechanics.
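A minimal sketch of content-addressable retrieval, with exact cosine search standing in for the approximate-nearest-neighbor index the real store uses. Class and method names are illustrative; sharding and the wire protocol are out of scope here (see Data Layer).

```python
import math

def cosine(a, b):
    # Cosine similarity between two dense embeddings.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

class VectorStore:
    def __init__(self):
        self.entries = []  # (embedding, payload); append-only, per the protocol

    def write(self, embedding, payload):
        self.entries.append((embedding, payload))

    def read(self, query_embedding, k=1):
        # Exact top-k by similarity; a real store substitutes an ANN index.
        ranked = sorted(
            self.entries,
            key=lambda e: cosine(e[0], query_embedding),
            reverse=True,
        )
        return [payload for _, payload in ranked[:k]]
```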
Critical properties:
- Append-only at the protocol level. Validators cannot silently delete entries. Deletion requires a governance-approved action.
- Per-entry provenance. Every entry carries a hash of the validator that wrote it, the query that triggered the write, and the timestamp.
- Right-to-be-forgotten. A user-bound write path includes an optional forget tag. Forget-tagged entries are removable through the data-layer's removal protocol.
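The per-entry record implied by the properties above can be sketched as a small struct. Field names are illustrative; the text requires a hash of the writing validator, the triggering query, a timestamp, and the optional forget tag.

```python
from dataclasses import dataclass
import hashlib
import time

@dataclass(frozen=True)
class ProvenanceRecord:
    validator_hash: str       # hash of the validator that wrote the entry
    query_hash: str           # hash of the query that triggered the write
    timestamp: float          # write time
    forget_tag: bool = False  # removable via the data-layer removal protocol

def make_record(validator_id: bytes, query: bytes, forget: bool = False) -> ProvenanceRecord:
    # SHA-256 is an assumption; the codex does not name the hash function.
    return ProvenanceRecord(
        validator_hash=hashlib.sha256(validator_id).hexdigest(),
        query_hash=hashlib.sha256(query).hexdigest(),
        timestamp=time.time(),
        forget_tag=forget,
    )
```

Making the record frozen mirrors the append-only property: a provenance record is never mutated after the write is attested.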
The forgetting policy
Memory without forgetting accumulates noise. The memory surface implements forgetting at two levels:
Episodic. Summary tokens that have not been read in the past K queries (currently K = 64) decay: their attention bias is reduced linearly until it falls below a cutoff, at which point they are evicted from the sliding window.
Long-term. Entries that have not been retrieved in the past 90 days are demoted to cold storage. Retrieval from cold storage is permitted but incurs higher latency.
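The episodic decay rule can be sketched per query tick. K = 64 is from the text; the decay step and cutoff are illustrative placeholders, as the codex does not state their values.

```python
K = 64            # queries of grace before decay begins (from the text)
DECAY_STEP = 0.1  # illustrative linear decrement per query past K
CUTOFF = 0.0      # evict once bias falls below this; illustrative

class Summary:
    def __init__(self):
        self.bias = 1.0            # attention bias applied at read time
        self.queries_since_read = 0

def tick(window):
    """Apply one query's worth of decay; return the surviving summaries."""
    survivors = []
    for s in window:
        s.queries_since_read += 1
        if s.queries_since_read > K:
            s.bias -= DECAY_STEP   # linear reduction, per the text
        if s.bias >= CUTOFF:
            survivors.append(s)    # below the cutoff => evicted
    return survivors
```

A read would reset `queries_since_read` to zero and restore the bias; only the decay path is shown here.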
Forgetting is not optional. A system that retains everything is operationally equivalent to a system that retains nothing relevant.
Verification
Memory operations are verifiable through:
- Write attestations. Every long-term write is signed by the validator that performed it; signatures are anchored on chain.
- Read sampling. A subset of reads is duplicated across validators; divergence triggers investigation.
- Hash chains. The episodic summary stream is hashed cumulatively; reconstructing the state at a past timestamp requires producing the matching hash chain.
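The hash-chain check can be sketched directly: each summary is hashed together with the previous chain head, so a claimed past state is verified by recomputing the chain over the claimed prefix. SHA-256 and the zero genesis value are assumptions for illustration.

```python
import hashlib

def chain_head(summaries, genesis=b"\x00" * 32):
    # Fold the summary stream into a cumulative hash chain.
    head = genesis
    for s in summaries:
        head = hashlib.sha256(head + s).digest()
    return head

def verify_prefix(claimed_head, prefix):
    # Reconstructing a past state requires reproducing its chain head.
    return chain_head(prefix) == claimed_head
```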
What memory does not do
It does not reason over its contents: retrieved entries are returned to the calling surface, which performs its own reasoning. It does not edit past entries in place: modifications produce new entries linked to the originals.
The memory surface is, in effect, the network's institutional record. Its failures are the failures most likely to be visible to users in the long run; its mechanisms are accordingly the most conservative in the codex.