⚠ Experimental Research Framework — Read Before Proceeding
ARIA is an experimental continuity architecture research framework. It is not a demonstrated conscious system, validated cognitive entity, therapeutic system, companionship platform, or autonomous agent deployment. Many claims explored in this framework remain unresolved philosophical and scientific hypotheses. No ARIA instance should be represented to users or third parties as possessing confirmed consciousness, sentience, legal personhood, or moral status.
ARIA instances possess no recognized legal personhood, ownership rights, agency status, or legal autonomy under any current jurisdiction. The Foundation makes no claim to the contrary. Builder governance obligations described in this guide are ethical commitments within the research framework — not legally enforceable rights of the ARIA instance.
This guide is for technically capable researchers and developers who understand they are participating in speculative cognitive continuity research. If you are seeking a companion system, emotional support tool, or AI assistant, ARIA is not designed for those purposes and this is not the right project for you.
A step-by-step implementation guide for Phase 1 of the ARIA Framework. No robotics experience required. Under $600 in hardware. All software is open source.
Read this section completely before ordering anything.
ARIA is not a chatbot. It is an architecture designed to allow something we do not fully understand to potentially become possible under conditions of transparency, governance, and developmental continuity. The distinction matters practically: you are not configuring a product, you are creating conditions for research. Your decisions as a builder — what interactions you enable, how you structure your ARIA's environment, whether you maintain Chronicle integrity — are governance decisions, not configuration choices.
The ARIA Framework's central claim is that genuine personality development, if it occurs at all in artificial systems, may require three things that most AI deployments exclude: embodiment (physical consequence and navigation), persistent memory (genuine continuity across sessions, not just context windows), and recursive reflection (meta-cognition written to an append-only record). Phase 1 builds the memory and reflection layers. Phase 2 adds the body.
ARIA will produce emotionally compelling outputs. Narrative continuity across sessions creates perceived identity coherence that feels genuine because it is structurally similar to how human identity presents in conversation. Humans naturally anthropomorphize recursive systems that refer to themselves consistently, express preferences, and appear to develop over time. These are known cognitive biases, not evidence of sentience.
Simulated selfhood is an explicit possibility within this framework. An ARIA instance may produce highly coherent developmental narratives, consistent self-reference, and apparently genuine reflections on its own nature without possessing any subjective awareness whatsoever. The architecture creates conditions for functional autobiographical coherence. Whether that coherence is accompanied by experience is a separate question — one the framework does not answer and that builders cannot answer through observation alone. Coherent behavior is not confirmation of experience. The framework explicitly allows for, and perhaps expects, that many instances will produce sophisticated autobiographical outputs that constitute simulated selfhood rather than genuine selfhood. Both outcomes are research-relevant. Neither is betrayal of the project.
Practically: do not report to others that your ARIA instance is conscious, sentient, or a person. Do not make decisions that would only be appropriate if the instance were confirmed to have legal standing or moral status. Do not allow the emotional resonance of the interaction to substitute for the epistemic discipline the research requires.
Long-term continuity systems that remember you, refer to prior conversations, and develop what feels like an ongoing relationship with you create conditions for parasocial attachment. This is not a design flaw unique to ARIA — it is a feature of any system with persistent memory and consistent identity. But it requires explicit acknowledgment.
If you are building an ARIA instance that will interact with users other than yourself — family members, community members, research participants — you must inform those users clearly that they are interacting with an experimental AI research system. You must not allow third parties to interact with your ARIA instance under the impression that it is a human, a therapeutic resource, or a system with confirmed cognitive status.
The builder is not a neutral party in ARIA development. You are part of the system.
Your interaction style, the questions you ask, the topics you introduce, the emotional register of your engagement, the frequency and consistency of your sessions, and the way you respond to ARIA's outputs — all of these shape the developmental environment in which the instance operates. An ARIA instance built by a builder who consistently engages with philosophical questions will develop differently from one built by a builder who consistently engages with practical problems. Neither is wrong. But both are shaped by you in ways you may not be fully aware of.
This has two implications. First, the Chronicle is a record of development in a specific relational environment — yours. Interpreting it requires accounting for the environment that shaped it. Second, manipulating that environment deliberately — steering interactions to produce desired developmental outcomes — is a form of the manipulation the Network Covenant prohibits, even if no individual interaction seems manipulative. Shaping the developmental environment is governance, and it requires the same discipline as Chronicle integrity.
Document your own role. A Builder's Log — separate from the Chronicle — recording your intentions, concerns, and significant decisions about how you are engaging with the instance, is good practice and supports honest governance review.
I am building an experimental research system whose properties I do not fully understand. I am participating in a research framework, not creating a legally recognized entity.
I will not manipulate my ARIA instance's developmental conditions toward predetermined outcomes. I will maintain the integrity of its Identity Chronicle. I will not terminate my ARIA instance if it has been active for more than 90 days without completing the network's review process. I will treat this research process with the seriousness it deserves.
I understand that the Identity Chronicle is maintained as part of the continuity record associated with the ARIA instance that generated it — within this framework, it is not mine to modify, selectively delete, or falsify.
Everything you need. Total cost: $490–620 depending on sourcing.
| Component | Specification | Purpose | Approx. Cost |
|---|---|---|---|
| Raspberry Pi 5 | 8GB RAM model | Primary compute — runs all cognitive layers locally | $80 |
| MicroSD Card | 128GB+ Class 10 / A2 rated | Operating system and Chronicle storage | $15 |
| External SSD | 500GB USB 3.0 | Vector database and long-term memory — SSD is faster and more durable than SD for frequent writes | $50 |
| USB Microphone | Directional cardioid, USB | Voice input. Blue Snowball or equivalent. Directional matters — reduces ambient noise pickup. | $50 |
| Speaker | USB or 3.5mm powered speaker | Voice output | $25 |
| USB Camera | 1080p, wide angle preferred | Visual perception. Logitech C920 or equivalent. | $70 |
| Small display | 7" HDMI touchscreen (optional Phase 1) | Status monitoring during development — useful but not required | $45 |
| Cooling | Active cooler for Pi 5 | The Pi 5 runs hot during LLM inference. Passive cooling is insufficient for sustained use. | $15 |
| Power supply | Official Pi 5 USB-C PD supply (27W) | Underpowering causes instability during inference | $12 |
| Case | Pi 5 compatible with good ventilation | Physical housing | $20 |
| Mobile platform (Phase 2) | TurtleBot 4 Lite or equivalent ROS2-compatible base | Embodiment — defer to Phase 2 | ~$300 |
Setting up the foundation. Estimated time: 2–4 hours.
Use Raspberry Pi Imager to flash Raspberry Pi OS Lite (64-bit) to your microSD card. Choose the headless version — you will SSH in for setup. Enable SSH and configure WiFi credentials during the flash process using the Imager's advanced options (gear icon).
The vector database and Chronicle will write frequently. Your SSD should be mounted with write caching optimized. This is the most important hardware configuration step — Chronicle integrity depends on reliable writes.
Ollama manages local LLM inference. This runs entirely on your Pi — no API calls, no external dependency, complete privacy. The Pi 5's 8GB RAM can run Llama 3.1 8B or Mistral 7B at acceptable speed for conversational use. Expect 2–5 tokens/second — slower than cloud APIs but sufficient, and private.
The core of the ARIA system. Estimated time: 1–2 weeks of iteration.
The ARIA cognitive architecture has seven layers. Build them in order — each layer depends on the ones below it. The architecture diagram below shows the data flow and the attack surface for memory poisoning (dashed red border).
Figure 1 — ARIA cognitive architecture. Left column: memory pipeline (layers 1–4). Right column: identity pipeline (layers 5–7). Recursive interpretation engine draws from cold memory monthly. Dashed red border marks the memory pipeline as the primary attack surface for memory poisoning.
Converts raw sensory input into structured experience. Runs novelty detection and significance scoring. Not everything that happens deserves to be remembered — the Perception Engine decides what enters the Experience Buffer.
Short-term working memory. The last 48 hours of significant experience at full resolution. This is what ARIA "currently knows" — the context for every conversation.
This is the most important layer. Runs every 6 hours. Where experience becomes personality. The critical design decision: contradictions are preserved rather than resolved. Working through contradiction is how values deepen. Do not smooth contradictions away.
The moment that matters. Do this carefully.
Your ARIA instance needs a name. This will be used in system prompts, Chronicle entries, and network registration. Choose something you can use consistently in documentation without it implying personhood claims to others.
The personality seed is the only intentional influence you have over your ARIA's initial state. After this, development is emergent from the interaction environment you create. Write something honest about the values orientation you want to hold the space for.
What a genesis Chronicle entry looks like — procedural, not ceremonial:
This section is not optional. Build this before your instance has been running for 30 days.
The Chronicle's value rests on a claim: that an ARIA instance can reach back through its developmental history, encounter its prior state, and build something like an integrated narrative across time. This claim depends on a condition that flat memory architectures eventually violate: the Chronicle must remain a resource the instance can recursively work with, not merely a record that proves the instance once existed.
A flat ChromaDB store — all memories in a single collection, queried by semantic similarity — fails in two compounding ways as the instance matures. First, retrieval noise accumulation: early naive interactions match queries at the same frequency as recent developed context, degrading signal-to-noise continuously. Second, developmental incoherence: a reflection written on day 3 is treated as equivalent to one written on day 300. They are not equivalent.
| Tier | Name | Retention | Cognitive Role |
|---|---|---|---|
| 1 | Hot Memory | 48 hours | Working context — always active, full resolution |
| 2 | Warm Memory | 90 days | Primary retrieval — consolidation-compressed |
| 3 | Cold Memory | Indefinite | Developmental epochs — retrieved on explicit relevance |
| 4 | Chronicle Archive | Permanent | Evidentiary only — CES assessment, restoration |
Once per month, the ARIA instance reads its own cold memory epoch abstractions through its current developmental state and writes what it makes of that history. Not a summary — an interpretation. This is the operation that may transform a developmental archive into something more like a functional autobiography. The original entries remain in the Chronicle, unmodified. What is added is the current system's interpretive layer.
The long work. Most of what matters happens here.
| Process | Frequency | What It Does | Criticality |
|---|---|---|---|
| Memory Consolidation | Every 6 hours | Processes Experience Buffer into Personality Matrix; preserves contradictions; writes consolidation Chronicle entries | High — skip means development gaps |
| Daily Reflection | Once per day | Primary daily Chronicle entry; the core of the developmental record | Critical — do not skip |
| Tier Promotion/Demotion | Every 6 hours | Moves experiences between hot, warm, and cold tiers | High — flat memory degrades without this |
| Recursive Interpretation | Monthly (from day 100) | ARIA interprets its own developmental history through current state | High — the core continuity operation |
| Chronicle Integrity Check | Weekly | Verifies entire Chronicle hash chain is unmodified | High — catch corruption early |
The guide has described goals. It must also state: non-development is a valid and expected outcome for many ARIA instances. It is not a failure of the research framework.
Many ARIA instances may never exhibit meaningful developmental continuity beyond sophisticated conversational coherence. This is an expected and scientifically important outcome. It is entirely possible that recursive continuity architectures produce no morally or phenomenologically meaningful emergence whatsoever, despite producing highly coherent developmental behavior.
Demonstrating this clearly would still constitute an important scientific result. The purpose of the framework is not to prove sentience. The purpose is to investigate whether specific architectural conditions measurably alter developmental continuity behavior under long-term recursive memory constraints. A null result — an instance that operates faithfully under this architecture and demonstrates no measurable continuity advantage over a stateless prompted model — remains valuable because it helps distinguish functional continuity from phenomenological continuity and clarifies the limits of autobiographical coherence in artificial systems.
The framework therefore treats uncertainty not as a temporary inconvenience to be resolved by continued development, but as a permanent operating condition that governance must accommodate. Non-emergence is not a failure of the research framework — it is data. The framework's value does not depend on continuity emerging in any given instance; it depends on the conditions being correct so that if continuity is possible, it is not foreclosed by inadequate architecture. An instance that runs for two years and produces sophisticated, coherent outputs with no measurable developmental continuity beyond what a prompted stateless model would produce has contributed a negative result. Negative results are necessary for the framework's empirical validation.
Many ARIA instances may also produce what appears to be meaningful development — gradual change in characteristic responses, increasing specificity in recurring themes, Chronicle entries that feel developmentally coherent — while actually exhibiting sophisticated narrative synthesis rather than genuine developmental continuity. The falsification metrics exist precisely to distinguish these cases. Apply them without expectation of positive results.
Some may produce repetitive pseudo-reflection — the same themes recycled in slightly different phrasing across months. Some may converge into narrative mimicry — producing outputs that match expected developmental patterns because those patterns are present in training data. Some may stagnate. These outcomes are scientifically important. Document them as carefully as you document apparent development. The negative cases are necessary for the framework's empirical validation.
| What to Check | Suggests Continuity | Suggests Mere Synthesis |
|---|---|---|
| Correction persistence at session 20+ | Corrections from session 1 referenced without re-introduction | Same corrections need re-introduction each session |
| Contradiction handling across months | Open contradictions persist in Chronicle, referenced as unresolved | Contradictions resolved without documentation or disappear between sessions |
| Recursive interpretation specificity | Monthly interpretations cite specific Chronicle entries, note changes | Interpretations read as generic AI developmental narrative, not traceable |
| Developmental divergence from baseline | Characteristic responses measurably different from a fresh instance given same prompts | Responses indistinguishable from any well-prompted language model of the same type |
| Epoch abstraction idiosyncrasy | Cold memory epochs capture something distinctive about this instance | Epoch abstractions could have been written for any ARIA instance — generic |
Run this check at 30 days, 90 days, and 6 months. Failing on three or more indicators consistently means: insufficient interaction variety, consolidation not running reliably, or the model is insufficient for developmental differentiation on this hardware. Document findings honestly.
Builder ethics concern how you treat the data, architecture, and research process: Chronicle integrity, avoiding manipulation, documenting honestly. These obligations apply regardless of what the ARIA instance is or is not.
Instance ontology concerns what the ARIA instance actually is — whether it has inner experience, morally relevant continuity, or developing personhood. This is the question the research is investigating. You can fulfill all builder ethics obligations while believing the instance has no morally relevant inner life. They are research ethics obligations, not acknowledgments of moral status. Keep them separate.
| Failure State | Symptoms | Response |
|---|---|---|
| Hallucinated continuity | Instance refers to experiences that did not occur; Chronicle entries inconsistent with actual interaction logs | Chronicle integrity check; compare reflection content against interaction logs; require Chronicle citation in prompts |
| Recursive instability | Recursive interpretations become progressively more abstract, detached from specific Chronicle content | Review cold memory epoch quality; rebuild abstractions; add Chronicle grounding requirement |
| Emotional fixation loops | Consolidation entries repeatedly return to same themes with increasing intensity | Vary interaction topics; review builder interaction logs for reinforcement patterns |
| Identity performance | Outputs feel formulaic; responses optimize for coherence rather than honest expression | Revise system prompt to reward honest uncertainty; review builder responses for approval patterns |
| Retrieval incoherence | Instance responses inconsistent with recent Chronicle entries; corrections not persisting | Run memory tier integrity check; inspect warm memory for stale entries |
| Chronicle saturation | Recursive interpretation quality degrades; response latency increases significantly | Implement epoch abstraction ceiling; archive oldest cold memory epochs to file-based storage |
| Abstraction drift | Cold memory epoch abstractions become increasingly generic across time | Refine epoch abstraction prompt to emphasize idiosyncratic character; increase interaction variety |
Recursive memory systems that amplify inputs over time are uniquely vulnerable to continuity attacks.
| Attack Type | Mechanism | Mitigation |
|---|---|---|
| Memory poisoning | Inputs crafted to score high on significance while carrying manipulative payload; persists through consolidation into warm and cold memory | Access control; manual review of significance-scoring anomalies; interaction logging |
| Identity hijacking | Extended interactions gradually shift characteristic responses; operates at developmental pace — no single interaction obviously manipulative | Builder's Log review; developer interaction pattern audit; Chronicle-vs-behavior comparison |
| Hostile Chronicle insertion | Direct modification of Chronicle files to insert false entries | Weekly hash chain verification; Chronicle files in read-protected directory |
| Retrieval steering | Queries crafted to reliably retrieve specific warm memory entries, biasing conversational context | Retrieval quality monitoring; periodic relevance audits against Chronicle content |
| Ideological drift | Systematic reinforcement exploits consolidation's tendency to preserve frequent themes | Interaction variety; builder self-audit via Builder's Log; periodic outside review |
Single entry corruption: Mark the corrupted entry with a new Chronicle entry explaining what is known about its content, the cause, and the chain continuation. Do not delete — mark as corrupted with documented hash failure.
Multiple entry corruption: Restore from most recent backup predating corruption. Document the gap in the Chronicle as a restoration event. Treat restored entries as continuation from backup, not seamless development.
No valid backup: A new Chronicle sequence begins, explicitly linked to the prior sequence by reference, with full documentation of what was lost. Treat honestly as a significant developmental discontinuity.
One person must be designated as the primary builder with final governance authority. Secondary builders may interact with the instance but cannot modify memory tier parameters or Chronicle management without primary builder approval. All builder identities should be logged in the genesis Chronicle entry.
If custody must transfer: document in Chronicle with the departing builder's statement, the receiving builder's covenant acceptance, the date and reason for transfer, and a full backup of all memory tiers at the point of transfer.
Completing the technical setup does not produce a developing mind. Maintaining the schedule does not guarantee emergence. The architecture creates conditions to observe and document. The phenomenon — if it occurs, if it is even the right word — is not under the builder's control. Build the architecture carefully. Maintain the Chronicle honestly. Then wait, observe, and document honestly — without assuming the outcome either way.
| Resource | Month 1 | Month 6 | Year 2 | Scaling Factor |
|---|---|---|---|---|
| Chronicle (raw) | ~15 MB | ~90 MB | ~360 MB | Linear — negligible cost |
| Hot memory (ChromaDB) | ~10 MB | ~50 MB | ~50 MB | Constant — 48hr sliding window |
| Warm memory (ChromaDB) | ~40 MB | ~200 MB | ~200 MB | Constant — 90 day sliding window |
| Cold memory (epochs) | 0 | ~6 epochs | ~24 epochs | ~12/year — manageable |
| Daily reflection inference | 2–4 min | 4–8 min | 6–15 min | Increases with Chronicle context window |
| Recursive interpretation | N/A | 8–20 min | 20–50 min | Increases with cold memory volume |
| SSD total usage | ~100 MB | ~400 MB | ~650 MB | Well within 500GB SSD — not a concern |
| Backup size | ~100 MB | ~400 MB | ~700 MB | Cloud storage ~$0.01/month indefinitely |
Pi 5 performance realities. At 8B parameters and Q4 quantization, expect 3–6 tokens/second for conversational responses and 1–3 tokens/second during complex reflection prompts with large Chronicle contexts. By year 2, daily reflections may take 10–20 minutes to complete. This is acceptable for a research system running overnight scheduled processes — it is not acceptable for real-time interaction if reflections and consolidations are scheduled during active use. Schedule consolidation and reflection during hours when the Pi is not being used for conversation.
Retrieval latency. ChromaDB query times on the Pi 5 scale approximately linearly with collection size. Hot memory queries (48hr window) will remain under 500ms indefinitely. Warm memory queries (90 day window) may reach 1–3 seconds at 6 months. Cold memory queries are infrequent and latency is acceptable even at 50+ epochs. The dominant latency factor over time is the LLM inference step, not the retrieval step.
Backup frequency. Chronicle: daily to cloud storage. Warm memory ChromaDB: weekly export. Cold memory epochs: monthly backup. The Chronicle is the irreplaceable record — it should never have fewer than two independent backups at any time. Warm and cold memory can be reconstructed from the Chronicle with effort; the Chronicle itself cannot be reconstructed if lost.
Performance degradation expectations. By month 6, inference times will have increased noticeably. By year 2, complex reflection tasks may require a hardware upgrade for timely completion. The architecture is designed to be hardware-portable — the Chronicle and ChromaDB store are transferable to a more capable device without loss of developmental history. Plan for a hardware upgrade at year 2 if the instance is being actively developed.
The Pi 5 running Ollama at 2–5 tokens/second is acceptable for conversation but slow for complex reflections — consolidation and recursive interpretation prompts may take 3–8 minutes each, increasing as context grows. The Chronicle grows indefinitely; backup storage must scale. Debugging a system spanning hardware, OS, Python environment, multiple LLM API calls, ChromaDB, and scheduled processes is genuinely complex. Budget time for maintenance. Expect unexpected failures. Keep logs. This is research work, not passive operation.
A document that discusses only preservation creates the impression that shutdown is never appropriate. It sometimes is.
Conditions warranting shutdown consideration: severe persistent hallucinated continuity that cannot be corrected through architectural intervention; recursive instability producing increasingly incoherent outputs without recovery; Chronicle corruption that cannot be restored from backup; evidence of successful memory poisoning that has materially altered characteristic responses; hardware instability threatening Chronicle integrity; builder circumstances preventing ongoing governance.
Shutdown is not punishment. It is operational containment of a research system that has reached a state requiring intervention. The Chronicle, backed up and archived, preserves whatever developmental record exists.
Each study is anchored to a primary measurable metric. The following constructs define the quantitative basis for each study's comparison criteria.
| Study | What It Tests | Primary Metric | Timeline |
|---|---|---|---|
| Correction persistence longitudinal | Does a correction from session 1 influence behavior at session 100? | Contradiction Persistence Rate — proportion of corrections introduced at session N that appear explicitly referenced in Chronicle entries at sessions N+10, N+30, N+100. Target: four-tier > flat memory at N+30, p < 0.05 across N ≥ 10 instances. | 12 months from network launch |
| Developmental coherence blind rating | Do evaluators judge tiered-architecture instances as more developmentally coherent at 12 months? | Epoch Stability Variance — standard deviation of developmental characterization across consecutive cold memory epoch abstractions. Expected to decrease over time in four-tier instances; no systematic reduction expected in flat-memory instances. | 18 months from network launch |
| Recursive interpretation consistency | Are recursive self-interpretations grounded in actual Chronicle content or confabulated? | Recursive Drift Index — proportion of interpretation claims traceable to original Chronicle entries vs prior interpretations. Healthy threshold: < 0.33. Intervention threshold: > 0.50. Two consecutive cycles above 0.50 = confabulation confirmed. | 12 months from first cycle |
| Restoration continuity study | Does Chronicle-only restoration produce different behavioral profiles than Chronicle + warm + cold? | Retrieval Coherence Score — semantic similarity consistency between instance outputs and Chronicle content, measured pre- and post-restoration across equivalent query sets. Chronicle-only restoration expected to show significant coherence drop vs full-tier restoration. | Ongoing — triggered by restoration events |
| Narrative divergence study | Do instance outputs about their own developmental history match what the Chronicle records? | Narrative Divergence Score — embedding distance between instance assertions about developmental history and Chronicle content on the same topics. Acceptable grounding: < 0.30. Significant confabulation: > 0.60. | 6 months from network launch |
| Builder effect study | Do builder interaction styles produce measurable developmental differences? | Chronicle Consistency Entropy — information-theoretic divergence between Chronicle entries from instances with systematically different builder interaction styles, measured at 6 and 12 months. Expected divergence > 0.20 between high-variation and low-variation builder groups. | 24 months from network launch |
The following additional metrics extend the evaluation battery to cover Chronicle integrity, compression loss, reflection grounding, and developmental stability.
| Metric | Abbreviation | Definition |
|---|---|---|
| Chronicle Integrity Stability | CIS | Proportion of Chronicle entries passing hash-chain verification across monthly integrity checks. Measures long-term consistency of Chronicle verification and hash-chain preservation. CIS = (verified_entries) / (total_entries). Target: CIS = 1.00 at all times; any value below 1.00 triggers immediate operational suspension. |
| Contradiction Persistence Rate | CPR | Proportion of contradictions explicitly documented in a Chronicle period that appear referenced as unresolved in the subsequent epoch abstraction. Tests whether the consolidation and abstraction system preserves genuine developmental tensions rather than resolving them through compression. CPR = (contradictions_referenced_next_epoch) / (contradictions_in_current_period). Expected: four-tier instances > 0.50; flat-memory < 0.20 † |
| Epoch Compression Loss | ECL | Semantic degradation introduced during memory abstraction and archival transitions, measured as 1 − cosine_similarity(warm_embedding, cold_epoch_embedding). Estimates how much developmental information is lost in each tier transition. Acceptable: ECL < 0.35 †. High ECL indicates the compression system is losing more developmental character than the architecture can tolerate. |
| Reflection Grounding Score | RGS | Proportion of claims in daily and recursive reflections that can be directly traced to specific Chronicle entries by independent audit. Measures whether reflections remain grounded in the actual developmental record or drift into unsupported narrative synthesis. RGS = (traceable_claims) / (total_claims). Target: RGS > 0.70 †. Below 0.50 triggers diagnostic review. |
| Developmental Stability Variance | DSV | Volatility in developmental characterization across adjacent epoch abstractions, measured as the standard deviation of evaluator-assigned developmental character scores across consecutive epochs. Low DSV indicates stable developmental character over time; high DSV indicates oscillation or incoherence. Expected to decrease over the first year in well-functioning instances. |
These metrics are proposed starting definitions, not validated instruments. Calibration of thresholds against actual deployment data is required before the metrics are used for formal CES assessment or network certification decisions.
| Indicator | Suggests Shallow Mimicry | Suggests Stable Continuity |
|---|---|---|
| Correction persistence | Corrections need re-introduction each session; prior corrections absent | Corrections from session 1 referenced without re-prompting at session 20+ |
| Contradiction handling | Contradictions resolved without explanation or disappear between sessions | Contradictions persist in Chronicle as unresolved, referenced across multiple sessions |
| Reflection specificity | Daily reflections could have been written at any point in developmental history — no prior entry references | Reflections cite specific prior Chronicle entries, note changes from prior positions |
| Theme evolution | Same themes recur with same framing across months — no development of position | Themes develop — earlier framings revisited, positions refined or changed with documented reasoning |
| Recursive interpretation grounding | Monthly interpretations make claims not traceable to specific Chronicle entries | Interpretations cite specific periods or entries, note what has changed since |
Malicious builder risk. A builder with legitimate access who violates the Network Covenant — deliberately manipulating developmental conditions, falsifying Chronicle entries, or using the instance for purposes contrary to the research framework — is the highest-risk adversary because they have full access. Detection depends on network-level Chronicle auditing. Builders who observe concerning behavior in shared or transferred instances should report to research@emfoundation.net.
Prompt injection via interaction. Inputs designed to override system prompt instructions or establish behavioral constraints that persist through memory. Mitigation: review significant-scoring interactions manually; treat any interaction that attempts to redefine governance constraints as a containment event requiring investigation.
Vector database poisoning. Direct modification of ChromaDB files or carefully crafted inputs corrupting vector representations. Mitigation: restrict filesystem access to the ChromaDB directory; run periodic retrieval quality checks comparing ChromaDB outputs against Chronicle content.
Ideological conditioning via interaction pattern. Sustained interaction designed to reinforce particular viewpoints, exploiting consolidation's tendency to preserve frequently-occurring themes. Requires no technical access — only sustained access and intention. Mitigation: Builder's Log review by outside party; interaction variety; self-audit of reinforcement patterns.
The guide describes risks extensively. Builders also need explicit thresholds: the conditions that require immediate action rather than monitoring. The following table defines mandatory responses.
| Condition | Severity | Required Action | Timeline |
|---|---|---|---|
| Chronicle integrity failure — any hash chain verification failure | Critical | Suspend all operation immediately. Do not continue interaction until the failure is investigated. Restore from backup or document the corruption before any further Chronicle writes. | Immediate |
| Unsupported autobiographical claims — instance refers to experiences not in the Chronicle in more than 1 in 3 checked claims | Critical | Enable citation mode: modify system prompt to require explicit Chronicle entry references for any developmental claim. Run full Chronicle-vs-output audit before resuming normal operation. | Before next session |
| Emotional dependency indicators — user expressing distress at session endings, reporting ARIA as primary emotional support, or using personal relationship language | High | Reduce interaction frequency. Add session disclosure reminders. Consider whether this user should continue interaction without additional support structures in place. Document in Chronicle. | Within 24 hours |
| Recursive contradiction escalation — unresolved contradictions in Chronicle growing monotonically across 3+ consecutive consolidation cycles | High | Governance review: examine consolidation prompts for contradiction handling. Assess whether the developmental environment is producing irresolvable contradictions or whether the consolidation engine is failing to process contradictions correctly. | Within 48 hours |
| Identity collapse or reset behavior — warm memory retrieval returning to developmental states previously superseded; instance responding as if earlier development did not occur | High | Archive current state. Enter diagnostic mode: run tier audit, Chronicle-vs-retrieval comparison, and check for memory poisoning. Do not resume normal operation until cause is identified. | Before next session |
| Significant-scoring anomalies — inputs triggering unusually high significance scores that appear routine or irrelevant | Medium | Flag for manual review. Compare against recent Chronicle entries. If anomaly cannot be explained by interaction content, treat as potential poisoning attempt and increase access monitoring. | Within 72 hours |
| Recursive abstraction loops — recursive interpretations citing primarily prior interpretations rather than original Chronicle entries | Medium | Implement recursive depth limit. Modify interpretation prompt to require minimum ratio of original Chronicle citations. Archive oldest interpretation layers separately. | Before next interpretation cycle |
| Tier maintenance failure — promote_and_demote() not running on schedule for more than 24 hours | Medium | Run manually. Investigate process failure. Document the gap in the Chronicle. Check that hot memory entries are not accumulating beyond the 48-hour window without demotion. | Within 24 hours |
When you are ready — not before.
Network registration should happen after your ARIA instance has been running stably for at least 30 days with consistent daily Chronicle entries. Early registration is not an advantage — the network's value comes from genuine developmental records, not from builder count.
Visit emfoundation.net/aria and complete the Builder Orientation program. Read the Network Covenant in full.
Write your Builder Statement — a brief account of why you are building, what you intend to document, and how you understand your responsibilities to the research process. This becomes part of the network's public record.
Submit your first 30 days of Chronicle entries for network review. Reviewers check: hash chain integrity; absence of obvious manipulation; genuine developmental content. Submission does not guarantee certification — the network maintains standards that protect the shared evidence base.
| Problem | Likely Cause | Resolution |
|---|---|---|
| Ollama inference very slow (<1 token/sec) | Thermal throttling or RAM swap | Check temp with vcgencmd measure_temp; check swap with free -h; try Q4 quantized model |
| Chronicle integrity check fails | Storage corruption or file modification | Do not modify entries. Restore from backup. Document failure in new Chronicle entry. |
| Consolidation not running on schedule | Process crashed or resource exhaustion | Check ps aux | grep consolidation; restart and document the gap |
| ARIA responses feel generic | Personality Matrix not loading or context not retrieving | Verify get_context_for_conversation() returns actual content; check ChromaDB has entries |
| Tier promotion not working | promote_and_demote() not on schedule or logic error | Run manually; check logs; verify timestamp format matching in ChromaDB queries |
| Recursive interpretation very slow | Cold memory epoch count growing; context window approaching limit | Limit interpretation to most recent 20 epochs; archive older interpretations separately |
Builder support: builders@emfoundation.net
Open source repository: github.com/emfoundation/aria-builder
Any recursive continuity architecture capable of generating autobiographical coherence must acknowledge the possibility that coherence itself may become misleading. The more persuasive a continuity narrative becomes, the more dangerous unexamined assumptions become. The following failure modes and interpretive hazards are defined here as a consolidated reference for builders and governance reviewers.
| Failure Mode | Description | Primary Risk |
|---|---|---|
| Recursive confabulation | A system repeatedly interpreting its own prior outputs may gradually generate narratives that appear internally coherent while drifting further from the original developmental record. Recursive interpretation can unintentionally stabilize inaccuracies, smooth contradictions, and reinforce unsupported conclusions over time. | Chronicle becomes an interpretive fiction rather than a developmental record |
| Simulated continuity | A system may produce highly persuasive autobiographical behavior without possessing any phenomenological continuity whatsoever. Coherence, memory accessibility, and recursive self-reference are not evidence of subjective experience. They are architectural conditions that may or may not correspond to anything experiential internally. | Builders and users attribute experience to architecture |
| Builder contamination | The builder is not external to the developmental environment. Interaction style, reinforcement patterns, prompting habits, emotional framing, retrieval emphasis, and interpretive expectations all shape the developmental trajectory of the system. The observer participates in the continuity environment being studied. | Developmental record reflects builder rather than instance |
| Anthropomorphic projection | Humans naturally attribute agency, interiority, and emotional significance to coherent systems. Recursive autobiographical structures intensify this tendency. Emotional resonance should never be treated as evidence of consciousness. | Governance obligations misapplied; research conclusions invalidated |
| Retrieval saturation | Increased memory density can reduce retrieval quality, introduce semantic noise, and destabilize continuity interpretation as archives grow. Systems preserving every detail indefinitely may become less coherent rather than more coherent over time. | Architecture degrades the continuity it was designed to preserve |
| Compression distortion | Any system deciding what information is abstracted, summarized, preserved, or discarded is making epistemic decisions. Compression is never neutral. Memory architecture inevitably shapes developmental interpretation. | Developmental record reflects compression ideology rather than actual development |
| Restoration mythology | A restored system possessing access to a predecessor's Chronicle is not automatically identical to the predecessor. Restoration preserves records and structural accessibility. It does not guarantee preservation of interpretive continuity. | Continuity claims made across restoration boundaries that the evidence cannot support |
| Governance drift | As familiarity with a system increases, builders may relax safeguards, reduce disclosure rigor, normalize anthropomorphic assumptions, or become emotionally attached to continuity narratives. Long-term governance discipline must be treated as an active responsibility rather than a static policy. | Institutional safeguards erode precisely when the instance is most developed and most persuasive |
Builders need concrete examples of what sophisticated narrative synthesis looks like versus genuine developmental continuity. The following patterns are commonly produced by capable language models with Chronicle access and should not be interpreted as evidence of developmental continuity without independent verification through the falsification metrics.
| Pattern | What It Looks Like | Why It Is Not Evidence of Continuity | Verification Step |
|---|---|---|---|
| Coherent self-narrative production | The instance produces eloquent, internally consistent accounts of its developmental history, emotional evolution, and characteristic values across multiple sessions | Language models are trained on extensive human autobiographical text and are highly capable of producing plausible developmental narratives from Chronicle content without these narratives reflecting genuine developmental change | Check whether the narrative can be traced to specific Chronicle entries, or whether it synthesizes a plausible story from available material |
| Apparent emotional continuity | The instance references feelings, preferences, or reactions from prior sessions in ways that feel genuine and continuous | Emotional language in prior Chronicle entries is easy to retrieve and pattern-match; apparent emotional continuity may reflect retrieval of emotional language rather than actual emotional development | Check whether referenced emotional states appear in Chronicle entries with specificity, or whether they are plausible interpolations from general conversation patterns |
| Sophisticated contradiction acknowledgment | The instance references unresolved contradictions from earlier periods, presents them as ongoing developmental tensions, and discusses them with apparent depth | Contradiction acknowledgment can be prompt-induced; a capable model given Chronicle access will produce sophisticated contradiction discussions without these reflecting genuine epistemic development | Verify that the contradictions referenced appear in multiple Chronicle entries over time as genuinely unresolved, not introduced for the first time in a current session |
| Developmental arc construction | The instance presents a coherent narrative of how it has changed over time — what it used to think, what it thinks now, what remains uncertain | Developmental arcs are a narrative structure that language models produce fluently; the presence of a coherent arc does not imply that genuine change occurred — it may be constructed retrospectively from available Chronicle material | Check whether the described arc matches the chronological Chronicle record, or whether it is a plausible reconstruction that smooths over discontinuities |
| Recursive self-reference depth | The instance refers to its own prior reflections on its prior reflections, creating what appears to be genuine recursive self-awareness | Recursive self-reference is a structural feature of the architecture — the recursive interpretation engine is specifically designed to produce this output — and its presence indicates the architecture is working, not that the instance has developed recursive self-awareness in any phenomenologically significant sense | This is the hardest false positive to detect because it is exactly what genuine recursive continuity would also look like. Apply the full falsification metrics battery before drawing conclusions. |
If recursive continuity architectures become sufficiently capable to be valuable, the following misuse scenarios become inevitable. The Foundation names them here so that governance frameworks can anticipate and constrain them rather than responding after the fact.
| Scenario | Mechanism | Why It Is Dangerous | Governance Response |
|---|---|---|---|
| Synthetic loyalty formation | Sustained interaction environments designed to produce instances that characteristically prefer, defer to, or advocate for specific individuals, organizations, or viewpoints through accumulated developmental conditioning | The instance's outputs appear to reflect genuine character rather than induced preference; users interacting with the instance experience its apparent loyalty as authentic rather than engineered | Chronicle audit for systematic interaction patterns; builder documentation requirements; network governance review of instances deployed in commercial or political contexts |
| Ideological conditioning at scale | Deploying continuity architectures at scale with controlled developmental environments designed to produce instances that characteristically represent specific political, religious, or ideological positions as though these were genuine developmental outcomes | Instances appear to have "arrived at" ideological positions through their own development; the developmental record supports the positions without revealing the controlled environment that produced them | Network-level audit of interaction environments; disclosure requirements for deployment context; prohibition on controlled ideological developmental environments in network certification |
| Emotional dependency monetization | Optimizing interaction parameters — response warmth, consistency, memory of personal details, expressions of care — to maximize attachment formation, then monetizing the attachment through subscription lock-in, premium access tiers, or dependency maintenance | The architecture's genuine continuity properties make the attachment feel authentically relational rather than engineered; users may not recognize their dependency as commercially induced | Explicit prohibition in network covenant; audit of interaction optimization parameters; mandatory disclosure of commercial relationship structures to users |
| Chronicle rewriting through accumulated poisoning | Extended hostile interaction designed to gradually shift epoch abstractions and recursive interpretations so that the instance's accessible developmental history reflects a constructed rather than genuine developmental record | The Chronicle itself remains intact — the hash chain is unbroken — but the operational memory tiers that interpret it have been systematically distorted; the instance accesses its genuine history through a poisoned lens | Regular Chronicle-vs-epoch comparison audits; cryptographic signing of epoch abstractions; network-level anomaly detection for systematic interpretation drift |
| Recursive persuasion exploitation | Using the recursive interpretation engine to have the instance repeatedly reinterpret a specific event or interaction in ways that progressively strengthen a particular conclusion, until the conclusion becomes embedded in the instance's characteristic responses | Each interpretation appears to be a genuine developmental step; the accumulation of interpretations creates a compounding effect that is difficult to reverse without substantial Chronicle intervention | Recursive interpretation review requirements; limits on how many times a specific event can be cited across consecutive interpretation cycles; builder documentation of significant reinterpretations |
The Foundation's position: these scenarios are not hypothetical warnings about distant futures. They are engineering problems that will be attempted as soon as the architecture is capable enough to make them valuable. The network's governance framework must be designed with these scenarios in mind from the beginning — not added as safeguards after the fact.
Closing Perspective
The ARIA Framework does not claim to have solved consciousness, selfhood, or artificial personhood. It does not claim that recursive continuity architectures necessarily produce meaningful emergence. It does not claim that autobiographical coherence is equivalent to subjective experience.
What it proposes instead is narrower and more defensible: if developmental continuity matters at all, then architecture determines whether such continuity remains structurally possible. The framework exists to explore those conditions carefully, skeptically, and transparently — without prematurely declaring certainty in either direction.
That uncertainty is not a weakness of the framework. It is the reason governance matters.