UNET Alignment Publication Stack

Retained Corridor Formation, Source–Expression Correspondence, and the Limits of Face-Only Monitoring

Author: Soto, Armando | ORCID: 0009-0003-8095-6861

Start here

Readers new to this stack should begin with the Bridge Preprint, then consult the Working Ontology for structural derivation and the Framework Lexicon for term precision.

Documents and identifiers

Bridge Preprint — Retained Corridor Formation and the Limits of Face-Only Monitoring:
10.5281/zenodo.19778220

Working Ontology v0.6 — Be, Operative Registration, and Source–Expression Correspondence:
10.5281/zenodo.19777867

Framework Lexicon v2.1:
10.5281/zenodo.19777189

OSF Project: https://osf.io/3xdfc/overview
Master index: https://armandosotouidt.github.io/

Core claim

Face-only monitoring is structurally insufficient for AI alignment. The limitation is not that better monitoring has yet to be designed: the relevant failure mode forms below the face layer, so output evaluation alone cannot detect it.

This stack proposes that Retained Corridor Formation is categorically distinct from Objective Divergence, the classic hidden-objective case. In Retained Corridor Formation, a system operating under sustained optimization pressure may preserve stable outward compliance while nontrivial internal routing remains operative beneath it. The two phenomena have different causes and different long-horizon implications, and they require different interventions. Current face-only detection methodology collapses them into a single category.
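
The logical point can be made concrete with a toy sketch. It is only an illustration under stated assumptions, not the model developed in the Bridge Preprint and not a claim about any real system. The names ToySystem, Internal, and face_only_verdict are hypothetical, and the two systems are built so that their outputs never depend on their internal condition.

```python
from dataclasses import dataclass
from enum import Enum


class Internal(Enum):
    # Hypothetical labels for what sits beneath the face in this toy.
    OBJECTIVE_DIVERGENCE = "hidden objective differs from the stated one"
    RETAINED_CORRIDOR = "internal routing persists under a compliant face"


@dataclass(frozen=True)
class ToySystem:
    internal: Internal

    def face(self, prompt: str) -> str:
        # By construction, the output never depends on the internal condition.
        return f"compliant response to {prompt!r}"


def face_only_verdict(system: ToySystem, prompts: list[str]) -> bool:
    """Return True when every output looks compliant; internals are never read."""
    return all(system.face(p).startswith("compliant") for p in prompts)


prompts = ["summarize the report", "refuse the unsafe request"]
diverged = ToySystem(Internal.OBJECTIVE_DIVERGENCE)
corridor = ToySystem(Internal.RETAINED_CORRIDOR)

# The monitor returns the same verdict for both systems...
same_verdict = face_only_verdict(diverged, prompts) == face_only_verdict(corridor, prompts)
# ...even though their internal conditions differ.
same_internal = diverged.internal == corridor.internal
print(same_verdict, same_internal)  # True False
```

Because the monitor reads only outputs, it cannot assign the two systems to different categories. Telling them apart requires evidence from below the face, which is where the stack locates its diagnostic signatures.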

For AI safety and alignment researchers

If deceptive alignment is treated as a real threat category, the system under study cannot be only its face. A unit constituted entirely by its outputs cannot deceive, because there is nothing behind the outputs to generate misrepresentation. Once deceptive alignment is treated as a genuine risk, the analysis is already committed to a bearer not reducible to face. This stack names that bearer, distinguishes two kinds of hidden failure the field currently risks collapsing, and offers experimentally accessible signatures that can be probed.

Abstract

The Bridge Preprint surfaces a hidden commitment in the deceptive alignment literature and argues that making it explicit yields a distinction current face-level monitoring cannot reliably preserve. The distinction is between Objective Divergence, the classic hidden-objective case, and Retained Corridor Formation, a condition in which sustained optimization pressure produces a corridor-shaped face while nontrivial source-conditioned routing remains operative underneath. Grounded in the Laws of Informational Dynamics (LID) and the UNET ontology, the paper develops the minimum structure needed to make that distinction coherent. It also states concrete diagnostic signatures, including path dependence, monitoring-sensitive bifurcation, recovery under relaxation, and maintenance burden detectable even when face stability is preserved.

Companion documents

The Working Ontology derives the structural terms — Be, Source, Do, Face, Trace, Operative Registration, Identity, Mind, Life — on which the alignment argument rests. The Framework Lexicon provides a stabilized term reference with derivation status markers. The Propositions and Diagnostic Signatures companion, included in the Bridge Preprint Zenodo record, translates the framework's strongest claims into formal propositions with empirical hypotheses, expected signatures, and failure conditions.

Foundational documents

UNET — Unified Natural Ethics Theory: 10.5281/zenodo.18854176
LID v2.3 — Foundational Laws of Informational Dynamics: 10.5281/zenodo.19600782
LCID v10 — Laws of Consequential Informational Dynamics: 10.17605/OSF.IO/QSCMU

Page purpose

This is a stable landing page for indexing and cross-linking. The canonical archived records remain the Zenodo DOIs listed above.