# Model Description: Strange / AntiStrange



## Purpose



Strange and AntiStrange are a pair of reinforcement learning agents designed to study how learners fail under hidden regime shifts and phase transitions. The focus is on the agents' internal instability rather than on their external task performance.



## Strange Agent



**Role:**

Strange represents a standard learner that assumes the environment is stationary.



**Key Properties:**

- Performs well under initial regimes
- Fails abruptly when latent dynamics shift
- Continues acting confidently despite invalid assumptions
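The failure pattern listed above can be illustrated with a minimal sketch. This is not the Strange implementation: `StationaryLearner`, `toy_reward`, `run_stationary`, and the two-armed toy task are all hypothetical names and assumptions chosen for illustration. The sketch uses a greedy sample-average learner, whose shrinking step size encodes the stationarity assumption.

```python
class StationaryLearner:
    """Hypothetical stand-in for a learner that assumes a fixed environment."""

    def __init__(self, n_actions=2):
        self.values = [0.0] * n_actions
        self.counts = [0] * n_actions

    def act(self):
        # Greedy choice with no exploration; ties go to the lowest index.
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def update(self, action, reward):
        # Sample-average update: the effective step size 1/n shrinks over
        # time, so old evidence dominates and new evidence is discounted.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


def toy_reward(action, t, shift_at):
    """Assumed toy regime: arm 0 pays before `shift_at`, arm 1 pays after."""
    rewarded_arm = 0 if t < shift_at else 1
    return 1.0 if action == rewarded_arm else 0.0


def run_stationary(steps=100, shift_at=50):
    """Run the learner through one hidden regime shift and record its trace."""
    agent = StationaryLearner()
    actions, rewards = [], []
    for t in range(steps):
        action = agent.act()
        reward = toy_reward(action, t, shift_at)
        agent.update(action, reward)
        actions.append(action)
        rewards.append(reward)
    return agent, actions, rewards
```

In this toy setting the learner keeps choosing arm 0 after the shift, because its averaged estimate for that arm decays slowly and never falls below the untouched estimate for arm 1. Reward drops abruptly while the policy stays unchanged.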



## AntiStrange Agent



**Role:**

AntiStrange is designed as a counterfactual probe: it differs from Strange in its sensitivity to evidence that the current regime is inconsistent with past experience.



**Key Properties:**

- Detects instability earlier
- Deviates sooner from established policies
- Enables comparative analysis of collapse timing
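One way such sensitivity could be realized is sketched below. This is an assumption, not the AntiStrange implementation: the sketch swaps in a one-sided CUSUM test on absolute reward prediction error, and `ChangeSensitiveLearner`, `toy_reward`, `run_change_sensitive`, `drift`, and `threshold` are hypothetical names and parameters. When accumulated surprise crosses the threshold, the learner discards its estimates and re-tries every action once.

```python
class ChangeSensitiveLearner:
    """Hypothetical learner that abandons its policy when surprise accumulates."""

    def __init__(self, n_actions=2, drift=0.1, threshold=2.0):
        self.n_actions = n_actions
        self.drift = drift          # slack subtracted from each surprise
        self.threshold = threshold  # alarm level for accumulated surprise
        self.cusum = 0.0
        self.probe_queue = []       # actions still to be re-tried after an alarm
        self._reset_estimates()

    def _reset_estimates(self):
        self.values = [0.0] * self.n_actions
        self.counts = [0] * self.n_actions

    def act(self):
        # After an alarm, try each action once before returning to greedy play.
        if self.probe_queue:
            return self.probe_queue.pop(0)
        return max(range(self.n_actions), key=lambda a: self.values[a])

    def update(self, action, reward):
        """Update estimates; return True if a regime-change alarm fired."""
        error = reward - self.values[action]
        # One-sided CUSUM: accumulate surprise above the drift allowance.
        self.cusum = max(0.0, self.cusum + abs(error) - self.drift)
        self.counts[action] += 1
        self.values[action] += error / self.counts[action]
        if self.cusum > self.threshold:
            self._reset_estimates()
            self.cusum = 0.0
            self.probe_queue = list(range(self.n_actions))
            return True
        return False


def toy_reward(action, t, shift_at):
    """Assumed toy regime: arm 0 pays before `shift_at`, arm 1 pays after."""
    rewarded_arm = 0 if t < shift_at else 1
    return 1.0 if action == rewarded_arm else 0.0


def run_change_sensitive(steps=100, shift_at=50):
    """Run the learner through one hidden regime shift; record actions and alarms."""
    agent = ChangeSensitiveLearner()
    actions, alarms = [], []
    for t in range(steps):
        action = agent.act()
        if agent.update(action, toy_reward(action, t, shift_at)):
            alarms.append(t)
        actions.append(action)
    return actions, alarms
```

The `drift` and `threshold` values trade detection speed against false alarms. The defaults here were picked so that the initial learning transient does not trigger an alarm in this toy task, and they would need retuning for noisy rewards.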



## Dual Hypothesis Environment



The `dual_hypothesis_lab_env` exposes both agents to identical observations while enforcing hidden regime changes. This enables direct comparison of:

- Adaptation lag
- Behavioral divergence
- Collapse signatures
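The three quantities above could be computed from recorded traces along the following lines. This is a sketch under assumptions: it does not use the `dual_hypothesis_lab_env` API, and the function names, the trace format (plain lists of per-step actions and rewards), and the `window` and `floor` parameters are hypothetical definitions chosen for illustration.

```python
def adaptation_lag(actions, optimal_actions, shift_at):
    """Steps after the shift until the agent first picks the new optimal
    action, or None if it never does."""
    for t in range(shift_at, len(actions)):
        if actions[t] == optimal_actions[t]:
            return t - shift_at
    return None


def behavioral_divergence(actions_a, actions_b):
    """Fraction of time steps on which two agents chose different actions."""
    if len(actions_a) != len(actions_b):
        raise ValueError("traces must have equal length")
    if not actions_a:
        return 0.0
    differing = sum(a != b for a, b in zip(actions_a, actions_b))
    return differing / len(actions_a)


def collapse_step(rewards, window=5, floor=0.1):
    """First step at which the trailing-window mean reward drops below
    `floor`, or None if it never does."""
    for t in range(window - 1, len(rewards)):
        recent = rewards[t - window + 1 : t + 1]
        if sum(recent) / window < floor:
            return t
    return None
```

Because both agents receive identical observations, their traces are aligned step for step, which is what makes a per-step comparison such as `behavioral_divergence` meaningful.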



## Safety



These models formalize a critical safety risk:

> Systems that fail not because of noise, but because the world quietly changes.



They are intended as diagnostic tools for studying early-warning signals and regime-aware control strategies.



## Notes



The implementation prioritizes clarity and inspectability over complexity to support safety-focused analysis and experimentation.



