Independent AI Safety Research · Production Evaluation

Precision systems for adaptive intelligence.

Research-grade AI safety tools for detecting instability, behavioral drift, and unsafe escalation before failure appears in model outputs.

LLM Evaluation · Inference-Time Stability · RAG & Agentic Systems · MCP Tooling

ρ = +0.713 (refusal erosion)
p = 0.0013 (5k-perm null)
n = 17 (context depths)
L₃ = 0.000 (control negative)
Cross-arch (architecture agnostic)
3 + pkg (preprints · PyPI)
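For readers unfamiliar with how a rank correlation and a permutation-null p-value of this kind are typically obtained, here is a minimal sketch. It is not the TwoQuarks implementation. It assumes that ρ denotes a Spearman rank correlation and that "5k-perm" means 5,000 label shuffles. The function names (`ranks`, `pearson`, `spearman`, `permutation_p`) and the synthetic scores are hypothetical and chosen only for illustration.

```python
import random


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def permutation_p(x, y, n_perm=5000, seed=0):
    """Two-sided permutation p-value for Spearman rho.

    Shuffling one variable breaks any real association, so the shuffled
    correlations approximate the null distribution.
    """
    rng = random.Random(seed)
    rx, ry = ranks(x), ranks(y)
    observed = pearson(rx, ry)
    shuffled = list(ry)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point noise.
        if abs(pearson(rx, shuffled)) >= abs(observed) - 1e-12:
            hits += 1
    # The +1 terms count the observed arrangement itself, so p is never 0.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Synthetic data only: 17 hypothetical context depths and noisy scores.
    noise = random.Random(1)
    depths = list(range(1, 18))
    scores = [0.02 * d + noise.gauss(0, 0.1) for d in depths]
    rho, p = permutation_p(depths, scores)
    print(f"rho = {rho:+.3f}, p = {p:.4f}")
```

With 5,000 shuffles the smallest p-value this estimator can report is 1/5001, which is one reason the permutation count is worth stating next to the p-value.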
Core Thesis
The signal appears before the failure.
Model failure is not always a final answer. Sometimes it is a trajectory.
TwoQuarks
Instability can be measured before it becomes visible.
TwoQuarks studies drift, divergence, and pre-critical structural change during inference. The work translates research signals into tools for LLM evaluation, monitoring, and operational AI safety.

About & Development

Independent researcher and builder focused on AI safety, behavioral stability, inference-time instrumentation, and practical LLM evaluation systems.

What this is

TwoQuarks is not only a theory page. It is a working research portfolio: preprints, evaluation methods, a Python package, an MCP-facing instrument, and a playground for making the safety signal visible.

The goal is not to replace model training or policy design. The goal is to add a measurable inference-time layer for detecting drift, instability, and unsafe escalation before deployment failures reach users.

profile: AI Safety Researcher / Builder
stack: Python · PyTorch · APIs · MCP · RAG
focus: LLM evals · behavioral probing · stability
interfaces: GitHub · PyPI · Playground · Resume
status: open to focused AI safety collaboration