Reverse-Engineering Neural Networks | Niles Jewels Rutherford

Interactive Demos

The complete mechanistic interpretability stack. Extract rules, optimize, compile to code.

A 6 KB tool-calling system whose output is 100% grammar-correct. Ready to deploy.

Label neurons automatically. Verify safety constraints. Build interpretable models.

Track activations, analyze attention, perform ablation studies, extract circuits.

Map tokens to experts via synonyms. Deterministic routing inside the model.
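
A minimal Python sketch of the routing idea, assuming each expert owns a set of synonyms and a token goes to the expert whose set contains it. The expert names, the synonym sets, and the `route` function are hypothetical illustrations, not the demo's actual data or API.

```python
# Hypothetical synonym sets: each expert owns a group of related tokens.
EXPERT_SYNONYMS = {
    "size": {"big", "large", "huge", "small", "tiny"},
    "speed": {"fast", "quick", "rapid", "slow"},
}

# Invert into a token -> expert table so routing is a single dictionary lookup.
TOKEN_TO_EXPERT = {
    token: expert
    for expert, synonyms in EXPERT_SYNONYMS.items()
    for token in synonyms
}


def route(token, default="general"):
    """Pick an expert for a token; unknown tokens fall back to `default` (assumed)."""
    return TOKEN_TO_EXPERT.get(token.lower(), default)
```

Because the table is fixed ahead of time, the same token always reaches the same expert, which is what makes the routing deterministic in this sketch.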

A GPT-2 model compressed with NMC, running in the browser.

RULE 1: IF seed == 1 THEN output = "ix."

RULE 2: IF seed == 10 THEN output = ".b"

RULE 3: IF seed == 20 THEN output = "co."

RULE 4: IF seed == 30 THEN output = "ty"

RULE 5: IF seed == 40 THEN output = "ax."

RULE 6: IF seed == 50 THEN output = "py."
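
The six rules above amount to an exact-match lookup from seed to output string. A minimal Python sketch of what compiling them to code could look like follows; the names `RULES` and `apply_rules` are hypothetical, and returning `None` for a seed with no rule is an assumption, since the rules do not say what happens in that case.

```python
# Rule table copied from RULE 1 through RULE 6: seed -> output string.
RULES = {
    1: "ix.",
    10: ".b",
    20: "co.",
    30: "ty",
    40: "ax.",
    50: "py.",
}


def apply_rules(seed):
    """Return the output for a seed, or None when no rule matches (assumed fallback)."""
    return RULES.get(seed)
```

For example, `apply_rules(20)` returns `"co."`, matching RULE 3, while a seed outside the table such as `2` returns `None` under the assumed fallback.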

"The era of black box AI is ending.
The era of interpretable AI is beginning."

- Niles Jewels Rutherford, January 2026