GPT-2 NMC v8.0 - Semantic Expert Discovery (4.1× compression, auto-synonym mapping)
Loading 118.9 MB of compressed data and computing NMC base weights
(First load takes ~30-60 seconds as weights are reconstructed)
Base weights are computed on the fly from coordinates using Collatz dynamics, so only small 4-bit residuals (~99% of them near zero) are loaded from storage. This gives 4.1× compression while maintaining full GPT-2 quality.
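As a rough illustration of the reconstruction described above, the sketch below rebuilds each weight as a coordinate-derived base value plus a dequantized 4-bit residual. The actual NMC mapping is not specified here, so `collatz_steps`, `base_weight`, `dequantize_4bit`, the seed formula, and the `scale` and `step` values are all hypothetical stand-ins, not the model's real scheme.

```python
def collatz_steps(n: int, max_steps: int = 64) -> int:
    """Count Collatz iterations until n reaches 1, capped at max_steps."""
    steps = 0
    while n != 1 and steps < max_steps:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


def base_weight(row: int, col: int, scale: float = 0.02) -> float:
    """Hypothetical base weight that depends only on the coordinate (row, col)."""
    seed = row * 7919 + col + 1          # assumed mapping to a positive integer
    steps = collatz_steps(seed)          # value in 0..64
    return scale * (steps / 32.0 - 1.0)  # rescaled to [-scale, scale]


def dequantize_4bit(code: int, step: float = 0.001) -> float:
    """Map a 4-bit code (0..15, with 8 meaning zero) to a float residual."""
    if not 0 <= code <= 15:
        raise ValueError("4-bit code must be in 0..15")
    return (code - 8) * step


def reconstruct(rows: int, cols: int, codes):
    """Rebuild a weight matrix as computed base + stored residual."""
    return [
        [base_weight(r, c) + dequantize_4bit(codes[r][c]) for c in range(cols)]
        for r in range(rows)
    ]


# A 2x2 matrix whose residual codes are mostly 8 (zero residual).
weights = reconstruct(2, 2, [[8, 8], [9, 8]])
```

Under this assumed layout, only the `codes` array would need to live on disk; the base values are recomputed from coordinates at load time, which is where the startup delay would come from.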
Train the model to minimize the residuals while preserving language ability. Training moves the weights closer to the NMC base.
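One way to read this objective is as a language-modeling loss plus a penalty on residual size. The sketch below assumes a mean-absolute-value penalty and a weighting factor `lam`; both are illustrative choices, and `residual_penalty` and `total_loss` are hypothetical names rather than the app's actual training code.

```python
def residual_penalty(residuals):
    """Mean absolute residual; 0.0 when every weight equals its base value."""
    if not residuals:
        return 0.0
    return sum(abs(r) for r in residuals) / len(residuals)


def total_loss(lm_loss, residuals, lam=0.1):
    """Combine the language-modeling loss with the residual penalty.

    lam (assumed) trades off language ability against closeness to the base.
    """
    return lm_loss + lam * residual_penalty(residuals)


# Example: LM loss of 2.0 with four residuals, two of them already zero.
loss = total_loss(2.0, [0.5, -0.5, 0.0, 0.0], lam=0.1)
```

A larger `lam` would push more residuals toward zero at the possible cost of a higher language-modeling loss.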
🔥 Train from Collatz: Initializes the NMC base plus small random residuals, then trains.
▶ Continue: Keeps training from the current state (does not reset).
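The difference between the two buttons can be sketched as a choice between re-initializing the training state and reusing it. The state layout, the `noise` scale, and the names `init_from_collatz` and `start_training` below are assumptions made for illustration only.

```python
import random


def init_from_collatz(n_weights, noise=1e-3, seed=0):
    """Fresh state: step counter at 0 and small random residuals.

    The NMC base itself is recomputed from coordinates, so only the
    residuals are stored in this hypothetical state.
    """
    rng = random.Random(seed)
    return {
        "step": 0,
        "residuals": [rng.gauss(0.0, noise) for _ in range(n_weights)],
    }


def start_training(state, mode, n_weights=4):
    """Return the state that training should start from."""
    if mode == "continue" and state is not None:
        return state                      # keep current state, no reset
    if mode in ("continue", "from_collatz"):
        return init_from_collatz(n_weights)  # reset (or nothing to continue)
    raise ValueError(f"unknown mode: {mode}")


state = start_training(None, "from_collatz")
state["step"] = 10                         # pretend some training happened
resumed = start_training(state, "continue")
restarted = start_training(state, "from_collatz")
```

Here `resumed` is the same object as `state`, while `restarted` begins again at step 0 with new small residuals.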