LAUNCH ETA: May 2026
  • 2025 Learnings on LLMs and Software Work

3 min read

    2025 suggests that the gap between expectations and measured outcomes around LLMs in software engineering is no longer subtle. Large investments and confident narratives imply …

  • Block-Floating FP4 for Local Inference in llama.cpp

7 min read

MXFP4, short for Microscaling 4-bit Floating Point, is essentially a very small floating-point format that borrows one big trick from signal processing and older “block …

  • Raspberry Pi Inference: Tiny Quantized Models at the Edge

10 min read

This post reports a small benchmark run on a Raspberry Pi 4B (8 GB RAM) using llama.cpp across five compact GGUF models spanning ~270M to ~1.2B parameters, with aggressive …

  • Short Model Horizons Revisited

13 min read

Since our earlier “short horizons, fragile state, orchestration first” note 1, more data points have been published and the picture is becoming a bit sharper, …

  • Tiny Quantized Models On Device

13 min read

    On-device LLMs are compact, optimized large language models that run directly on local hardware like smartphones or edge devices, instead of on a remote cloud server. This allows …

  • Switching our Inference Backend from Ollama to llama.cpp

10 min read

For pragmatic reasons, Ollama has been the default local backend in our prior benchmark runs. Our recent article on Ollama and Open WebUI practices1 highlighted the need for an …

  • Local AI Capture: Ollama, Open WebUI, and llama.cpp

14 min read

    We have seen examples such as Red Hat placing RHEL sources behind customer portals and contracts, and Canonical combining GPL code with contributor license agreements, trademark …

  • Practical Long-Context LLM Inference with llama.cpp

7 min read

    We can run serious long-context inference on commodity Apple silicon, but long context is hard. In this post we’ll touch on what Grouped-Query Attention (GQA) changes, and …

• LLM Fingerprints v1.4: The Cost of Quality, and Routing Decides Winners

10 min read

    nullbench is our ongoing evaluation run across a rotating set of current large language models1. We score models across practical dimensions—factual depth, reasoning, software …

  • nullbench Update: Iterating the Compliance Judge Panel

4 min read

    The compliance benchmark within nullbench serves as a fine-grained audit of a model’s capacity to follow instructions under constraint. Unlike accuracy tests, it measures the …