<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>AI on Matt Suiche</title><link>https://www.msuiche.com/categories/ai/</link><description>Recent content in AI on Matt Suiche</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Tue, 21 Apr 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://www.msuiche.com/categories/ai/index.xml" rel="self" type="application/rss+xml"/><item><title>Legacy Security Is the Real Enterprise AI Bottleneck</title><link>https://www.msuiche.com/posts/legacy-security-is-the-real-enterprise-ai-bottleneck/</link><pubDate>Tue, 21 Apr 2026 00:00:00 +0000</pubDate><guid>https://www.msuiche.com/posts/legacy-security-is-the-real-enterprise-ai-bottleneck/</guid><description>&lt;p&gt;High-quality data is expensive to collect, clean, and maintain. Poor security makes all of it free. To someone else.&lt;/p&gt;
&lt;p&gt;As software collapses toward zero marginal cost, that sentence stops being a cybersecurity truism and starts being a business model observation. Data is the last asset with durable value in an AI-native stack. The only thing that keeps that value is the discipline most AI-native companies are treating as optional.&lt;/p&gt;</description></item><item><title>From La Fontaine to Lego: Characters as Ideological Delivery Systems</title><link>https://www.msuiche.com/posts/from-la-fontaine-to-lego-characters-as-ideological-delivery-systems/</link><pubDate>Thu, 09 Apr 2026 00:00:00 +0000</pubDate><guid>https://www.msuiche.com/posts/from-la-fontaine-to-lego-characters-as-ideological-delivery-systems/</guid><description>&lt;p&gt;&lt;em&gt;Cute characters as ideological delivery systems, and how AI accelerated the propaganda playbook.&lt;/em&gt;&lt;/p&gt;
&lt;blockquote class="twitter-tweet"&gt;&lt;p lang="en" dir="ltr"&gt;Kind of crazy that the big propaganda medium to come out of AI wasn&amp;#39;t deepfakes but LEGO men and Persian cats &lt;a href="https://t.co/PEzEiamwJg"&gt;https://t.co/PEzEiamwJg&lt;/a&gt;&lt;/p&gt;&amp;mdash; Tracy Alloway (@tracyalloway) &lt;a href="https://twitter.com/tracyalloway/status/2042206704478122448?ref_src=twsrc%5Etfw"&gt;April 9, 2026&lt;/a&gt;&lt;/blockquote&gt;
&lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8"&gt;&lt;/script&gt;


&lt;p&gt;Tracy Alloway nailed it: &amp;ldquo;Kind of crazy that the big propaganda medium to come out of AI wasn&amp;rsquo;t deepfakes but LEGO men and Persian cats.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;Everyone was bracing for deepfakes. The national security community spent years warning about synthetic video of world leaders saying things they never said, doctored footage designed to deceive at the pixel level. Instead, what showed up was Lego minifigures of Trump and Netanyahu set to AI-generated rap tracks, produced by an Iran-based group calling themselves the &lt;a href="https://www.rawstory.com/slopaganda-wars-the-us-and-iran-are-flooding-the-zone-with-viral-ai-generated-noise/" target="_blank" rel="noopener"&gt;&amp;ldquo;Explosive News Team&amp;rdquo;&lt;/a&gt;. And it wasn&amp;rsquo;t just Iran. Chinese state media CCTV joined in with its own GenAI animal fable: &lt;a href="https://www.realclearpolitics.com/video/2026/03/27/chinese_state_tv_shares_viral_ai_cartoons_on_iran_war_white_eagle_vs_persian_cat.html" target="_blank" rel="noopener"&gt;&amp;ldquo;The White Eagle and Persian Cat&amp;rdquo;&lt;/a&gt;, a stop-motion style animation where a White Eagle Alliance dominates trade by forcing other animals to use its currency. Not trying to fool anyone into thinking the footage was real. Just trying to be catchy, shareable, and memetically sticky.&lt;/p&gt;</description></item><item><title>Local Models Within Reach: Everything That Changed in Eight Months</title><link>https://www.msuiche.com/posts/local-models-within-reach-everything-that-changed-in-eight-months/</link><pubDate>Sun, 05 Apr 2026 00:00:00 +0000</pubDate><guid>https://www.msuiche.com/posts/local-models-within-reach-everything-that-changed-in-eight-months/</guid><description>&lt;p&gt;Eight months ago I published &lt;a href="https://www.msuiche.com/posts/building-agents-for-small-language-models-a-deep-dive-into-lightweight-ai/"&gt;Building Agents for Small Language Models&lt;/a&gt;, a set of hard-won notes from shipping agents on 270M–32B parameter models. 
At the time, running useful local models meant embracing constraints: small context windows, CPU-only fallbacks, broken UTF-8 streams, and reasoning that fell apart past two steps.&lt;/p&gt;
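&lt;p&gt;As an illustration of the UTF-8 constraint (a sketch, not the code from the original post): when a local server streams output in byte chunks, a multi-byte character can be split across two chunks, and a naive per-chunk decode fails. Python&amp;rsquo;s incremental decoder buffers the partial sequence:&lt;/p&gt;

```python
import codecs

# A multi-byte UTF-8 character can arrive split across two streamed chunks
# from a local inference server. bytes.decode() would raise UnicodeDecodeError
# on the partial sequence; an incremental decoder buffers it until complete.
raw = "café".encode("utf-8")                       # b'caf\xc3\xa9', 5 bytes
chunks = [raw[:4], raw[4:]]                        # the 2-byte "é" is split
decoder = codecs.getincrementaldecoder("utf-8")()
text = "".join(decoder.decode(chunk) for chunk in chunks)
print(text)  # café
```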
&lt;p&gt;I stand by that post. But the ground has shifted fast. What was a set of careful workarounds in August 2025 is starting to look like the default architecture for a large class of workloads. Local models are no longer the constrained sibling of cloud APIs — for many agent use cases, they are the better answer. Here is what has changed.&lt;/p&gt;</description></item><item><title>When Machines Pay Machines: The Economics of Agentic AI</title><link>https://www.msuiche.com/posts/when-machines-pay-machines-the-economics-of-agentic-ai/</link><pubDate>Mon, 15 Dec 2025 00:00:00 +0200</pubDate><guid>https://www.msuiche.com/posts/when-machines-pay-machines-the-economics-of-agentic-ai/</guid><description>&lt;p&gt;The internet was built with a missing piece. In 1994, when the HTTP specification reserved status code 402 for &amp;ldquo;Payment Required,&amp;rdquo; the architects knew money would eventually flow as freely as data. Three decades later, that vision is finally materializing—not because humans demanded it, but because AI agents need it.&lt;/p&gt;
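&lt;p&gt;As a minimal sketch of what that dormant status code looks like on the wire (assuming Python&amp;rsquo;s stdlib &lt;code&gt;http.server&lt;/code&gt;; the handler below is illustrative only, not part of any real payment protocol):&lt;/p&gt;

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Minimal sketch: a paywalled endpoint answering with the long-reserved
# HTTP 402 status. The response body is illustrative; real agent-payment
# schemes layer their own negotiation headers on top of the bare 402.
class PaywallHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(402)                        # Payment Required
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"Payment Required")

    def log_message(self, *args):                      # keep the demo quiet
        pass

# Bind to an ephemeral port; call server.serve_forever() to actually serve.
server = HTTPServer(("127.0.0.1", 0), PaywallHandler)
```

&lt;p&gt;The 402 status is just the hook: a machine-to-machine scheme would attach its payment negotiation to this response.&lt;/p&gt;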
&lt;h2 id="the-402-awakening"&gt;The 402 Awakening &lt;a href="#the-402-awakening" class="anchor"&gt;🔗&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;HTTP 402 sat dormant for years, a placeholder for a future nobody could quite figure out. Credit cards required human intervention. PayPal needed accounts. Stripe demanded integration. None of these worked for a world where software talks to software at millisecond intervals.&lt;/p&gt;</description></item><item><title>The Hidden Math Bug That Makes AI Unpredictable</title><link>https://www.msuiche.com/posts/the-hidden-math-bug-that-makes-ai-unpredictable/</link><pubDate>Sun, 14 Sep 2025 00:00:00 +0200</pubDate><guid>https://www.msuiche.com/posts/the-hidden-math-bug-that-makes-ai-unpredictable/</guid><description>&lt;p&gt;This &lt;a href="https://x.com/awnihannun/status/1966953027451118012" target="_blank" rel="noopener"&gt;tweet from Awni Hannun&lt;/a&gt; demonstrates in one line of MLX code the nondeterminism phenomenon detailed in &lt;a href="https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference/" target="_blank" rel="noopener"&gt;Thinking Machines&amp;rsquo; research&lt;/a&gt;. We will explore the PyTorch equivalent that reveals a fundamental issue in AI systems, because I&amp;rsquo;ve found that tweet extremely helpful to understand what the original blogpost was about.&lt;/p&gt;
&lt;blockquote class="twitter-tweet"&gt;&lt;p lang="en" dir="ltr"&gt;Here&amp;#39;s a one-line code summary in MLX of the &lt;a href="https://twitter.com/thinkymachines?ref_src=twsrc%5Etfw"&gt;@thinkymachines&lt;/a&gt; blog post on non-determinism in LLM inference.&lt;br&gt;&lt;br&gt;I&amp;#39;d guess the difference is larger the lower the precision, as you get larger affects from non-associativity of FP math.&lt;br&gt;&lt;br&gt;Interestingly, that implies that training at low… &lt;a href="https://t.co/jYcDK9GiLn"&gt;pic.twitter.com/jYcDK9GiLn&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Building Agents for Small Language Models: A Deep Dive into Lightweight AI</title><link>https://www.msuiche.com/posts/building-agents-for-small-language-models-a-deep-dive-into-lightweight-ai/</link><pubDate>Wed, 27 Aug 2025 00:00:00 +0000</pubDate><guid>https://www.msuiche.com/posts/building-agents-for-small-language-models-a-deep-dive-into-lightweight-ai/</guid><description>&lt;p&gt;The landscape of AI agents has been dominated by large language models (LLMs) like GPT-4 and Claude, but a new frontier is opening up: lightweight, open-source, locally-deployable agents that can run on consumer hardware. This post shares internal notes and discoveries from my journey building agents for small language models (SLMs) – models ranging from 270M to 32B parameters that run efficiently on CPUs or modest GPUs. These are lessons learned from hands-on experimentation, debugging, and optimizing inference pipelines.&lt;/p&gt;</description></item></channel></rss>