Llms

Published on
April 8, 2026
A Small Model Beat GPT-5 Mini at Tic-Tac-Toe. Here's How.
ai-agents llms enterprise-ai
Stefano Fiorucci trained a small open-source model to outperform GPT-5 Mini at tic-tac-toe using reinforcement learning with verifiable rewards. The key lesson: environment design—reward signals, opponent calibration, batch sizing—determines whether RL training succeeds or collapses.
Published on
March 27, 2026
Karpathy on the Decade Between Demo and Product
enterprise-ai ai-industry edge-ai llms ai-agents
Andrej Karpathy rode in a near-perfect Waymo demo in 2014. The technology took another decade to become a paid product. That demo-to-product gap, not model architecture, is the binding constraint across self-driving, humanoid robots, and AI-powered education.
Published on
February 24, 2026
Anthropic Looked Inside Claude. Here's What They Found
enterprise-ai llms mlops
Anthropic's interpretability team can now peer inside Claude's internal reasoning and catch it thinking something different from what it writes. For enterprise teams relying on chain-of-thought explanations as evidence, this changes the trust equation entirely.
Published on
December 27, 2025
What Cursor Learned About AI Coding Evals
ai-coding llms ai-agents
Coding evals jumped from single lines to full codebases in four years. Static benchmarks miss the production gap that destroys rollout timelines.
Published on
December 22, 2025
Replit's Bet: AI Agents Without Training Wheels
ai-agents ai-coding llms
Building AI agents for non-technical users requires unsupervised autonomy, not just longer runtimes or better orchestration.
