benchmarking

When You Come to a Fork in the Code, Take It
agents

When You Come to a Fork in the Code, Take It

The agent read the skill. Then it read the existing code. Then it wrote with the older pattern. Six runs, both variants, every time. Your codebase is the agent's primary teacher — no skill can override it.

I Read My Agent's Diary
agents

I Read My Agent's Diary

Every agent run leaves a trail. Record it, judge it, find the hotspots, fix them, run again. That's the loop that turns unpredictable agents into reliable ones.