When AI Writes the Code — Holiday Coding, March 2026
This post accompanies the Holiday Coding presentation: When AI Writes the Code. It covers the ideas, references, and practical exercises from the session — meant as both a standalone read and a companion to the slides.
What Is Holiday Coding?
Play isn’t the opposite of serious work. Play is how you learn at the edge of your competence.
Holiday Coding is structured play on production systems. You pick a direction, not a destination. You work on real code, but the path is yours.
Why production code? Because it drags ideas out of the laboratory and into the real world. When you apply a new concept to a codebase that matters, you find out in minutes whether it actually improves things — not in theory, but in practice.
It’s not a course — there’s no curriculum, no certification. It’s not a sprint — there’s no backlog, no velocity. It’s not a hack day — the goal isn’t to ship. The goal is to understand. The only deliverable is what you learned.
What Are Holiday Coding Sessions?
A Holiday Coding session is a 2+ hour block built around a single topic. Someone prepares a set of inspiration and hooks, then presents them in a fast-paced 20–40 minute slot. It’s not a polished keynote — it’s a presentation put together by someone with passion for the subject. The energy comes from enthusiasm, not production value.
After the presentation, participants use the remaining time for Holiday Coding: hands-on, practical exploration of the topic on real code.
The presentation is designed to work as a reference. It deliberately contains far more information than you’d normally put in a slide deck. A typical tech talk or keynote is optimized for single-time delivery — you polish every slide because the audience sees it once. Holiday Coding presentations are the opposite. They’re dense with links, ideas, and starting points so participants can come back to them later when they want to dig deeper into something that caught their attention.
This makes them fast to create, too. You don’t spend days on transitions and visual polish. You spend your time collecting the best ideas, the sharpest provocations, and the most useful references. It’s up to the presenter to pick which inspirations to focus on in their 30-minute slot — the deck is always bigger than the talk.
Inspiration and Hooks
These are lessons learned, not laws. Ideas to spark your interest.
AI Coding Is a Magnifier
Not a magic wand. A magnifier. If your codebase is tight, the AI produces more tight code. If your codebase is a mess, the AI accelerates you into a bigger mess.
It’s Never Been Easier to Clean Up
Agentic coding isn’t just for building new features. It’s remarkably good at cleaning house. Lots of people don’t use it that way. They should.
Prune, Don’t Just Refactor
Pruning means removing things: file by file, folder by folder. Without it, you get high-speed rot. What used to take months to become unwieldy now takes weeks.
Testing Goes from Checkbox to Survival
You need a test suite that operates on many different levels. Unit tests, integration tests, smoke tests, end-to-end tests. The whole stack. Your rate of change just went through the roof. Crank testing up to twelve.
Dependencies Are a Multiplier of Noise
Two libraries that do similar things? The confusion bleeds into generated code. Pick the one dependency. Let the AI migrate everything. What used to be a tedious multi-sprint cleanup is now an afternoon’s work.
Where Does the Rigor Go?
Engineering quality doesn’t disappear when AI writes code. It migrates — to specs, tests, constraints, and risk management. The work isn’t gone. It moved upstream.
ThoughtWorks Retreat Report (PDF)
Unbundling Code Review
Code review is being unbundled. It was always four functions in a trench coat: mentorship, consistency, correctness, and trust. Each one now needs a new home. Agents handle correctness. Hooks enforce consistency. Mentorship and trust stay human.
ThoughtWorks Retreat Report (PDF)
Cognitive Debt
Technical debt is becoming cognitive debt. The system grows more complex than anyone can hold in their head. And it’s not just you — your users and customers carry that complexity too.
Specs vs. Constraints
Specifications describe what should change. Constraints define what must not be touched. Constraints limit blast radius and let agents work safely across domain boundaries.
When You Go Faster, Look Further Ahead
People treat AI as a speed boost for their existing workflow instead of rethinking what their workflow should be. Speed is the easy part. Staying in control at that speed is the hard part.
The Industry Is Noticing
ThoughtWorks Retreat
Deer Valley, Utah — February 2026. ThoughtWorks hosted the Future of Software Engineering Retreat on the 25th anniversary of the Agile Manifesto. The central question: what happens when AI takes over code production?
| Read the report (PDF) | Martin Fowler’s Reflections |
Supervisory Engineering
A new middle loop emerges. Not writing code. Not release management. Something in between. Directing agents. Evaluating output. Calibrating trust. Encoding standards and defining constraints within which agents can safely operate.
Know Your Customers — Intimately
If every cousin in this world can produce “average software,” it’s your job to deliver great software. You can only do that if you deeply understand what problems you are solving for your customers.
True Agile Practices Are Back
Pair programming, ensemble development, continuous integration. These create the tight feedback loops that agent-assisted development requires. Some teams compressed sprint cadences to one week.
Tips for Code Quality in the Age of Agents
Code Proven to Work
Simon Willison: “Your job is to deliver code you have proven to work.” Untested AI-generated PRs are a dereliction of duty. The job shifts from writing code to proving it works.
| Read the post | Simon Willison’s Blog — follow this man. |
Don’t Ship Slop
“AI Slop” — low-quality, mass-produced, algorithmically generated code that looks polished but lacks substance. Macquarie Dictionary’s 2025 Word of the Year. Bloated functions. Silent logic errors. Hallucinated imports. No tests. Three patterns where one should exist.
The Slop Jar Rule: get caught committing untested, unverified AI-generated code three times and you’re buying the team lunch. Name it. Shame it. Don’t ship it.
Claude Code
Let’s get practical.
MCP — The Universal Tool Belt
Model Context Protocol: an open standard for connecting AI agents to your tools. 10,000+ active servers. 97 million SDK downloads/month. GitHub, Slack, Postgres, Google Drive, Jira, your internal APIs — all accessible through one protocol.
Claude Code + MCP = an agent that can read your tickets, query your database, check your monitoring, and commit the fix. Write your own MCP server in dozens of lines of TypeScript or Python.
| Official docs | MCP Protocol |
CLAUDE.md — Your Project’s Personality File
A persistent configuration file that Claude reads before every conversation. Think .editorconfig for AI — it tells Claude who you are, how you work, and what matters.
Three-layer hierarchy: ~/.claude/CLAUDE.md for personal preferences (your style, your tools), ./CLAUDE.md for project conventions (check this into git!), and ./src/auth/CLAUDE.md for subdirectory overrides (domain-specific rules). Most-specific wins. Loaded automatically. Every line costs context window budget.
Start with ~50 lines. For each line ask: “Would removing this cause Claude to make a mistake?” If not, cut it.
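To make this concrete, here is a hypothetical project-level `./CLAUDE.md`. Every name, command, and rule below is illustrative, not taken from the session — write yours from what your team actually "just knows":

```markdown
# Project conventions

## Commands
- Build: `npm run build`
- Test: `npm test` (run before every commit)

## Style
- TypeScript strict mode; no `any`
- Prefer small, pure functions over classes

## Constraints
- Never modify files under `src/generated/`
- Database migrations require human review
```

Each line should pass the test above: would removing it cause Claude to make a mistake? If not, it is just spending context budget.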
| Anthropic Blog: Using CLAUDE.md Files | Official docs | Writing a Good CLAUDE.md |
CLAUDE.md Forces Clarity
It forces teams to explicitly codify what everyone “just knows.” This might be the most valuable artifact a team creates. More useful than most documentation. You can give your agent a persona. A tone. Constraints. Domain vocabulary. The context IS the prompt.
Skills — Executable Standards
A SKILL.md file with instructions Claude follows when invoked. Not documentation nobody reads. Executable standards.
```markdown
---
name: "tdd-writer"
description: "Write failing tests first, no implementation"
---
## Instructions here
```
Skills can spawn subagents, inject live data, restrict tools. From static instructions to programmable agents.
| Official docs | Anthropic Skills Repository |
Hooks — Deterministic Guardrails
Shell commands that execute at specific points in Claude’s workflow. Not suggestions Claude might forget. Rules that always execute: PreToolUse to approve or block before execution, PostToolUse to validate after completion, SessionStart to inject context before Claude sees anything.
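Hooks are configured in `.claude/settings.json`. The sketch below follows the hooks schema as documented by Anthropic — a `PostToolUse` hook that re-formats after every file edit — but the matcher and command are illustrative, so check the official docs before copying:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "npx prettier --write ." }
        ]
      }
    ]
  }
}
```

Because this runs as a shell command outside the model, it executes every time — the agent cannot forget it.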
Auto-format after every edit. Run tests before commits. Block dangerous patterns.
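As one sketch of "block dangerous patterns": a `PreToolUse` hook receives the pending tool call as JSON on stdin and can veto it with a non-zero exit code (per the docs, exit code 2 blocks the call and feeds stderr back to Claude). The pattern list here is illustrative, not exhaustive:

```python
import json
import re
import sys

# Commands we never want an agent to run unreviewed (illustrative list).
DANGEROUS = [
    r"rm\s+-rf\s+/",          # recursive delete from root
    r"git\s+push\s+--force",  # history rewrites on shared branches
    r"DROP\s+TABLE",          # destructive SQL
]

def should_block(command: str) -> bool:
    """Return True if the shell command matches a dangerous pattern."""
    return any(re.search(p, command, re.IGNORECASE) for p in DANGEROUS)

def main() -> int:
    event = json.load(sys.stdin)  # hook payload from Claude Code
    command = event.get("tool_input", {}).get("command", "")
    if should_block(command):
        print(f"Blocked dangerous command: {command}", file=sys.stderr)
        return 2  # exit 2 = block the tool call, explain via stderr
    return 0      # anything else proceeds

if __name__ == "__main__":
    sys.exit(main())
```

The guardrail is deterministic: it does not depend on the model remembering an instruction buried in context.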
Modes — Calibrated Autonomy
Cycle with Shift+Tab: Normal (asks permission for everything), Auto-Accept Edits (file edits proceed, risky ops still escalate), and Plan Mode (read-only analysis, no modifications). Additional modes include Bypass (no permission checks, isolated environments only) and Auto Mode (research preview — Claude decides what needs approval based on risk).
Claude Code for Testing
Point Claude at any module. It reads the code, understands the intent, and generates tests. Unit tests, edge cases, error paths — the stuff nobody ever gets around to writing. There is no excuse left for low test coverage. The tedious part is gone.
Got legacy code with zero tests? Claude can bootstrap a full test suite in minutes, not sprints.
Claude Code and the Art of TDD
Claude Code for Documentation
Automated documentation pipelines that keep docs in sync with code. README generation. Changelog automation. API docs that reflect actual endpoints. Mermaid diagrams generated from your actual code structure — architecture docs that stay current.
If documentation requires AI to stay updated, maybe the real problem is the documentation is too coupled to implementation details.
Automated Docs with Claude Code
Claude Code for Security Review
Claude Code Security found 500+ vulnerabilities in production open-source code. Bugs missed for decades by traditional tooling and expert code review. It reasons about data flow and attack chains. Not pattern matching. Multi-stage verification: re-examines its own results, attempts to disprove itself.
Context Engineering
The discipline has shifted from prompt engineering to context engineering.
Context > Prompts
It’s not about crafting the perfect prompt. It’s about giving the agent the right context: CLAUDE.md, skills, hooks, file structure, test suites, constraints. The context IS the prompt.
Steal Like an Artist: System Prompts
Want to learn how the best AI products work? Read their system prompts. Leaked system prompts for Claude, ChatGPT, Gemini, Cursor, Devin, Copilot — they’re all online. They’re masterclasses in context engineering: how to give an AI a persona, how to set constraints, how to structure instructions. Great inspiration for your own CLAUDE.md and skills.
| System Prompts Collection | Awesome System Prompts | Analysis of the Leaks |
LLM as Judge
Use one LLM to evaluate the output of another. “Thinking” models drastically outperform standard models as judges. Use it to evaluate: code quality, test coverage, PR descriptions, documentation completeness. Your test suite is already an LLM judge. Think about what else could be.
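One minimal way to structure a judge, sketched in Python. Only the rubric, prompt assembly, and verdict parsing are shown — the actual model call is whatever client you already use, and every criterion name below is a placeholder, not a standard:

```python
import json

RUBRIC = """Score the code diff from 1-5 on each criterion:
- correctness: does the change do what the PR description claims?
- tests: are the new paths covered by tests?
- clarity: would a reviewer understand it without the author present?
Respond with a JSON object: {"correctness": n, "tests": n, "clarity": n}."""

def build_judge_prompt(diff: str, description: str) -> str:
    """Assemble the judge prompt: rubric first, then the artifact to grade."""
    return f"{RUBRIC}\n\nPR description:\n{description}\n\nDiff:\n{diff}"

def parse_verdict(reply: str) -> dict:
    """Parse the judge's JSON reply; fail loudly if a criterion is missing."""
    verdict = json.loads(reply)
    for key in ("correctness", "tests", "clarity"):
        if key not in verdict:
            raise ValueError(f"judge reply missing {key!r}")
    return verdict
```

The judging model is swappable; the point is that the rubric is explicit, versioned in git, and testable like any other code.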
The Big Shift
| Before | After |
|---|---|
| Write code | Prove code works |
| Prompt engineering | Context engineering |
| Code review | Supervisory engineering |
| Style guides | Executable skills |
| Best practices docs | CLAUDE.md + Hooks |
| Optional testing | Testing as survival |
| Refactoring | Pruning |
Holiday Wanderings
Pick one. Go deep. Apply it on production code.
For Beginners: Install Claude Code and run it on your project. Write a CLAUDE.md for your repository — what does the team “just know”? Ask Claude to explain your own code back to you. Generate tests for a module that has none.
For the Curious: Try Plan Mode on a refactoring you’ve been putting off. Set up a hook that auto-formats or runs tests after edits. Write a skill that enforces your team’s coding conventions. Use Claude Code to do a security review of a service. Use Plan Mode to explore what a testing strategy for your full stack could look like.
For the Adventurous: Do a full TDD loop — write failing tests, let Claude implement, iterate. Prune: pick a folder, delete what’s dead, let Claude help you find it. Don’t code! Just tell it what to do. File by file. Folder by folder. Standardize a dependency: pick the one library, migrate everything in an afternoon. Agent vs. Agent: have one Claude instance write code, then a second instance review it — LLM-as-judge in practice. Connect an MCP server to a tool you use daily. Make Claude talk to your infrastructure.
The Recipe: Wander and be inspired. Pick a concept. Apply it in full on production code. Power through reality!
References
Inspiration Posts
- AI Coding Is a Magnifier — Sparkboxx
- ThoughtWorks Future of Software Engineering Retreat (PDF)
- Martin Fowler — Fragments, Feb 18
- Simon Willison — Code Proven to Work
- Simon Willison’s Blog — essential reading on AI-assisted programming
Claude Code
- Claude Code Official Docs
- Using CLAUDE.md Files — Anthropic Blog
- Writing a Good CLAUDE.md — HumanLayer
- Claude Code Skills
- Claude Code Hooks
- Claude Code Permissions & Modes
- Claude Code MCP Integration
- Claude Code Security
- Claude Code and the Art of TDD
System Prompts & Context Engineering
- Leaked System Prompts Collection — CL4R1T4S
- Awesome System Prompts — AI Coding Agents
- Analysis: What the Leaks Reveal