Holiday Coding
March 24, 2026
Play isn't the opposite of serious work.
Play is how you learn at the edge of your competence.
Holiday Coding is structured play on production systems.
You pick a direction, not a destination. You work on real code, but the path is yours.
Not a course. There's no curriculum. No certification.
Not a sprint. There's no backlog. No velocity.
Not a hack day. The goal isn't to ship. The goal is to understand.
The only deliverable is what you learned.
These are lessons learned, not laws.
Ideas to spark your interest.
Maybe a provocative statement, or two.
Not a magic wand. A magnifier.
If your codebase is tight, the AI produces tighter software.
If your codebase is a mess, the AI accelerates you into a bigger mess.
Agentic coding isn't just for building new features.
It's remarkably good at cleaning house.
Lots of people don't use it that way. They should.
Removing things. File by file, folder by folder.
Without that pruning, you get high-speed rot.
What used to take months to become unwieldy now takes weeks.
You need a test suite that operates on many different levels.
Unit tests, integration tests, smoke tests, end-to-end tests. The whole stack.
Your rate of change just went through the roof.
Crank testing up to twelve.
Two libraries that do similar things? The confusion bleeds into generated code.
Pick the one dependency. Let the AI migrate everything.
What used to be a tedious multi-sprint cleanup is now an afternoon's work.
Engineering quality doesn't disappear when AI writes code.
It migrates — to specs, tests, constraints, and risk management.
The work isn't gone. It moved upstream.
Code review is being unbundled.
It was always four functions in a trench coat: mentorship, consistency, correctness, and trust.
Each one now needs a new home. Agents handle correctness. Hooks enforce consistency. Mentorship and trust stay human.
Technical debt is becoming cognitive debt.
The system grows more complex than anyone can hold in their head.
And it's not just you — your users and customers carry that complexity too.
Specifications describe what should change.
Constraints define what must not be touched.
Constraints limit blast radius and let agents work safely across domain boundaries.
People treat AI as a speed boost for their existing workflow instead of rethinking what their workflow should be.
Speed is the easy part.
Staying in control at that speed is the hard part.
Deer Valley, Utah — February 2026.
ThoughtWorks hosted the Future of Software Engineering Retreat on the 25th anniversary of the Agile Manifesto.
What happens when AI takes over code production?
A new middle loop emerges.
Not writing code. Not release management. Something in between.
Directing agents. Evaluating output. Calibrating trust.
Encoding standards and defining constraints within which agents can safely operate.
If everyone and their cousin can produce "average software," it's your job to deliver great software.
You can only do that if you deeply understand what problems you are solving for your customers.
Pair programming, ensemble development, continuous integration.
These create the tight feedback loops that agent-assisted development requires.
Some teams compressed sprint cadences to one week.
Simon Willison:
Your job is to deliver code you have proven to work.
Untested AI-generated PRs are a dereliction of duty.
The job shifts from writing code to proving it works.
Read the post | Simon Willison's Blog — follow this man.
"AI Slop" — low-quality, mass-produced, algorithmically generated code that looks polished but lacks substance.
The Slop Jar Rule
Get caught committing untested, unverified AI-generated code three times and you're buying the team lunch.
Name it. Shame it. Don't ship it.
Let's get practical.
Model Context Protocol: an open standard for connecting AI agents to your tools.
10,000+ active servers. 97 million SDK downloads/month.
GitHub, Slack, Postgres, Google Drive, Jira, your internal APIs — all accessible through one protocol.
Claude Code + MCP = an agent that can read your tickets, query your database, check your monitoring, and commit the fix.
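As a sketch, a project-level `.mcp.json` wires Claude Code to MCP servers. The two reference servers below are real packages, but check each server's docs for exact package names and environment variables:

```json
{
  "mcpServers": {
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres", "postgresql://localhost/mydb"]
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "<your token>" }
    }
  }
}
```

Check this file into git and every agent session on the project gets the same tools.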
A persistent configuration file that Claude reads before every conversation.
Think .editorconfig for AI — it tells Claude who you are, how you work, and what matters.
Three-layer hierarchy:
- `~/.claude/CLAUDE.md` — personal preferences (your style, your tools)
- `./CLAUDE.md` — project conventions (check this into git!)
- `./src/auth/CLAUDE.md` — subdirectory overrides (domain-specific rules)

Most-specific wins. Loaded automatically. Every line costs context window budget.
It forces teams to explicitly codify what everyone "just knows."
This might be the most valuable artifact a team creates.
More useful than most documentation.
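A sketch of a project-level `./CLAUDE.md` — the conventions here are invented for illustration, not a template:

```markdown
# Project conventions

- TypeScript strict mode. No `any`.
- Write the failing test before the fix.
- Run `npm test` before every commit.
- Small PRs: one concern per change.
- Domain terms: "tenant" means a billing account, never a user.
```

Short, declarative, checked into git. That's the whole trick.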
You can give your agent a persona. A tone. Constraints. Domain vocabulary.
The context IS the prompt.
A SKILL.md file with instructions Claude follows when invoked.
Not documentation nobody reads. Executable standards.
```markdown
---
name: "tdd-writer"
description: "Write failing tests first, no implementation"
---
## Instructions here
```
Skills can spawn subagents, inject live data, restrict tools.
From static instructions to programmable agents.
Shell commands that execute at specific points in Claude's workflow.
Not suggestions Claude might forget. Rules that always execute.
- PreToolUse — approve or block before execution
- PostToolUse — validate after completion
- SessionStart — inject context before Claude sees anything

Auto-format after every edit. Run tests before commits. Block dangerous patterns.
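A sketch of a hooks entry in `.claude/settings.json` — assuming the matcher/command shape from the hooks docs; verify against your Claude Code version:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "npx prettier --write ." }
        ]
      }
    ]
  }
}
```

This runs the formatter after every file edit. Not a suggestion Claude might forget — a rule that always executes.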
Cycle permission modes with Shift+Tab: default, auto-accept edits, plan mode.
Additional modes exist behind flags — including bypassing permission prompts entirely. Use with care.
The discipline has shifted from prompt engineering to context engineering.
It's not about crafting the perfect prompt.
It's about giving the agent the right context: CLAUDE.md, skills, hooks, file structure, test suites, constraints.
The context IS the prompt.
Want to learn how the best AI products work? Read their system prompts.
Leaked system prompts for Claude, ChatGPT, Gemini, Cursor, Devin, Copilot — they're all online.
They're masterclasses in context engineering. How to give an AI a persona. How to set constraints. How to structure instructions.
Great inspiration for your own CLAUDE.md and skills.
Use one LLM to evaluate the output of another.
"Thinking" models drastically outperform standard models as judges.
Use it to evaluate: code quality, test coverage, PR descriptions, documentation completeness.
Your test suite is already an LLM judge. Think about what else could be.
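The judge pattern above can be sketched in Python. The rubric, threshold, and JSON reply shape are assumptions for illustration; the API call to the judge model itself is out of scope:

```python
import json
import re

# Hypothetical rubric: the judge is told to reply in machine-readable JSON.
RUBRIC = """You are a strict code reviewer. Score the submission 1-10 on
correctness, test coverage, and clarity. Reply with JSON only:
{"score": <int>, "verdict": "pass" | "fail", "reasons": [...]}"""


def build_judge_prompt(submission: str) -> str:
    """Combine the rubric with the code under review into one judge prompt."""
    return f"{RUBRIC}\n\n--- SUBMISSION ---\n{submission}"


def parse_verdict(judge_reply: str, threshold: int = 7) -> bool:
    """Extract the JSON verdict from the judge's reply; pass iff score >= threshold."""
    match = re.search(r"\{.*\}", judge_reply, re.DOTALL)
    if not match:
        return False  # an unparseable reply counts as a fail
    try:
        data = json.loads(match.group(0))
    except ValueError:
        return False
    return data.get("verdict") == "pass" and data.get("score", 0) >= threshold


# In practice: send build_judge_prompt(code) to a thinking model,
# then gate the PR on parse_verdict(reply).
```

The gate is deterministic even though the judge isn't: structure the reply, parse it defensively, fail closed.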
| Before | After |
|---|---|
| Write code | Prove code works |
| Prompt engineering | Context engineering |
| Code review | Supervisory engineering |
| Style guides | Executable skills |
| Best practices docs | CLAUDE.md + Hooks |
| Optional testing | Testing as survival |
| Refactoring | Pruning |
Pick one. Go deep. Apply it on production code.
Power through Reality!
60 minutes. Go.
Links to everything mentioned.