Agent-First Engineering
Where teams and agents ship together
The practical core of my diploma thesis: engineering repositories so people and AI agents can build the same product together — context delivered where agents read it, work split without collisions, every handoff and review gate made explicit. Proven on a live multi-agent SaaS build, with a rollout designed for an enterprise energy codebase.
- Role: Designed & built the practice end-to-end (research → working system → rollout)
- Stack: Claude Code · MCP servers · Agent skills · Context engineering · Multi-agent orchestration · Markdown / Obsidian KB · Linear · Git · Contract-first
Some details are generalised. This work runs across both a live product of my own and an enterprise codebase — I describe the practice and its results, not any internal product, system or customer specifics.
The shift
The first wave of AI coding was autocomplete — a model helping one person in one file. The real shift is agents joining the whole loop: planning, pulling context, producing artefacts, validating them, and handing work on. That only pays off if the repository itself is built for it. My diploma thesis started from a scoping review of how agentic AI actually lands in real work systems and reached a practical conclusion: this is a configuration problem before it’s an adoption problem. Drop capable agents into a cold, undocumented repo and they flounder; the leverage is in how the codebase is structured around them.
So the practical work is exactly that — not another model demo, but the engineering practice: how repositories should be structured, how context is delivered, how teams stay aligned, how handoffs and review gates work, and how several people and agents work at once without stepping on each other.
What I’m building
A repository that doubles as an operating system for the people and the agents working in it:
- Context where agents read it — a structured, version-controlled knowledge base (numbered taxonomy, machine-readable frontmatter, cross-links) so an agent starts a task grounded instead of guessing. Product context lives next to the code, not in a separate wiki agents can’t reach.
- Instructions that route by who’s working — a tool-agnostic entry file that loads the right role, conventions and permissions per person and per agent, instead of one generic prompt.
- Skills and MCP servers — reusable, progressively-disclosed skills for recurring work, and live tool access (issue tracker, database, deploy, design) wired in so agents act on the real system, not a description of it.
- Design in the same loop — the same MCP wiring runs bidirectionally to Figma: design context reads straight into code and structure pushes back into the design file, so a designer and I can iterate one shared artefact from both ends instead of losing fidelity across a handoff. I built a multi-step, bilingual onboarding prototype exactly this way.
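To make "machine-readable frontmatter" concrete, here is a minimal sketch of how a knowledge-base note could be parsed and selected for a task. The field names (`id`, `tags`, `links`), the sample note and the helper names are illustrative assumptions, not the actual schema used in the project, and the parser handles only flat `key: value` pairs rather than full YAML.

```python
from dataclasses import dataclass, field

# Hypothetical note: the numbered id, tags and links are assumed field names.
SAMPLE = """---
id: 20.03
title: Billing service overview
tags: [billing, backend]
links: [20.01, 30.02]
---
What the billing service is for and which contracts it owns.
"""


@dataclass
class Note:
    """One knowledge-base note: parsed frontmatter plus the Markdown body."""
    meta: dict = field(default_factory=dict)
    body: str = ""


def parse_note(text: str) -> Note:
    """Split a note into flat frontmatter and body (not a full YAML parser)."""
    lines = text.splitlines()
    stripped = [line.strip() for line in lines]
    if not stripped or stripped[0] != "---" or "---" not in stripped[1:]:
        return Note(body=text)  # no (or unterminated) frontmatter
    end = stripped.index("---", 1)
    meta = {}
    for line in lines[1:end]:
        key, sep, raw = line.partition(":")
        if not sep:
            continue  # skip lines that are not `key: value`
        raw = raw.strip()
        if raw.startswith("[") and raw.endswith("]"):
            # Bracketed value: treat as a comma-separated list.
            meta[key.strip()] = [v.strip() for v in raw[1:-1].split(",") if v.strip()]
        else:
            meta[key.strip()] = raw
    return Note(meta=meta, body="\n".join(lines[end + 1:]).strip())


def select_by_tag(notes, tag):
    """Return the notes whose `tags` list contains `tag`."""
    return [n for n in notes if tag in n.meta.get("tags", [])]
```

With something like this in place, an agent picking up a billing task could call `select_by_tag(all_notes, "billing")` and follow each note's `links` to pull in related context before it starts writing code.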
Coordination, handoffs and gates
The hard part isn’t one agent — it’s many actors not colliding. The model I built makes every boundary explicit:
- Parallel work, no interference — one issue, one branch, one PR, and a defined integration order so contract, backend and frontend changes don’t trample each other.
- Contract-first handoffs — interfaces are agreed and published before the consuming work starts, so async handoffs between people (and agents) don’t drift.
- Review gates — independent agent-on-agent review before any human merge: the producing agent is treated as fallible and a fresh agent checks its work, behind merge-safety gates (green CI, no schema drift, owner approval) and runtime guardrails on anything that touches production.
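The gates listed above can be sketched as a small check that decides which pull requests may merge and in what order. This is an illustration under stated assumptions: the `PullRequest` fields, the issue keys and the contract → backend → frontend order are hypothetical stand-ins, and a real setup would read these signals from CI and the Git host rather than from booleans.

```python
from dataclasses import dataclass

# Assumed integration order: contracts land first, then their consumers.
INTEGRATION_ORDER = ["contract", "backend", "frontend"]


@dataclass(frozen=True)
class PullRequest:
    """One issue, one branch, one PR (all field names are hypothetical)."""
    issue: str
    layer: str                 # one of INTEGRATION_ORDER
    ci_green: bool
    schema_drift: bool
    agent_review_passed: bool  # independent review by a fresh agent
    owner_approved: bool


def blocking_reasons(pr: PullRequest) -> list:
    """Return every gate the PR currently fails; an empty list means mergeable."""
    reasons = []
    if not pr.ci_green:
        reasons.append("CI is not green")
    if pr.schema_drift:
        reasons.append("schema drift detected")
    if not pr.agent_review_passed:
        reasons.append("independent agent review missing or failed")
    if not pr.owner_approved:
        reasons.append("owner approval missing")
    return reasons


def merge_queue(prs):
    """Keep only PRs that pass every gate, sorted by integration order."""
    ready = [pr for pr in prs if not blocking_reasons(pr)]
    return sorted(ready, key=lambda pr: INTEGRATION_ORDER.index(pr.layer))
```

Returning the list of failed gates, rather than a single boolean, keeps the reason for a blocked merge visible to both the producing agent and the human owner.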
The throughline is the one the research kept pointing to: context → artefact → validation → review gate → handoff → owner. Authority moves to the gates, and nothing moves forward without a responsible owner.
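One way to read that chain is as an ordered set of stages in which a work item only advances when it has an owner and the current gate has passed. The sketch below is a hypothetical model of that rule, not the project's actual tooling; `Stage`, `WorkItem` and `advance` are invented names, and the final `OWNER` stage is assumed to mean "accepted by the responsible owner".

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Stage(IntEnum):
    """The chain from the text, in order."""
    CONTEXT = 0
    ARTEFACT = 1
    VALIDATION = 2
    REVIEW_GATE = 3
    HANDOFF = 4
    OWNER = 5


@dataclass
class WorkItem:
    title: str
    owner: Optional[str] = None
    stage: Stage = Stage.CONTEXT


def advance(item: WorkItem, gate_passed: bool) -> Stage:
    """Move one stage forward only if an owner exists and the gate passed."""
    if item.owner is None:
        raise PermissionError(f"'{item.title}' has no responsible owner")
    if gate_passed and item.stage < Stage.OWNER:
        item.stage = Stage(item.stage + 1)
    return item.stage  # unchanged when the gate failed or the chain is done
```

Raising on a missing owner, instead of silently holding the item, makes the "no owner, no progress" rule hard to overlook.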
Proven on a live build
This isn’t theory. It runs my own two-founder SaaS, where two people and several different AI tools work in the same set of repos every day through exactly this setup: persona-routed instructions, shared skills, wired MCP servers, contract-first handoffs and independent review. The coordination layer is what lets a two-person team move like a bigger one.
The outlook
The forward-looking half of the thesis takes the same practice into a large, multi-service enterprise codebase. The approach is an agent-first restructuring: cross-linking repositories, giving each a README and an agent-context file, and delivering the product side — what each service is for — into the code itself, so the system map becomes something an agent can act on directly. The payoff is identical for an AI agent and a new engineer: both start grounded. This half is now underway.
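A first step in a restructuring like this could be an audit that lists which repositories still lack a README or an agent-context file. The sketch below assumes all repositories are checked out as sibling directories under one root and that the files are named `README.md` and `AGENTS.md`; both the layout and the file names are assumptions for illustration, since the text does not name them.

```python
from pathlib import Path

# Assumed file names; the agent-context file name in particular is hypothetical.
REQUIRED_FILES = ("README.md", "AGENTS.md")


def missing_context(root: Path) -> dict:
    """Map each repository directory under `root` to the required files it lacks.

    Repositories that already have every required file are left out.
    """
    report = {}
    for repo in sorted(p for p in root.iterdir() if p.is_dir()):
        missing = [name for name in REQUIRED_FILES if not (repo / name).is_file()]
        if missing:
            report[repo.name] = missing
    return report
```

Calling `missing_context(Path("/path/to/checkouts"))` (a placeholder path) would return a dictionary such as `{"some-service": ["AGENTS.md"]}` for a service that has a README but no agent-context file, which gives the rollout a simple, repeatable progress measure.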
Status
Actively in progress and the practical centre of my diploma thesis — proven in production on my own product, with the enterprise rollout now underway.
This is the throughline behind everything I build: the leverage today isn’t a single tool, it’s pairing real product thinking and engineering with the imagination to work differently — and the consistency to dig in until the idea actually runs.